My daily workhorse is an M1 Pro that I purchased on release day. It has been one of the best tech purchases I have made; even now it handles anything I throw at it. My daily workload regularly involves an Android emulator, an iOS simulator, and a number of Docker containers running simultaneously, and I never hear the fans. Battery life has taken a bit of a hit, but it is still very respectable.
I wanted a new personal laptop and was debating between a MacBook Air and a Framework 13 running Linux. I wanted to lean into learning something new, so I went with the Framework, and I must admit I am regretting it a bit.
The M1 was released back in 2020, and I bought the Ryzen AI 340, one of AMD's newest 2025 chips, so AMD has had five years of extra development; I expected them to get close to the M1 in terms of battery efficiency and thermals.
The Ryzen uses TSMC's N4P process compared to the M1's older N5 process. I managed to find a TSMC press release showing the performance/efficiency gains from the newer process: “When compared to N5, N4P offers users a reported +11% performance boost or a 22% reduction in power consumption. Beyond that, N4P can offer users a 6% increase in transistor density over N5”
I am sorely disappointed; using the Framework feels like using an older Intel-based Mac. If I open too many tabs in Chrome I can feel the bottom of the laptop getting hot, and if I open a YouTube video the fans will often spin up.
Why haven’t AMD/Intel been able to catch up? Is x86 just not able to keep up with the ARM architecture? When can we expect an x86 laptop chip to match the M1 in efficiency/thermals?!
To be fair, I haven’t tried Windows on the Framework yet; it might be my Linux setup being inefficient.
Cheers, Stephen
If you fully load the CPU and calculate how much energy an AI 340 needs to perform a fixed workload and compare that to an M1, you'll probably find similar results, but that only matters for your battery life if you're doing things like Blender renders, big compiles, or gaming.
Take for example this battery life gaming benchmark for an M1 Air: https://www.youtube.com/watch?v=jYSMfRKsmOU. 2.5 hours is about what you'd expect from an x86 laptop, possibly even worse than the fw13 you're comparing here. But turn down the settings so that the M1 CPU and GPU are mostly idle, and bam you get 10+ hours.
Another example would be a ~5 year old mobile Qualcomm chip. It's on a worse process node than the AMD AI 340, much, much slower, and has significantly worse performance per watt, and yet it barely gets hot and sips power.
All that to say: M1 is pretty fast, but the reason the battery life is better has to do with everything other than the CPU cores. That's what AMD and Intel are missing.
> If I open too many tabs in Chrome I can feel the bottom of the laptop getting hot, open a YouTube video and the fans will often spin up.
It's a fairly common issue on Linux to be missing hardware acceleration, especially for video decoding. I've had to enable GPU video decoding on my fw16 and haven't noticed the fans on YouTube since.
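If anyone else wants to confirm whether hardware decode is actually being used before blaming the silicon, a rough way to check (a sketch, assuming a VA-API capable driver is installed; package names and pref names vary by distro and browser version):

# Confirm a VA-API driver is present and reports decode profiles
# (vainfo usually ships in the libva-utils package)
vainfo

# Firefox: in about:config, make sure VA-API decode is on, e.g.
#   media.ffmpeg.vaapi.enabled = true
# Chrome/Chromium: open chrome://gpu and check that "Video Decode"
# reports "Hardware accelerated"

If vainfo errors out or the browser reports software decode, that alone can explain the fans spinning up on YouTube.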
Apple spent years incrementally improving the efficiency and performance of their chips for phones. Intel and AMD were more desktop-focused, so power efficiency wasn't the goal. When Apple's chips got so good they could transition into laptops, x86 wasn't in the same ballpark.
Also the iPhone is the most lucrative product of all time (I think) and Apple poured a tonne of that money into R&D and taking the top engineers from Intel, AMD, and ARM, building one of the best silicon teams.
NeXT? But yes, I completely get what you’re saying, I just couldn’t resist. It was an amazingly far-sighted strategic move, for sure.
How much silicon did Apple actually create? I thought they outsourced all the components?
You get the ARM ISA, and compilers that work for ARM will compile to Apple Silicon. It's just that the actual hardware you get is better than the base design, and therefore beats other ARM processors in benchmarks.
https://www.electronicsweekly.com/news/business/finance/arm-...
It is very unlikely Apple uses anything from ARM’s core designs, since that would require paying an additional license fee and Apple was able to design superior cores using its architectural license.
See Lunar Lake on TSMC N3B, 4+4, on-package DRAM versus the M3 on TSMC N3B, 4+4, on-package DRAM: https://youtu.be/ymoiWv9BF7Q?t=531
The 258V (TSMC N3B) has a worse perf / W 1T curve than the Apple M1 (TSMC N5).
Dieselgate?
Also, there's the obvious benefits of being TSMC's best customer. And when you design a chip for low power consumption, that means you've got a higher ceiling when you introduce cooling.
But do not forget how focused they (AMD/Intel, especially in the Opteron days -- edit) were on the server market.
Apple is vertically integrated and can optimize at the OS and for many applications they ship with the device.
Compare that to how many cooks are in the kitchen in Wintel land. Perfect example is trying to get to the bottom of why your windows laptop won't go to sleep and cooks itself in your backpack. Unless something's changed, last I checked it was a circular firing squad between laptop manufacturer, Microsoft and various hardware vendors all blaming each other.
> Compare that to how many cooks are in the kitchen in Wintel land. Perfect example is trying to get to the bottom of why your windows laptop won't go to sleep and cooks itself in your backpack
So, I was thinking like this as well, and after I lost my Carbon X1 I felt adventurous, but not too adventurous, and wanted a laptop that "could just work". The thinking was "If Microsoft makes both the hardware and the software, it has to work perfectly fine, right?", so I bit my lip and got a Surface Pro 8.
What a horrible laptop that was, even while I was trialing just running Windows on it. It overheated almost immediately by itself, just idling, and STILL suffers from the issue where the laptop sometimes wakes itself while in my backpack, so when I actually needed it, of course it was hot and without battery. I've owned a lot of shit laptops through the years, even some without keys in the keyboard, back when I was dirt-poor, but the Surface Pro 8 is the worst of them all; I regret buying it a lot.
I guess my point is that just because Apple seem really good at the whole "vertically integrated" concept, it isn't magic by itself, and Microsoft continues to fuck up the very same thing, even though they control the entire stack, so you'll still end up with backpack laptops turning themselves on/not turning off properly.
I'd wager you could let Microsoft own every piece of physical material in the world, and they'd still not be able to make a decent laptop.
I've had a few Surface Book 2s for work, and they were fine except: they needed more RAM, and there was some issue with the connection between the screen and the base which made USB headsets hinky.
That's why Apple is good at making a whole single system that works by itself, and Microsoft is good at making a system that works with almost everything almost everyone has made almost ever.
> Framework 16
> The 2nd Gen Keyboard retains the same hardware as the 1st Gen but introduces refreshed artwork and updated firmware, which includes a fix to prevent the system from waking while carried in a bag.
I've worked in video delivery for quite a while.
If I were to write the law, decision-makers wilfully forcing software video decoding where hardware is available would be made to sit on these CPUs with their bare buttocks. If that sounds inhumane, then yes, this is the harm they're bringing upon their users, and maybe it's time to stop turning the other cheek.
Are you telling me that for some reason it's not using any hardware acceleration available while watching YouTube? How do I fix it?
> All that to say: M1 is pretty fast, but the reason the battery life is better has to do with everything other than the CPU cores. That's what AMD and Intel are missing.
This isn't true. Yes, uncore power consumption is very important but so is CPU load efficiency. The faster the CPU can finish a task, the faster it can go back to sleep, aka race to sleep. Apple Silicon is 2-4x more efficient than AMD and Intel CPUs during load while also having higher top end speed.
Another thing that makes Apple laptops feel way more efficient is that they use a true big.LITTLE design, while AMD's and Intel's little cores are actually designed for area efficiency rather than power efficiency. In the case of Intel, they stuff in as many little cores as possible to win MT benchmarks. In real-world applications, the little cores are next to useless because most applications prefer a few fast cores over many slow cores.
This is not true. For high-throughput server software x86 is significantly more efficient than Apple Silicon. Apple Silicon optimizes for idle states and x86 optimizes for throughput, which assumes very different use cases. One of the challenges for using x86 in laptops is that the microarchitectures are server-optimized at their heart.
ARM in general does not have the top-end performance of x86 if you are doing any kind of performance engineering. I don't think that is controversial. I'd still much rather have Apple Silicon in my laptop.
> For high-throughput server software x86 is significantly more efficient than Apple Silicon.
In the server space, x86 has the highest performance right now. Yes. That's true. That's also because Apple does not make server parts. Look for Qualcomm to try to win the server performance crown in the next few years with their Oryon cores.

That said, Graviton is at least 50% of all AWS deployments now. So it's winning vs x86.
> ARM in general does not have the top-end performance of x86 if you are doing any kind of performance engineering. I don't think that is controversial.
I think you'll have to define what top-end means and what performance engineering means.

This is false, in cross platform tasks it's on par if not worse than latest X86 arches. As others pointed out: 2.5h in gaming is about what you'd expect from a similarly built X86 machine.
They are winning due to lower idle and low-load consumption, which they achieve by integrating everything as much as possible - something that's basically impossible for AMD and Intel.
> The faster the CPU can finish a task, the faster it can go back to sleep, aka race to sleep.
May have been true when CPU manufacturers left a ton of headroom on the V/F curve, but not really true anymore. A Zen 4 core's power draw shoots up sharply past 4.6 GHz and nearly triples when you approach 5.5 GHz (compared to 4.6); are you gonna complete the task 3 times faster at 5.5 GHz?
> This is false, in cross platform tasks it's on par if not worse than latest X86 arches.

This is Cinebench 2024, a cross-platform application: https://imgur.com/a/yvpEpKF

> They are winning due to lower idle and low-load consumption, which they achieve by integrating everything as much as possible - something that's basically impossible for AMD and Intel.

Weird, because LNL achieved similar idle wattage to Apple Silicon.[0] Why do you say it's impossible?

> May have been true when CPU manufacturers left a ton of headroom on the V/F curve, but not really true anymore. A Zen 4 core's power draw shoots up sharply past 4.6 GHz and nearly triples when you approach 5.5 GHz (compared to 4.6); are you gonna complete the task 3 times faster at 5.5 GHz?

Honestly not sure how your statement is relevant.

[0] https://www.notebookcheck.net/Dell-XPS-13-9350-laptop-review...
You sure like that table, don't you? Trying to find the source of those Blender numbers, I came across many Reddit posts of yours with that exact same table. Sadly those also don't have a source - they are not from the Notebookcheck source.
For Blender numbers, M4 Pro numbers came from Max Tech's review.[0] I don't remember where I got the Strix Halo numbers from. Could have been from another Youtube video or some old Notebookcheck article.
Anyway, Blender has official GPU benchmark numbers now:
M4 Pro: 2497 [1]
Strix Halo: 1304 [2]
So M4 Pro is roughly 90% faster in the latest Blender. The most likely reason for why Blender's official numbers favors M4 Pro even more is because of more recent optimizations.
Sources:
[0]https://youtu.be/0aLg_a9yrZk?si=NKcx3cl0NVdn4bwk&t=325
[1] https://opendata.blender.org/devices/Apple%20M4%20Pro%20(GPU...
[2] https://opendata.blender.org/devices/AMD%20Radeon%208060S%20...
Here is M4 Max CPU https://opendata.blender.org/devices/Apple%20M4%20Max/ - median score 475
Ryzen MAX+ PRO 395 shows median score 448 (can't link because the site does not seem to cope well with + or / in product names)
Resulting in the M4 Max winning by 6%.
Weren't we comparing CPUs though? Those Blender benchmarks are for GPUs.
Yes, but I was asked about Blender GPU.

Blender CPU tasks are highly parallel. AMD's Ryzen Max 395 has great MT performance. It's generally 5-20% slower in CPU MT than the M4 Max depending on the application.
And where is LNL now? How's the company that produced it? Even under Pat Gelsinger they said that LNL is a one off and they're not gonna make any more of them. It's commercially infeasible.
> Honestly not sure how your statement is relevant.
How is you bringing up synthetics relevant to race to idle?
Regardless, a number of things can be done on Strix Halo to improve the performance. The first would be switching to an optimized Linux distro, or at least kernel; that would claw back 5-20% depending on the task. It would also improve single-core efficiency: I've seen my 7945HX drop from 14-15W idle on Windows to about 7-8W on Linux, because Windows likes to jerk off the CCDs non-stop and throw tasks around willy-nilly, which causes the second CCD and the I/O die to never properly idle.
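If you want to see that idle gap rather than guess at it, turbostat makes it fairly visible (a sketch; assumes a recent kernel and turbostat build that can read AMD's RAPL energy counters):

# Print busy %, average clock, and package power every 5 seconds;
# on a properly idling Linux install PkgWatt should settle into the low single digits
sudo turbostat --quiet --show Busy%,Bzy_MHz,PkgWatt --interval 5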
> And where is LNL now? How's the company that produced it? Even under Pat Gelsinger they said that LNL is a one off and they're not gonna make any more of them. It's commercially infeasible.
Why does it matter that LNL is bad economically? LNL shows that it's definitely possible to achieve the same or even better idle wattage than Apple Silicon.

> How is you bringing up synthetics relevant to race to idle?

I truly don't understand what you mean.

A good demonstration is the Android kernel. By far the biggest difference between it and the stock Linux kernel is power management. Many subsystems, down to the process scheduler, are modified and tuned to improve battery life.
What are some examples of power draw savings that Linux is leaving on the table?
“Modern Standby” could be made to actually work, ACPI states could be fixed, a functional wake-up state built anew, etc. Hell, while it would allow pared down CPUs, you could have a stop-gap where run mode was customized in firmware.
Too much credit is given to Apple for “owning the stack” and too little attention to legacy x86 cruft that allows you to run classic Doom and Commander Keen on modern machines.
Where do you get this from? I could understand that they could get rid of the die area devoted to x86 decoding, but as I understand it x86 and x86-64 instructions get interpreted by the same execution units, which are bitness blind. What makes you think it's x86 support that's responsible for the vast majority of power inefficiency in x86-64 processors?
Reduced I-Cache, uop cache, and decoder pressure would also have a beneficial impact. On the flip side, APX instructions would all be an entire byte longer than their AMD64 counterparts, so some of the benefits would be more muted than they might first appear and optimizing between 16 registers and shorter instructions vs 32 registers with longer instructions is yet another tradeoff for compilers to make (and takes another step down the path of being completely unoptimizable by humans).
Sure, but the topic is optimizing power efficiency by removing support for an instruction set. That aside, if an instruction isn't very performant, it isn't much of an issue per se. It just means it won't get used much and so chip design resources will be suboptimally allocated. That's a problem for Intel and AMD, and for nobody else.
> Operating systems need to carry the baggage in x86 if they want to allow users to run on old and new processors.
What do you mean by this exactly? Are you talking about hybrid execution like WOW64, or simple multi-platform support like the Linux kernel?
WOW64 is irrelevant as far as power efficiency is concerned if the user doesn't run any x86 software. If the user is running x86 software, that's a reason not to remove that support.
Multi-platform support shouldn't have an effect on power efficiency, beyond complicating the design of the system. Saying that the Linux kernel should stop supporting x86 so x86-64 can be more power-efficient is like saying that it should stop supporting... whatever, PowerPC, for that same reason. It's a non sequitur.
I'm confused, how is any of this related to "x86" and not the diverse array of third party hardware and software built with varying degrees of competence?
To be fair, usually Linux itself has hardware acceleration available, but the browser vendors tend to disable GPU rendering except on controlled/known perfectly working combinations of OS/hardware/drivers, and they have much less testing on Linux. In most cases you can force-enable GPU rendering in about:config, try it out yourself, and leave it on unless you get recurring crashes.
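For Firefox on Linux the relevant about:config switches are roughly these (a sketch; pref names have moved around between releases, and forcing past the blocklist is exactly the "try it and watch for crashes" step described above):

# about:config (Firefox on Linux)
# gfx.webrender.all                           -> true  (force GPU compositing even off the allowlist)
# media.ffmpeg.vaapi.enabled                  -> true  (use VA-API for hardware video decode)
# media.hardware-video-decoding.force-enabled -> true  (override the decode blocklist)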
All the Blink-based ones just work as long as the proper libraries are installed and said libraries properly detect hardware support.
Incredible discipline. The Chrome graph in comparison was a mess.
Looks like general purpose CPUs are on the losing train.
Maybe Intel should invent desktop+mobile OS and design bespoke chips for those.
I assume this is referring to the tweet from the launch of the M1 showing off that retaining and releasing an NSObject is like 3x faster. That's more of a general case of the ARM ISA being a better fit for modern software than x86, not some specific optimization for Apple's software.
x86 was designed long before desktops had multi-core processors and out-of-order execution, so for backwards compatibility reasons the architecture severely restricts how the processor is allowed to reorder memory operations. ARM was designed later, and requires software to explicitly request synchronization of memory operations where it's needed, which is much more performant and a closer match for the expectations of modern software, particularly post-C/C++11 (which have a weak memory model at the language level).
Reference counting operations are simple atomic increments and decrements, and when your software uses these operations heavily (like Apple's does), it can benefit significantly from running on hardware with a weak memory model.
It seems if you want optimal performance and power efficiency, you need to own both hardware and software.
Does Apple optimize the OS for its chips and vice versa? Yes. However, Apple Silicon hardware is just that good and that far ahead of x86.

Here's an M4 Max running macOS running Parallels running Windows when compared to the fastest AMD laptop chip: https://browser.geekbench.com/v6/cpu/compare/13494385?baseli...
M4 Max is still faster even with 14 out of 16 possible cores being used. You can't chalk that up to optimizations anymore because Windows has no Apple Silicon optimizations.
Wouldn't it be easier for Intel to heavily modify the Linux kernel instead of writing their own stack?
They could even go as far as writing the sleep utilities for laptops, or even their own window manager to take advantage of the specific mods in the ISA?
If it hadn't been killed, it may have become something interesting today.
Or, contribute efficiency updates to popular open projects like firefox, chromium, etc...
At least on mobile platforms, Apple advocates the other way with race to sleep: do the calculation as fast as you can with powerful cores so that the whole chip can go back to sleep earlier and take naps more often.
But when Apple says it, software devs actually listen.
The other aspect of it is that paid software is more prevalent in macOS land, and the prices are generally higher than on Windows. But the flip side of that is that user feedback is taken more seriously.
Like, would I prefer an older-style Macbook overall, with an integrated card reader, HDMI port, ethernet jack, all that? Yeah, sure. But to get that now I have to go to a PC laptop and there's so many compromises there. The battery life isn't even in the same zip code as a Mac, they're much heavier, the chips run hot even just doing web browsing let alone any actual work, and they CREAK. Like my god I don't remember the last time I had a Windows laptop open and it wasn't making all manner of creaks and groans and squeaks.
The last one would be solved I guess if you went for something super high end, or at least I hope it would be, but I dunno if I'm dropping $3k+ either way I'd just as soon stay with the Macbook.
Modern MacBook pros have 2/3 (card reader and HDMI port), and they brought back my beloved MagSafe charging.
But yeah IMHO there's just no comparison. Unless you're one of those folks who simply cannot fucking stand Mac, it's just no contest.
AMD kind of has, the "Max 395+" is (within 5% margin or so) pretty close to M4 Pro, on both performance and energy use. (it's in the 'Framework Desktop', for example, but not in their laptop lineup yet)
AMD/Intel hasn't surpassed Apple yet (there's no answer for the M4 Max / M3 Ultra, without exploding the energy use on the AMD/Intel side), but AMD does at least have a comparable and competitive offering.
That said Hardware Canucks did a review of the 395 in a mobile form factor (Asus ROG Flow F13) with TDP at 70w (lower than the max 120w TDP you see in desktop reviews). This lower-than-max TDP also gets you closer to the perf/watt sweet spot.
The M4 Pro scores slightly higher in Cinebench R24 despite being 10P+4E vs a full 16 P-cores on the 395, all while using something like 30% less power. The M4 Pro also scores nearly 35% higher in the single-core R24 benchmark. The 395's GPU performance is comparable to the M4 Pro's in productivity software; more specifically, they trade blows based on which is more optimized in a particular app, but AMD GPUs have way more optimizations in general, and gaming should be much better with x86 + an AMD GPU vs Rosetta 2 + GPU translation layers + Wine/Crossover.
The M4 Pro gets around 50% better battery life for tasks like web browsing when accounting for battery size differences, and more than double the battery life per watt-hour when doing something simple like playing a video. Battery life under full load is a bit better for the 395, but doing the math, that definitely involves the 395 throttling significantly down from its 70W TDP.
Some Chinese companies have also announced laptops with it coming out soon.
Right now, AMD is not even in the ballpark.
In fact, the real kick in the 'nads was my fully kitted M4 laptop outperforming the AMD. I just gave up.
I'll keep checking in with AMD and Intel every generation though. It's gotta change at some point.
I work in IT and all the new machines for our company come across my desk for checking, and I have observed the exact same points as the OP.
The new machines are either fast and loud and hot and with poor battery life, or they are slow and "warm" and have moderate battery life.
But I have yet to see a business laptop, whether ARM, AMD, or Intel, that can even compete with the M1 Air, not to speak of the M3 Pro! Not to mention all the issues with crappy Lenovo docks, etc.
It doesn’t matter if I install Linux or Windows. The funny part is that some of my colleagues have ordered a MacBook Air or Pro and run their Windows or Linux in a virtual machine via Parallels.
Think about it: Windows 11 or Linux in a VM is even faster, snappier, more silent, and has even longer battery life than these systems native on a business machine from Lenovo, HP, or Dell.
Well, your mileage may vary, but IMHO there is no alternative to a Mac nowadays, even if you want to use Linux or Windows.
With Nvidia GeForce Now I can even play games on it, though I wouldn't recommend it for serious gamers.
And at the shop we are doing technology refreshes for the whole dev team upgrading them to M4s. I was asked if I wanted to upgrade my M1 Pro to an M4, and I said no. Mainly because I don't want to have to move my tooling over to a new machine, but I am not bottlenecked by anything on my current M1.
I need to point this out all the time these days it seems, but this opinion is only valid if all you use is a laptop and all you care about is single-core performance.
The computing world is far bigger than just laptops.
Big music/3d design/video editing production suites etc still benefit much more from having workstation PCs with higher PCI bandwidth, more lanes for multiple SSDs and GPUs, and high level multicore processing performance which cannot be matched by Apple silicon.
For studio movies, render farms are usually Linux but I think many workstation tasks are done on Apple machines. Or is that no longer true?
Add to that the fact that most audio interfaces were FireWire and plug-and-play on Mac, and a real struggle on Windows. With Windows you also had to deal with ASIO, and once you picked your audio interface it had to be used for both inputs and outputs (still the case to this day), forcing you to combine interfaces with workarounds like ASIO4ALL if you wanted to use different interfaces, while macOS just lets you pick different interfaces for input and output.
Linux had very interesting projects; unfortunately music production relies on a lot of expensive audio plugins that a lot of the time come in installers and are a pain in the butt to use through Proton/Wine, when it's possible at all. That means that doing music production on Linux means possibly not using plugins you paid for and not finding alternatives to them. It's a shame, because I'd love to be able to only use Linux.
When I was at music college doing production courses, they exclusively taught Cubase on windows.
Also Protools was available on Windows from 1997 and was used in many PC based studios.
You're right about protools on Windows. I got confused about protools not requiring the use of their own interfaces
On the video side Vegas Pro is used in a lot of production houses, and it does not run on Apple Silicon at all.
I guess I'd slightly change that to "MacBook" or similar, as Apple are top-in-class when it comes to laptops, but for desktop they seem to not even be in the fight anymore, unless reducing power consumption is your top concern. But if you're aiming for "performance per money spent", there isn't really any alternative to non-Apple hardware.
I do agree they do the best hardware in terms of feeling though, which is important for laptops. But computing is so much larger than laptops, especially if you're always working in the same place everyday (like me).
* Apple has had decades optimizing its software and hardware stacks to the demands of its majority users, whereas Intel and AMD have to optimize for a much broader scope of use cases.
* Apple was willing to throw out legacy support on a regular basis. Intel and AMD, by comparison, are still expected to run code written for DOS or specific extensions in major Enterprises, which adds to complexity and cost
* The “standard” of x86 (and demand for newly-bolted-on extensions) means effort into optimizations for efficiency or performance meet diminishing returns fairly quickly. The maturity of the platform also means the “easy” gains are long gone/already done, and so it’s a matter of edge cases and smaller tweaks rather than comprehensive redesigns.
* Software in x86 world is not optimized, broadly, because it doesn’t have to be. The demoscene shows what can be achieved in tight performance envelopes, but software companies have never had reason to optimize code or performance when next year has always promised more cores or more GHz.
It boils down to comparing two different products and asking why they can’t be the same. Apple’s hardware is purpose-built for its userbase, operating systems, and software; x86 is not, and never has been. Those of us who remember the 80s and 90s of SPARC/POWER/Itanium/etc recall that specialty designs often performed better than generalist ones in their specialties, but lacked compatibility as a result.
The Apple ARM vs Intel/AMD x86 is the same thing.
Apple also has a particular advantage in owning the OS and having the ability to force independent developers to upgrade their software, which makes incompatible updates (including perf optimizations) possible.
And now that lack of courage and unwillingness to see a new design cannibalise their legacy product will be their downfall.
They have the engineers, so why not create a new super-optimized architecture, and not the shitshow that Itanium was?
They could also just sit down with Microsoft and say "Right, we're going to go in an entirely different direction, and provide you with something absolutely mind-blowing, but we're going to have to do software emulation for backward compatibility and that will suck for a while until things get recompiled, or it'll suck forever if they never do".
Apple did this twice in the last 20 years - once on the move from PowerPC chips to Intel, and again from Intel to Apple Silicon.
If Microsoft and enough large OEMs (Dell, etc.) thought there was enough juice in the new proposed architecture to cause a major redevelopment of everything from mobile to data-centre-level compute, they'd line right up, because they know that if you can significantly reduce power consumption while smashing benchmarks, there are going to be long, long wait times for that hardware and software, and it's pay day for everyone.
We now know so much more about processor design, instruction set and compiler design than we did when the x86 was shaping up, it seems obvious to me that:
1. RISC is a proven entity worth investing in
2. SoC & SiP is a proven entity worth investing in
3. Customers love better power/performance curves at every level from the device in their pocket to the racks in data centres
4. Intel is in real trouble if they are seriously considering the US government owning actual equity, albeit proposed as non-voting, non-controlling
Intel can keep the x86 line around if they want, but their R&D needs to be chasing where the market is heading - and fast - while bringing the rest of the chain along with them.
For an example of why this doesn't work, see 'Intel Itanium'.
The alternative is death - they do nothing, they're going to die.
Which option do you think they should take?
That's a subjective opinion. Plenty of people still value higher-power multi-core chips over Apple silicon, because they are still better at doing real work. I don't think they need to go in a new direction personally, but I was just showing an example of why your proposed solution is not a silver bullet.
Each time they had a pretty good emulation story to keep most stuff (certainly popular stuff) working through a multi-year transition period.
IMO, this is better than carrying around 40 years of cruft.
> IMO, this is better than carrying around 40 years of cruft.
Backwards compatibility is such a strong point; it is why Windows survives even though it has become a bloated, ad-riddled mess. You can argue about which is better, but that seriously depends on your requirements. If you have a business application coded 30 years ago on x86 that no developer in your company understands any more, then backwards compatibility is king. On the other end of the spectrum, if you are happy to be purchasing new software subscriptions constantly and having bleeding-edge hardware is a must for you, then backwards compatibility probably isn't required.
"oh a post about Apple, let me come in and share my hatred for Apple again by outright lying!"
As stated already, macOS 26 runs on the M1 and even the 2019 MacBook Pro. So I think I know where you got the "3 new versions" figure, and it's a dark and smelly place.
However, my parents' 2017 MacBook Pro can only upgrade to Ventura, which is a 2022 release. Five years and that $2.5k baby was obsolete. However rude you are in your defense of Apple, 5-6 years until software starts being unable to install is pretty shitty. I use 30-year-old apps daily on Windows with no issue.
Looks like defending Apple is the smelly place to be judging by your tone and condescending snark.
But as you mention, they've changed the underlying architecture multiple times, which surely would render a large part of prior optimizations obsolete?
> Software in x86 world is not optimized, broadly, because it doesn’t have to be.
Does ARM software need optimization more than x86?
This is why I get so livid regarding Electron apps on the Mac.
I’m never surprised by developer-centric apps like Docker Desktop — those inclined to work on highly technical apps tend not to care much about UX — but to see billion-dollar teams like Slack and 1Password indulge in this slop is so disheartening.
Second, the x86 platform has a lot of legacy, and each operation on x86 is translated from an x86 instruction into RISC-like micro-ops. This is an inherent penalty that Apple doesn't have to pay, and it is also why Rosetta 2 can achieve "near native" x86 performance; both platforms translate the x86 instructions.
Third, there are some architectural differences even if the instruction decoding steps are removed from the discussion. Apple Silicon has a huge out-of-order buffer, and it's 8-wide vs x86 4-wide. From there, the actual logic is different, the design is different, and the packaging is different. AMD's Ryzen AI Max 300 series does get close to Apple by using many of the same techniques like unified memory and tossing everything onto the package, where it does lose is due to all of the other differences.
In the end, if people want crazy efficiency Apple is a great answer and delivers solid performance. If people want the absolute highest performance, then something like Ryzen Threadripper, EPYC, or even the higher-end consumer AMD chips are great choices.
1) Apple Silicon outperforms all laptop CPUs in the same power envelope on 1T on industry-standard tests: it's not predominantly due to "optimizing their software stack". SPECint, SPECfp, Geekbench, Cinebench, etc. all show major improvements.
2) x86 also heavily relies on micro-ops to greatly improve performance. This is not a "penalty" in any sense.
3) x86 is now six-wide, eight-wide, or nine-wide (with asterisks) for decode width on all major Intel & AMD cores. The myth of x86 being stuck on four-wide has been long disproven.
4) Large buffers, L1, L2, L3, caches, etc. are not exclusive to any CPU microarchitecture. Anyone can increase them—the question is, how much does your core benefit from larger cache features?
5) Ryzen AI Max 300 (Strix Halo) gets nowhere near Apple on 1T perf / W and still loses on 1T perf. Strix Halo uses slower CPUs versus the beastly 9950X below:
Fanless iPad M4 P-core SPEC2017 int, fp, geomean: 10.61, 15.58, 12.85
AMD 9950X (Zen 5) SPEC2017 int, fp, geomean: 10.14, 15.18, 12.41
Intel 285K (Lion Cove) SPEC2017 int, fp, geomean: 9.81, 12.44, 11.05
Source: https://youtu.be/2jEdpCMD5E8?t=185, https://youtu.be/ymoiWv9BF7Q?t=670
The 9950X & 285K eat 20W+ per core for that 1T perf; the M4 uses ~7W. Apple has a node advantage, but no node on Earth gives you 50% less power.
There is no contest.
From the AMD side it was 4 wide until Zen 5. And now it's still 4 wide, but there is a separate 4-wide decoder for each thread. The micro-op cache can deliver a lot of pre-decoded instructions so the issue width is (I dunno) wider but the decode width is still 4.
2. X86 micro-ops vs ARM decode are not equivalent. X86’s variable length instructions make the whole process far more complicated than it is on something like ARM. This is a penalty due to legacy design.
3. The OP was talking about the M1. AFAIK, M4 is now 10-wide, and most x86 is 6-wide (Zen 5 does some weird stuff). x86 was 4-wide at the time of the M1's introduction.
4. M1 has over 600 reorder buffer registers… it’s significantly larger than competitors.
5. Close relative to x86 competitors.
3. The claim was never "stuck on 4-wide", but that going wider would incur significant penalties which is the case. AMD uses two 4-wide encoders and pays a big penalty in complexity trying to keep them coherent and occupied. Intel went 6-wide for Golden Cove which is infamous for being the largest and most power-hungry x86 design in a couple decades. This seems to prove the 4-wide people right.
4. This is only partially true. The ISA impacts which designs make sense which then impacts cache size. uop cache can affect L1 I-cache size. Page size and cache line size also affect L1 cache sizes. Target clockspeeds and cache latency also affect which cache sizes are viable.
It's an energy penalty, even if wall clock time improves.
Can we please stop with this myth? Every superscalar processor is doing the exact same thing, converting the ISA into the µops (which may involve fission or fusion) that are actually serviced by the execution units. It doesn't matter if the ISA is x86 or ARM or RISC-V--it's a feature of the superscalar architecture, not the ISA itself.
The only reason that this canard keeps coming out is because the RISC advocates thought that superscalar was impossible to implement for a CISC architecture and x86 proved them wrong, and so instead they pretend that it's only because x86 somehow cheats and converts itself to RISC internally.
Which hasn't even been the case anymore for several years now. Some µOPs in modern x86-64 cores combine memory access with arithmetic operations, making them decidedly non-RISC.
Given that videos spin up those coolers, there is actually a problem with your GPU setup on Linux, and I expect there'd be an improvement if you managed to fix it.
Another thing is that Chrome on Linux tends to consume exorbitant amount of power with all the background processes, inefficient rendering and disk IO, so updating it to one of the latest versions and enabling "memory saving" might help a lot.
Switching to another scheduler, reducing interrupt rate etc. probably help too.
Linux on my current laptop cut battery life by about 12x compared to Windows, and a bunch of optimizations like that improved the situation to something like 6x, i.e. it's still very bad.
> Is x86 just not able to keep up with the ARM architecture?
Yes and no. x86 is inherently inefficient, and most of the progress over the last two decades has been about offloading computations to more advanced and efficient coprocessors. That's how we got GPUs, and DMA on M.2 and Ethernet controllers.
That said, it's unlikely that x86 specifically is what wastes your battery. I would rather blame Linux, suspect its CPU frequency/power drivers are misbehaving on some CPUs, and unfortunately have no idea how to fix it.
Nothing in x86 prohibits you from an implementation less efficient than what you could do with ARM instead.
x86 and ARM have historically served very different markets. I think the pattern of efficiency differences of past implementations is better explained by market forces rather than ISA specifics.
Linux can actually meet or even exceed Windows's power efficiency, at least at some tasks, but it takes a lot of work to get there. I'd start with powertop and TLP.
As usual, the Arch wiki is a good place to find more information: https://wiki.archlinux.org/title/Power_management
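For a concrete starting point, this is roughly the sequence I'd try first (a sketch; assumes powertop and TLP from your distro's repos, and note powertop's auto-tune can occasionally make USB peripherals laggy, so review what it changes):

# One-off: apply powertop's suggested power-saving tunables
sudo powertop --auto-tune

# Persistent: enable TLP, then check what it applied (battery info with -b)
sudo systemctl enable --now tlp.service
sudo tlp-stat -b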
I've used Linux laptops since ~2007, and am well aware of the issues. 12x is well beyond normal.
I don't think I ever saw 50W at all, even under load; they probably run an Ultra U1xxH, permanently turbo-boosted.
For some reason. Given the level of tinkering (with schedulers and interrupt frequencies), it's likely self-imposed at this point, but you never know.
If nothing were wrong, it'd be at something like 1.5 GHz with most of the cores unpowered.
If you're willing to spend a bunch of die area (which directly translates into cost) you can get good numbers on the other two legs of the Power-Performance-Area triangle. The issue is that the market position of Apple's competitors is such that it doesn't make as much sense for them to make such big and expensive chips (particularly CPU cores) in a mobile-friendly power envelope.
What makes Apple silicon chips big is that they bolt a fast GPU onto them. If you include the die of a discrete GPU with an x86 chip, it'd be the same size or bigger than an M-series chip.
You can look at Intel’s Lunar Lake as an example where it’s physically bigger than an M4 but slower in CPU, GPU, NPU and has way worse efficiency.
Another comparison is AMD Strix Halo. Despite being ~1.5x bigger than the M4 Pro, it has worse efficiency, ST performance, and GPU performance. It does have slightly more MT.
Such a decoder is vastly less sophisticated with AArch64.
That is one obvious architectural drawback for power efficiency: a legacy instruction set with variable word length, two FPUs (x87 and SSE), 16-bit compatibility with segmented memory, and hundreds of otherwise unused opcodes.
How much legacy must Apple implement? Non-kernel AArch32 and Thumb2?
Edit: think about it... R4000 was the first 64-bit MIPS in 1991. AMD64 was introduced in 2000.
AArch64 emerged in 2011, and in taking their time, the designers avoided the mistakes made by others.
How much that does for efficiency I can't say, but I imagine it helps, especially given just how damn easy it is to decode.
"In Anandtech’s interview, Jim Keller noted that both x86 and ARM both added features over time as software demands evolved. Both got cleaned up a bit when they went 64-bit, but remain old instruction sets that have seen years of iteration."
I still say that x86 must run two FPUs all the time, and that has to cost some power (AMD must run three - it also has 3dNow).
Intel really couldn't resist adding instructions with each new chip (MMX, PAE for 32-bit, many more on this shorthand list that I don't know), which are now mostly baggage.
Legacy floating-point and SIMD instructions exposed by the ISA (and extensions to it) don't have any bearing on how the hardware works internally.
Additionally, AMD processors haven't supported 3DNow! in over a decade -- K10 was the last processor family to support it.
Where are you getting M4 die sizes from?
It would hardly be surprising given the Max+ 395 has more, and on average, better cores fabbed with 5nm unlike the M4's 3nm. Die size is mostly GPU though.
Looking at some benchmarks:
> slightly more MT.
AMD's multicore passmark score is more than 40% higher.
https://www.cpubenchmark.net/compare/6345vs6403/Apple-M4-Pro...
> worse efficiency
The AMD is an older fab process and does not have P/E cores. What are you measuring?
> worse ST performance
The P/E design choice gives different trade-offs e.g. AMD has much higher average single core perf.
> worse GPU performance
The AMD GPU:
14.8 TFLOPS vs. M4 Pro 9.2 TFLOPS.
19% higher 3D Mark
34% higher GeekBench 6 OpenCL
Although a much crappier Blender score. I wonder what that's about.
https://nanoreview.net/en/gpu-compare/radeon-8060s-vs-apple-...
> Where are you getting M4 die sizes from?

M1 Pro is ~250mm2. M4 Pro likely increased in size a bit, so I estimated 300mm2. There are no official measurements, but it should be directionally correct.

> AMD's multicore passmark score is more than 40% higher.

It's an out-of-date benchmark that not even AMD endorses and the industry does not use. Meanwhile, AMD officially endorses Cinebench 2024 and Geekbench. Let's use those.

> The AMD is an older fab process and does not have P/E cores. What are you measuring?

Efficiency. The fab process does not account for the 3.65x efficiency deficit. N4 to N3 is roughly ~20-25% more efficient at the same speed.

> The P/E design choice gives different trade-offs e.g. AMD has much higher average single core perf.

Citation needed. Furthermore, macOS uses P cores for all the important tasks and E cores for background tasks. I fail to see how, even if AMD has a higher average ST, that would translate to a better experience for users.

> 14.8 TFLOPS vs. M4 Pro 9.2 TFLOPS.

TFLOPS are not the same between architectures.

> 19% higher 3D Mark

Equal in 3DMark Wildlife, loses vs M4 Pro in Blender.

> 34% higher GeekBench 6 OpenCL

OpenCL has long been deprecated on macOS. 105727 is the score for Metal, which is supported by macOS. 15% faster for M4 Pro. The GPUs themselves are roughly equal. However, Strix Halo is still a bigger SoC.
Shouldn't they be the same if we are speaking about the same precision? For example, [0] shows the M4 Max at 17 TFLOPS FP32 vs the MAX+ 395 at 29.7 TFLOPS FP32 - not sure what exact operation was measured, but at least it should be the same operation. Hard to make definitive statements without access to both machines.
[0] https://www.cpu-monkey.com/en/compare_cpu-apple_m4_max_16_cp...
TFLOPS can't be measured the same between generations. For example, Nvidia often quotes sparsity TFLOPS which doubles the dense TFLOPS previously reported. I think AMD probably does the same for consumer GPUs.
Another example is Radeon RX Vega 64 which had 12.7 TFLOPS FP32. Yet, Radeon RX 5700 XT with just 9.8 TFLOPS FP32 absolutely destroyed it in gaming.
"directionally correct"... so you don't know and made up some numbers? Great.
AMD doesn't "endorse benchmarks" especially not fucking Geekbench for multi-core. No-one could because it's famously nonsense for higher core counts. AMD's decade old beef with Sysmark was about pro-Intel bias.
"directionally correct"... so you don't know and made up some numbers? Great.
I never said it was exactly that size. Apple keeps the sizes of their base, Pro, and Max chips fairly consistent over generations.

Welcome to the world of chip discussions. I've never taken apart an M4 Pro computer and measured the die myself. It appears no one on the internet has. However, we can infer a lot of it based on previously known facts. In this case, we know M1 Pro's die size is around 250mm2.
AMD doesn't "endorse benchmarks" especially not fucking Geekbench for multi-core. No-one could because it's famously nonsense for higher core counts. AMD's decade old beef with Sysmark was about pro-Intel bias.
Geekbench is the main benchmark AMD tends to use: https://videocardz.com/newz/amd-ryzen-5-7600x-has-already-be...

The reason is that Geekbench correlates highly with SPEC, which is the industry standard.
That three-year old press-release refers to SINGLE CORE Geekbench and not the defective multicore version that doesn't scale with core counts. Given AMD's main USP is core counts it would be an... unusual choice.
AMD marketing uses every other product under the sun too (no doubt whatever gives the better looking numbers)... including Passmark e.g. it's on this Halo Strix page:
https://www.amd.com/en/products/processors/ai-pc-portfolio-l...
So I guess that means Passmark is "endorsed" by AMD too eh? Neat.
The standard is SPEC, which correlates well with Geekbench.
https://medium.com/silicon-reimagined/performance-delivered-...
Every time there is a discussion on Apple Silicon, some uninformed person always brings up Passmark, which is completely outdated.
What's with posting 5 year old medium articles about a different version of Geekbench? Geekbench 5 had different multicore scaling so if you want to argue that version was so great then you are also arguing against Geekbench 6 because they don't even match.
https://www.servethehome.com/a-reminder-that-geekbench-6-is-...
"AMD Ryzen Threadripper 3995WX, a huge 64 core/ 128 thread part, was performing at only 3-4x the rate of an Intel D-1718T quad-core part, even despite the fact it had 16x the core count and lots of other features."
"With the transition from Geekbench 5 to Geekbench 6, the focus of the Primate Labs team shifted to smaller CPUs"
Before the M1, I was stuck using an intel core i5 running arch linux. My intel mac managed to die months before the M1 came out. Let's just say that the M1 really made me appreciate how stupidly slow that intel hardware is. I was losing lots of time doing builds. The laptop would be unusable during those builds.
Life is too short for crappy hardware. From a software point of view, I could live with Linux but not with Windows. But the hardware is a show stopper currently. I need something that runs cool and yet does not compromise on performance. And all the rest (non-crappy trackpad, amazingly good screen, cool to the touch, good battery life, etc.). And manages to look good too. I'm not aware of any windows/linux laptop that does not heavily compromise on at least a few of those things. I'm pretty sure I can get a fast laptop. But it'd be hot and loud and have the unusable synaptics trackpad. And a mediocre screen. Etc. In short, I'd be missing my mac.
Apple is showing some confidence by just designing a laptop that isn't even close to being cheap. This thing was well over 4K euros. Worth every penny. There aren't a lot of intel/amd laptops in that price class. Too much penny pinching happening in that world. People think nothing of buying a really expensive car to commute to work. But they'll cut on the thing that they use the whole day when they get there. That makes no sense whatsoever in my view.
I’ve actually been debating moving from the Pro to the Air. The M4 is about on par with the M1 Pro for a lot of things. But it’s not that much smaller, so I’d be getting a lateral performance move and losing ports, so I’m going to wait and see what the future holds.
I've been window shopping for a couple of months now, and have test-run Linux and really liked the experience (played on older Intel hardware). I am completely de-Appled software-wise, with the one exception of iMessage, because my kids use iPads. But that's really about it. So, I'm ready to jump.
But so far, all my research hasn't led to anything where I would be convinced not to regret it in the end. A desktop Ryzen 7700 or 9600X would probably suffice, but it would mean I need to constantly switch machines, and I'm not sure if I'm ready for that. All mobile non-Macs have significant downsides, and you typically can't even try before you buy. So you'd be relying on reviews. But everybody has a different tolerance for things like trackpad haptics, thermals, noise, screen quality, etc., so those reviews don't give enough confidence. I've had 13 Apple years so far. The first 5 were pleasant, the next 3 really sucked, but since Apple silicon I feel I have totally forgotten all the suffering in the non-Apple world and with those noisy, slow Intel Macs.
I think it has to boil down to serious reasons why the Apple hardware is not fit for one's purpose. Be it better gaming, an extreme amount of storage, an insane amount of RAM, all while ignoring the value of "the perfect package" with its low power draw, low noise, etc. Something that does not make one regret the change. DHH has done it and so have others, but he switched to the Framework Desktop AI Max, so it came with a change in lifestyle. And he also does gaming; that's another good reason (to switch to Linux or dual boot, as he mentioned Fortnite).
I don't have such reasons currently. Unless we see hardware that is at least as fast and enjoyable like the M1 Pro or higher. I tried Asahi but it's quite cumbersome with the dual boot and also DP Alt not there yet and maybe never will, so I gave up on that.
So, I'll wait another year and will see then. I hope I don't get my company to buy me an M4 Max Ultra or so as that will ruin my desire to switch for 10 more years I guess.
Yeah, those glossy mirror-like displays in which you see yourself much better than the displayed content are polished really well
I’ll take the apple display any day. It’s bright enough to blast through any reflections.
Hah, it's exactly the other way around for me; I can't stand Apple's hardware. But then again I never bought anything Asus... let alone gamer laptops.
Not sure why they can follow ANSI in the US but not ISO here. I just have to override the layout and ignore the symbols.
Apple on the other hand doesn't offer such machines... actually never has. To me, prizing maintainability, expandability, modularity, etc., their laptops are completely undesirable even within the confines of their outdated form factor; their efficient performance is largely irrelevant, and their tablets are much too enshittified to warrant consideration. And that's before we get into the OS and ecosystem aspects. :)
First two years it was solid, but then weird stuff started happening like the integrated GPU running full throttle at all times and sleep mode meaning "high temperature and fans spinning to do exactly nothing" (that seems to be a Windows problem because my work machine does the same).
Meanwhile the manufacturer, having released a new model, lost interest, so no firmware updates to address those issues.
I currently have the Framework 16 and I'm happy with it, but I wouldn't recommend it by default.
I for one bought it because I tend to damage stuff like screens and ports and it also enables me to have unusual arrangements like a left-handed numpad - not exactly mainstream requirements.
Apple is just off the side somewhere else.
Framework does not have the volume, it is optimized for modularity, and the software is not as optimized for the hardware.
As a general purpose computer Apple is impossible to beat, and it will take a paradigm shift for that to change (a completely new platform, similar to the introduction of the smartphone). Framework has its place as a specialized device for people who enjoy flexible hardware and custom operating systems.
What about all the money that they make from abusive practices like refusing to integrate with competitors' products thus forcing you to buy their ecosystem, phoning home to run any app, high app store fees even on Mac OS, and their massive anti repair shenanigans?
As for the services - it is a bit off topic as I believe Apple makes a profit on their macs alone ignoring their services business. But in general I have less of a problem with a subscription / fee-driven services business compared to an advertisement-based one. And as for the fee / alternative payment controversy (epic vs apple etc.) this is something that is relevant if you are a big brand that can actually market on your own / build an alternative shop infrastructure. For small time developers the marketing and payment infrastructure the apple app store offers is a bargain.
What I am saying is that Apple could for sure fit replaceable drives without any hit to size or weight. But their Mac strategy is to price based on disk size and make repairs expensive so you buy a new machine. I don't complain; it is the reason why the cheapest MacBook Air is the best laptop deal.
But let's stop this marketing story that it's their engineering genius not their market strategy.
Macbooks are one of the heaviest laptops you can buy. I think they are doing it for the premium feel - it is extremely sturdy.
Yes, because of the metal enclosure, while nearly all Windows laptop makers use plastic. Macs are usually the thinnest laptops in their class though. It's also not as robust, but it's definitely thinner and lighter.
I don't think this is even close to true. My last laptop from 2020 weighed at ~2.6kg and it's 2025 counterpart is still at 2.1kg, while my work m1 mac is at 1.3kg
> I think they are doing it for the premium feel - it is extremely sturdy
It's not merely a feel; I've successfully thrown it to the pavement more than once from ~1.5 meters and it's continued working well, whereas none of my previous laptops have gotten away scot-free from even one drop.
Apple does make repairability very hard, which I agree should be much more accessible.
So this is precisely what Apple did, and we can argue it was long time in the making. The funny part is that nobody expected x86 to make way for ARM chips, but perhaps this was a corporate bias stemming from Intel marketing, which they are arguably very good at.
Only if all you care about is having a laptop with really fast single-core performance. Anything that requires real grunt needs a workstation or server, which Apple silicon cannot provide.
There's probably a lot still missing: Apple integrated the memory on the same die, and built Metal for software to directly take advantage of that design. That's the competitive advantage of vertical integration.
It's on the same package but not the same die
It's made worse on the Strix Halo platform, because it's a performance first design, so there's more resource for Chrome to take advantage of.
The closest browser to Safari that works on Linux is Falkon. Its compatibility is even less than Safari's, so there are a lot of sites where you can't use it, but on the ones where you can, your battery usage can be an order of magnitude less.
I recommend using Thorium instead of Chrome; it's better but it's still Chromium under the hood, so it doesn't save much power. I use it on pages that refuse to work on anything other than Chromium.
Chrome doesn't let you suspend tabs, and as far as I could find there aren't any plugins to do so; it just kills the process when there aren't enough resources and reloads the page when you return to it. Linux does have the ability to suspend processes, and you can save a lot of battery life, if you suspend Chrome when you aren't using it.
I don't know of any GUI for it, although most window managers make it easy to assign a keyboard shortcut to a command. Whenever you aren't using Chrome but don't want to deal with closing it and re-opening it, run the following command (and ignore the name, it doesn't kill the process):
killall -STOP google-chrome
When you want to go back to using it, run: killall -CONT google-chrome
This works for any application, and the RAM usage will remain the same while suspended, but it won't draw power reading from or writing to RAM, and its CPU usage will drop to zero. The windows will remain open, and the window manager will handle them normally, but what's inside won't update, and clicks won't do anything until resumed. https://birchtree.me/blog/everyone-says-chrome-devastates-ma...
That might be different on other platforms
notebookcheck.com does pretty comprehensive battery and power efficiency testing - not of every single device, but they usually include a pretty good sample of the popular options.
Most Linux distributions are not well tuned, because this is too device-specific. Spending a few minutes writing custom udev rules, with the aid of powertop, can reduce heat and power usage dramatically. Another factor is Safari, which is significantly more efficient than Firefox and Chromium. To counter that, using a barebones setup with few running services can get you quite far. I can get more than 10 hours of battery from a recent ThinkPad.
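For illustration, this is roughly the kind of rules file that powertop's suggestions translate into. It's only a sketch: the file name is arbitrary, the attributes shown are common ones (SATA link power management, PCI runtime PM, USB autosuspend), and the right values for a given machine are whatever powertop's "Tunables" tab reports for it.

  # /etc/udev/rules.d/90-powersave.rules (hypothetical file name)
  # SATA: allow aggressive link power management
  ACTION=="add", SUBSYSTEM=="scsi_host", KERNEL=="host*", ATTR{link_power_management_policy}="med_power_with_dipm"
  # PCI: enable runtime power management for all devices
  ACTION=="add", SUBSYSTEM=="pci", ATTR{power/control}="auto"
  # USB: autosuspend idle devices
  ACTION=="add", SUBSYSTEM=="usb", TEST=="power/control", ATTR{power/control}="auto"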
The entire point here is that you can run whatever the hell you want on Apples stuff without breaking a sweat. I shouldn’t have to counter shit.
These are the kinds of optimizations that macOS does out of the box and you cannot expect most Linux users to do (which is one of the reasons battery life is so bad on Linux out-of-the-box).
It's also probably worth putting the laptop in "efficiency" mode (15W sustained, 25W boost per Framework). The difference in performance should be fairly negligible compared to balanced mode for most tasks, and it will use less energy.
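If the firmware exposes its platform profiles to the OS, one way to switch from Linux is power-profiles-daemon (assuming it is installed; the exact profile names depend on the machine):

  powerprofilesctl list            # show the profiles the firmware exposes
  powerprofilesctl set power-saver # roughly the "efficiency" setting
  powerprofilesctl get             # confirm the active profile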
"LPDDR5x-8000"
I agree that it's unfortunate that the power usage isn't better tuned out of the box. An especially annoying aspect of GNOME's "Power Saver" mode is that it disables automatic software updates, so you can't have both automatic updates and efficient power usage at the same time (AFAIK)
The cost is flexibility and I think for now they don't want to move to fixed RAM configurations. The X3D approach from AMD gets a good bunch of the benefits by just putting lots of cache on board.
Apple got a lot of performance out of not a lot of watts.
One other possibility on power saving is the way Apple ramps the clockspeed. Its quite slow to increase from its 1Ghz idle to 3.2Ghz, about 100ms and it doesn't even start for 40ms. With tiny little bursts of activity like web browsing and such this slow transition likely saves a lot of power at a cost of absolute responsiveness.
No, it's not. DRAM latency on Apple Silicon is significantly higher than on the desktop, mainly because they use LPDDR which has higher latencies.
Source: chipsandcheese.com memory latency graphs
Not necessarily. Running longer at a slower speed may consume more energy overall, which is why "race to sleep" is a thing. Ideally the clock would be completely stopped most of the time. I suspect it's just because Apple are more familiar with their own SoC design and have optimised the frequency control to work with their software.
On package memory increases efficiency, not speed.
However, most of the speed and efficiency advantages are in the design.
I don't think many people have appreciated just how big a change the 64 bit Arm was, to the point it's basically a completely different beast than what came before.
From the moment the iPhone went 64 bit it was clear this was the plan the whole time.
Windows does a lot of useless crap in the background that kills battery and slows down user-launched software
One of the things Apple has done is to create a wider core that completes more instructions per clock cycle for performance while running those cores at conservative clock speeds for power efficiency.
Intel and AMD have been getting more performance by jacking up the clock speeds as high as possible. Doing so always comes at the cost of power draw and heat.
Intel's Lunar Lake has a reputation for much improved battery life, but also reduces the base clock speed to around 2 gigahertz.
The performance isn't great vs the massively overclocked versions, but at least you get decent battery life.
How many iterations to match Apple?
In day to day usage the strix halo is significantly faster, and especially when large context LLM and games are used - but also typical stuff like Lightroom (gpu heavy) etc.
on the flip side the m4 battery life is significantly longer (but also the mbp is approx 1/4 heavier)
for what it's worth i also have a t14 with a snapdragon X elite and while its battery is closer to a mbp, it's just kinda slow and clunky.
so my best machine right now is the x86 actually!
yes and no. i have macbook pro m4 and a zbook g1a (ai max 395+ ie strix halo)
You're comparing the base M4 to a full fat Strix Halo that costs nearly $4,000. You can buy the base M4 chip in a Mac Mini for $500 on sale. A better comparison would be the M4 Max at that price. Here's a comparison I did between Strix Halo, M4 Pro, M4 Max: https://imgur.com/a/yvpEpKF
As you can see, Strix Halo is behind M4 Pro in performance and severely behind in efficiency. In ST, M4 Pro is 3.6x more efficient and 50% faster. It's not even close to the M4 Max.
> (but also the mbp is approx 1/4 heavier)
Because it uses a metal enclosure. You don't own any of the machines but have "made" a comparison by copying data from the internet, I assume.
This is like explaining to someone who eats a sweet apple that the internet says the apple isn't sweet.
MacBook Pro, 2TB, 32gb, 3200 EUR
HP G1a, 2TB, 128gb, 3700 EUR
If we don't compare laptops but mini-PCs,
Evo X2, 2TB, 128gb, 2000 EUR,
Mac Mini, 2TB, 32gb, 2200 EUR
They’re not arguing against their subjective experience using it, they’re arguing against the comparison point as an objective metric.
If you’re picking analogies, it’s like saying Audis are faster than Mercedes but comparing an R8 against an A class.
2. I'd say apples and oranges is subjective and depends on what is important to you. If you're interested in Vitamin C, apples to oranges is a valid comparison. My interest in comparing this is for running local coding LLMs - and it is difficult to get great results on 24/32gb of Nvidia VRAM (though that is by far the fastest option per dollar if your model fits into a 5090). For the models I work with you often need 128gb of RAM, therefore I'd compare a Mac Studio 128gb (cheapest option from Apple for a 128gb RAM machine) with a 395+ (cheapest (only?) option for x86/Linux). So what is apples to oranges to you makes sense to many other people.
3. Why would you think a 395+ and an M4 Pro are in "a different class"?
They have a MacBook Pro with an M4, not an M4 Pro. That is a wildly different class of SoC from the 395. Unless the 395 is also capable of running in fanless devices without issue.
For your first point, yes it does matter if the discussion is about objectively trying to understand why things are faster or not. Subjective opinions are fine, but they belong elsewhere. My grandma finds her Intel celeron fast enough for her work, I’m not getting into an argument with her over whether an i9 is faster for the same reason.
Your second point is equally subjective, and out of place in a discussion about objectively trying to understand what makes the performance difference.
> You don't own any of the machines but have "made" a comparison by copying data from the internet I assume.
> This is like explaining to someone who eats a sweet apple that the internet says the apple isn't sweet.
Yea, I never said he is wrong in his own experience. I was pointing out that the comparison is made between a base M4 and a maxed-out Ryzen. If we want to compare products in the same class, then use the M4 Max.
MacBook Pro, 2TB, 32gb, 3200 EUR
A little disingenuous to max out on the SSD to make the Apple product look worse. SSD prices are bad value on Apple products. No one is denying that.
You: "You're comparing the base M4 to a full fat Strix Halo that costs nearly $4,000."
Then
You: "A little disingenuous to max out on the SSD to make the Apple product look worse."
I didn't "max out" the SSD, I chose an SSD to match the machine of the user.
Why don't you try to match CPU speed, GPU speed, NPU speed, noise, battery life, etc.? Why match SSD only? That's why your post was disingenuous.
If it helps you focus on the actual discussion: we are comparing maximum CPU and GPU speeds for the dollar. That's it.
Mac Studio, 128gb, 4400 EUR
And of course, the Mac Studio itself is a much more capable box with things like Thunderbolt5, more ports, quieter, etc.
I can see why some people would choose the AMD solution. It runs x86, works well with Linux, can play DirectX games natively, and is much cheaper.
Meanwhile, the M4 Max performs significantly better, is more efficient, is likely much quieter, runs macOS, has more ports and better build quality, and comes with Apple backing and support.
You: "If it helps you focus on what the actual discussion, we are comparing maximum CPU and GPU speeds for the dollar."
You: "Mac Studio itself is a much more capable box with things like Thunderbolt5, more ports, quieter"
Adding to that, it is very picky about which power brick it accepts (not every 140W PD compliant works) and the one that comes with the laptop is bulky and heavy. I am used to plugging my laptop into whatever USB-C PD adapter is around, down to 20W phone chargers. Having the zbook refuse to charge on them is a big downgrade for me.
It's Dell, they are probably not actually using PD 3.1 to achieve the 140W mark; instead they are probably using the PD 3.0 extension and shoving 20V 7A into the laptop. I can't find any info, but you can check on the charger.
If it lists 28V then it's 3.1, else 3.0. If it's 3.1 you can get a Baseus PowerMega 140W PD3.1, seems like a reeeeally solid charger from my limited use.
With some of the other 28V 5A adapters I have, it charges until triggering a compute heavy task and then stops. I have seen reports online of people seeing this behavior with the official adapter. My theory is that the laptop itself does not accept any ripple at all.
Why are you asking me? I'm not in charge of AMD.
Yes the Strix Halo is not as fast on the benchmarks as the M4 Max, its bandwidth is lower, and the max config has less memory. However, it is available in a lot of different configurations and some are much cheaper than comparable M4 systems (e.g. the maxed out Framework desktop is $2000.) It's a tradeoff, as everything in life is. No need to act like such an Apple fanboi.
> Why are you asking me? I'm not in charge of AMD.
Because you claimed this, so I thought you knew: "A few iterations of this should be comparable to the M series"
The primary reason is the ST speed (snappy feeling) and the efficiency (no noise, cool, long battery life).
It just so happens that Cinebench 2025 is the only power measurement metric I have available via Notebookcheck. If Notebookcheck did power measurements for GB6, I'd rather use that as it's a better CPU benchmark overall.
Cinebench 2025 is a decent benchmark but not perfect. It does a good enough job of demonstrating why the experience of using Apple Silicon is so much better. If we truly want to measure the CPU architecture like a professional, we would use SPEC and the measure power from the wall.
Until AMD can build a tailor-made OS for their chips and build their own laptops.
https://browser.geekbench.com/v6/cpu/compare/13494385?baseli...
M4 Max is still faster. Note that the M4 Max is only given 14 out of 16 cores, likely reserving 2 of them for macOS.
How do you explain this when Windows has zero Apple Silicon optimizations?
GB correlates highly with SPEC. AMD also uses GB in their official marketing slides.
This assumes Apple's M series performance is a static target. It is not. Apple is iterating too.
Also, especially the MacBook Pros have really large batteries, on average larger than the competition. This increases the battery runtime.
Fanless x86 desktops are a thing too, in the form of thin clients and small PCs intended for business use. I have a few HP T630s I use as servers (I have used them as desktop PCs too, but my tab-hoarding habit makes them throttle a bit too much for my use - they'd be fine for a lot of people).
Sure you can, they’re readily available on the market, though not especially common.
But even performance laptops can often be run without spinning their fans up at all. Right now, the ambient temperature where I live is around 28°C, and my four-year-old Ryzen 5800HS laptop hasn't used its fan all day, though for a lot of that time it will have been helped by a ceiling fan. But even away from a fan for the last half hour, it sits in my lap only warm, not hot. It's easy enough to give it a load it'll need to spin the fan up for, but you can also limit it so it will never need its fan. (In summer when the ambient temperature is 10°C higher every day, you'll want to use its fan even when idling, and it'll be hard to convince it not to spin them up.)
x86-64 devices that don't even have fans won't ever have such powerful CPUs, and historically have always been very underpowered: think only 60% of my 5800HS's single-threaded benchmark score and only 20% of its multithreaded, but at under 20% of the peak power consumption.
Apple, unlike a lot, if not all large companies (who are run by MBA beancounter morons), holds insanely large amounts of cash. That is how they can go and buy up entire markets of vendors - CNC mills, TSMC's entire production capacity for a year or two, specialized drills, god knows what else.
They effectively price out all potential competitors at once for years at a time. Even if Microsoft or Samsung would want to compete with Apple and make their own full aluminium cases, LED microdots or whatever - they could not because Apple bought exclusivity rights to the machines necessary.
Of course, there's nothing stopping Microsoft or Samsung to do the same in theory... the problem these companies have is that building the war chest necessary would drag down their stonk price way too much.
https://web.archive.org/web/20201108182313/http://atomicdeli...
https://www.capitaladvisors.com/research/war-chest-exploring...
They just don’t want to bet they can deploy it successfully in the hardware market to compete with Apple, so they focus on other things (cloud services, ads, media, etc).
Microsoft has a bit more hardware sales exposure from its consoles, but not for PCs. They don't have a need for revolutionary "it looks cool" stuff that Apple has.
Amazon, same thing. They brand their own products as the cheap baseline, again no need.
And Meta? All they do is VR stuff, and they did invest tons of money into that.
You call five hours good?! Damn... For productivity use, I'd never buy anything below shift-endurance (eight hours or more).
But then the numbers are hardly comparable without having comparable workloads. If I were regularly running builds or had some other moderate load throughout a working day, that'd probably cost a couple of hours.
Wherever possible, I send “pkill -STOP” to all those processes, and stall them and thus save battery…
I half wonder if that’s part of the issue with Windows PCs and their battery life. The OS requires so much extra monitoring just to protect itself that it ends up affecting performance and battery life significantly. It wouldn’t be surprising to me if this alone was the major performance boost Macs have over Windows laptops.
It is incredible that crowdstrike is still operating as a business.
It is also hard to understand why companies continue to deploy shoddy, malware-like "security" software that decreases reliability while increasing the attack surface.
Basically you need another laptop just to run the "security" software.
If operating systems weren't as poop as they are today, this would not be necessary - but here we are. And I bet you major OS manufacturers will not really fix their OSes without ensuring its just a fully walled garden (terrible for devs.. but you'll probably just run a linux vm for dev on top..). Bad intents lead to bad software.
The only portable M device I heavily used on the go was my iPad Pro.
That thing could survive for over a week if not or lightly used. But as soon as you open Lightroom to process photos, the battery would melt away in an hour or two.
But I can imagine some people have different needs and may not have access to (enough) power outlets. Some meeting/conference rooms had only a handful outlets for dozens of people. Definitely nice to survive light office work for a full working day.
I feel like I've tried several times to get this working in both Linux and Windows on various laptops and have never actually found a reliable solution (often resulting in having a hot and dead laptop in my backpack).
I ended up moving to hybrid, where it suspends for an hour allowing immediate wake up then hibernates completely. It’s a decent compromise and I’ve never once had an issue with resume from suspend or hibernate, nor have I ever had an issue with it randomly waking up and frying itself in a backpack or unexpectedly having a dead battery.
My work M1 is still superior in this regard but it is an acceptable compromise.
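For anyone wanting to replicate that hybrid setup on a systemd-based distro, the usual knobs look roughly like this (a sketch; the one-hour delay and the lid-switch handling are choices to adapt, not defaults):

  # /etc/systemd/sleep.conf
  [Sleep]
  HibernateDelaySec=1h

  # /etc/systemd/logind.conf  (make closing the lid use the hybrid path)
  [Login]
  HandleLidSwitch=suspend-then-hibernate

  # or trigger it manually:
  systemctl suspend-then-hibernate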
Windows laptops are still worse, but i appreciate Apple continuing to give me reasons to hate them
As a layman there’s no way I’m running something called “Pop!_OS” versus Mac OS.
I like macOS fine, I have been using Macs since 1984 (though things like SIP grate).
RAM in particular can be a big performance bottleneck. Apple M has way better bandwidth than most x86 CPUs; having well-specified RAM chips soldered right next to the CPU instead of having to support DIMM modules certainly helps. AMD AI MAX chips, which also have great memory bandwidth and are the most comparable to Apple M, also use soldered RAM.
Maybe some details like ARM having a more efficient instruction decoder plays a part, but I don't believe it is that significant.
It runs Debian headless now (I didn't have particular use for a laptop in the first place). Not sure just how unpopular this suggestion'd be, but I'd try booting Windows on the laptop to get an idea of how it's supposed to run.
First, OP is talking about Chrome, which is not Apple software. And I can testify that I observed the same behavior with other software that is really not optimized for macOS, or even at all. JetBrains IDEs are fast on M*.
Also, processor manufacturers are contributors to the Linux kernel and have an economic interest in making Linux run as fast as it can on their platforms if they want to sell them to datacenters.
I think it's something else. Probably the unified memory?
I remember disassembling Apple’s memcpy function on ARM64 and being amazed at how much customization they did just for that little function to be as efficient as possible for each length of a (small) memory buffer.
https://github.com/bminor/glibc/tree/master/sysdeps/aarch64/...
and there are five versions specialised for either specific CPU models or for available architecture features.
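As a rough illustration of the dispatch mechanism (not glibc's actual code; all names here are made up), a GNU ifunc lets the dynamic loader pick one implementation per CPU when the program starts:

  /* my_memcpy.c - ifunc dispatch sketch; builds with GCC/Clang on Linux/ELF */
  #include <stddef.h>
  #include <stdio.h>

  static void *memcpy_bytewise(void *dst, const void *src, size_t n)
  {
      unsigned char *d = dst;
      const unsigned char *s = src;
      while (n--) *d++ = *s++;
      return dst;
  }

  static void *memcpy_fastpath(void *dst, const void *src, size_t n)
  {
      /* stand-in for a SIMD / model-specific variant */
      return memcpy_bytewise(dst, src, n);
  }

  /* runs once at load time; a real resolver would inspect CPU features */
  static void *(*resolve_my_memcpy(void))(void *, const void *, size_t)
  {
      return memcpy_fastpath;
  }

  void *my_memcpy(void *dst, const void *src, size_t n)
      __attribute__((ifunc("resolve_my_memcpy")));

  int main(void)
  {
      char buf[16];
      my_memcpy(buf, "hello ifunc", 12);
      puts(buf);
      return 0;
  }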
You can run Linux on a MacBook Pro and get similar power efficiency.
Or run third party apps on macOS and similarly get good efficiency.
What? No. Asahi is spectacular for what it accomplished, but battery life is still far worse than macOS.
I am not saying that it is only software. It's everything from hardware to a gazillion optimizations in macOS.
The things where it lags are anything that use hardware acceleration or proper lowering to the lower power states.
edit: whoever downvoted - please explain, what's wrong with preferring VMWare? also, for me, historically (2007-2012), it's been more performant, but didn't use it lately.
Also, here's proof that M4 Max running Parallels is the fastest Windows laptop: https://browser.geekbench.com/v6/cpu/compare/13494385?baseli...
M4 Max is running macOS running Parallels running Windows and is only using 14 out of 16 possible cores and it's still faster than AMD's very best laptop chip.
In the x86 laptop space the 'big' vendors like Dell, HP, Asus, Lenovo, etc. can do that sort of thing. Framework doesn't have the leverage yet. Linux is an issue too because that community isn't aligned either.
Alignment is facilitated by mutual self interest, vendors align because they want your business, etc. The x86 laptop industry has a very wide set of customer requirements, which is also challenging (need lots of different kinds of laptops for different needs).
The experience is especially acute when one's requirements for a piece of equipment have strayed from the 'mass market' needs so the products offered are less and less aligned with your needs. I feel this acutely as laptops move from being a programming tool to being an applications product delivery tool.
However, this doesn't really hold up as the cause for the difference. The Zen4/5 chips, for example, source the vast majority of their instructions out of their uOp trace cache, where the instructions have already been decoded. This also saves power - even on ARM, decoders take power.
People have been trying to figure out the "secret sauce" since the M chips have been introduced. In my opinion, it's a combination of:
1) The apple engineers did a superb job creating a well balanced architecture
2) Being close to their memory subsystem with lots of bandwidth and deep buffers so they can use it is great. For example, my old M2 Pro macbook has more than twice the memory bandwidth than the current best desktop CPU, the zen5 9950x. That's absurd, but here we are...
3) AMD and Intel heavily bias on the costly side of the watts vs performance curve. Even the compact zen cores are optimized more for area than wattage. I'm curious what a true low power zen core (akin to the apple e cores) would do.
A uop cache made sense with 32-bit support because the 32-bit ISA was so complex (though still simple compared to x86). Once they went to a simplified instruction design, the cost of decoding every single time was lower than the cost of maintaining the uop cache.
Maybe run Geekbench 6 and see.
Closest I've seen is an uncited Reddit thread talking about usb c charging draw when running a task, conflating it with power usage.
ARM has better /security/ though - not only does it have more modern features, but x86's variable-length instructions also mean you can reinterpret them by jumping into the middle of one.
- Try running powertop to see if it says what the issue is.
- Switch to firefox to rule out chrome misconfigurations.
- If this is wayland, try x11
I have an AMD SoC desktop and it doesn't spin up the fans or get warm unless it's running a recent AAA title or an LLM. (I'm running Devuan because most other distros I've tried aren't stable enough these days.)
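On the powertop suggestion above, the quickest first pass looks something like this (assuming powertop is installed; it's worth reviewing what --auto-tune changes before relying on it):

  sudo powertop                     # interactive view of the top power consumers
  sudo powertop --auto-tune         # apply all of its suggested tunables
  sudo powertop --html=report.html  # write a one-off report to inspect later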
In scatterplots of performance vs wattage, AMD and Apple silicon are on the same curve. Apple owns the low end and AMD owns the high end. There’s plenty of overlap in the middle.
2. Much more cache
3. No legacy code
4. High frequencies (to be 1st in game benchmarks; see what happens when you're a little behind, like the last Intel launch: the perception is that Intel has bad CPUs because they are some percentage points behind AMD in games. That's pressure Apple doesn't have, since comparisons are mostly Apple vs. Apple and Intel vs. AMD)
The engineers at AMD are the same as at Apple, but both markets demand different chips and they get different chips.
For some time now the market has been talking about energy efficiency, and we see:
1. AMD soldering memory close to the CPU
2. Intel and AMD adding more cache
3. Talks about removing legacy instructions and bit widths
4. Lower out of the box frequencies
It will take more market pressure and more time though.
That's actually wild. I think we're in a kind of unique moment, but one that is good for Apple mainly, because their OS is so developer-hostile that I pay back all the performance gains with interest. T_T
I wonder what specs a MacBook would need to give me similar performance. For example, on Linux with 32 GB of RAM, I can sometimes have 4 or 5 instances of WebStorm open and forget about them running in the background. Could a MacBook with 16 GB of RAM handle that? Similarly, which MacBook processor would give me the real-world, daily-use performance I get from my 14700H? Should I continue using cheap and powerful Windows/Linux laptops in the future, or should I make the switch to a MacBook?
(Translated from my native language to English using Gemini.)
I don't like macOS, so in recent years, I only use it on laptop (which for me is like, a few on-site meetings per year, plus a few airplane flights). What infuriates me is that my mid-tier Mac laptop for those use cases is now significantly faster than any Linux workstation I can possibly buy... and positively annihilates any non-Apple laptop machine on essentially every meaningful benchmark.
The biggest quality of life issue for me personally is the trackpad. Although support for gestures and so on has gotten quite decent in Linux land, Parallels only sends the VM scroll wheel events, so there's no way to have smooth scrolling and swipe gestures inside the VM, so it feels much worse than native macOS or Asahi Linux running on the bare metal.
OTOH if you're fine with macOS GUI but you want something like WSL for CLI and server apps, there's https://lima-vm.io
In special cases, such as not caring about battery life, x86 can run circles around M1. If you allow the CPU rated for 400W to actually consume that amount of power, it's going to annihilate the one that sips down 35W. For many workloads it is absolutely worth it to pay for these diminishing returns.
- A properly written firmware. All Chromebooks are required to use Coreboot and have very strict requirements on the quality of the implementation set by Google. Windows laptops don't have that and very often have very annoying firmware problems, even in the best cases like Thinkpads and Frameworks. Even on samples from those good brands, just the s0ix self-tester has personally given me glaring failures in basic firmware capabilities.
- A properly tuned kernel and OS. ChromeOS is Gentoo under the hood and every core service is afaik recompiled for the CPU architecture with as many optimisations enabled. I'm pretty sure that the kernel is also tweaked for battery life and desktop usage. Default installations of popular distros will struggle to support this because they come pre-compiled and they need to support devices other than ultrabooks.
Unfortunately, it seems like Google is abandoning the project altogether, seeing as they're dropping Steam support and merging ChromeOS into Android. I wish they'd instead make another Pixelbook, work with Adobe and other professional software companies to make their software compatible with Proton + Wine, and we'd have a real competitor to the M1 Macbook Air, which nothing outside of Apple can match still.
This isn't the only no-show position I've heard about at Intel. That is why Intel cannot catch up. You probably cannot get away with that at Apple.
the build quality of surface laptop is superb also.
Intel provides processors for many vendors and many OS. Changing to a new architecture is almost impossible to coordinate. Apple doesn't have this problem.
Actually, in the 90s Intel and Microsoft wanted to move to a RISC architecture, but Compaq forced them to stay on x86.
Windows NT has always been portable, but it didn't provide any serious compatibility with Windows 4.x until 5.0. At that time, AMD released their 64-bit extension to x86. Intel wanted to build their own, and Microsoft went "haha no". By that point they had been dictating the CPU architecture.
I guess at that point there was very little reason to switch. Intel's Core happened; Apple even went to Intel to ask for a CPU for what would become the iPhone - but Intel wasn't interested.
Perhaps I'm oversimplifying, but I think it's complacency. Apple remained agile.
Your memory serves you wrong. The experience with Intel-based Macs was much worse than with recent AMD chips.
My 2019 Thinkpad T495 (Ryzen 3600) does get hot under load, but it's still fine to type on.
(Edit, I read lower in the thread that the software platform also needs to know how to make efficient use of this performance per watt, ie, by not taking all the watts you can get.)
[0] https://www.phoronix.com/review/ryzen-ai-max-395-9950x-9950x...
An Airbook sets me back €1000, enough to buy a used car, and AFAICT is much more difficult to get fully working Linux on than my €200 amd64 build.
Why hasn't apple caught up?
And he was right. Netbooks mostly sucked. Same with Chromebooks.
There’s nothing to be gained by racing to the bottom.
You can buy an m1 laptop for $599 at Walmart. That’s an amazing deal.
> You can buy ... for $599
Not sure why you'd think any random nerd has that kind of money. And Walmart isn't exactly around the corner for most parts of the world.
Precisely because of that they haven't caught up. They don't want to compete in the PC race to the bottom that nearly bankrupted them in the 90s, before they invented the iPod.
Apple got rich by creating its own markets.
My experience has been to the contrary. Moving to Linux a couple months ago from Windows doubled my battery life and killed almost all the fan noise.
My Apple friends get 12+ hrs of battery life. I really wish Lenovo+Fedora or whoever would get together and make that possible.
Now, 7.5 years later, the battery is not so healthy any more, and I'm looking around for something similar, and finding nothing. I'm seriously considering just replacing the battery. I'll be stuck with only 8GB RAM and an ancient CPU, but it still looks like the best option.
Another useful thing is that you can buy small portable battery packs that are meant for jump-starting car engines, and they have a 12V output (probably more like 14V), which could quite possibly be piped straight into the DC input of a laptop. My laptop asks for 19V, but it could probably cope with this.
It's not 8-12, and the fans do kick up. The track pad is fine but not as nice as the one on the MacBook. But I prefer to run Linux so the tradeoff is worth it to me.
That doesn't sound super secure to me.
> for five hours.
My experience with anything that is not designed to be an office is that it will be uncomfortable in the long run. I can't see myself working for 5 hours in that kind of place.
Also it seems it is quite easily solved with an external battery pack. They may not last 12 hours but they should last 4 to 6 hours without a charge in power-saving mode.
Don't you drink any coffee in the coffee shop? I hope you do. But, still, being there for /five/ hours is excessive.
I'm guessing you're well aware, but just in case you're not: Asahi Linux is working extremely well on M1/M2 devices and easily covers your "5 hours of work at a coffee shop" use case.
HP has Ubuntu-certified strix halo machines for example.
just... take your charger...
Apple often lets the device throttle before it turns on the fans for "better UX"; Linux plays no such mind games.
That being said, my M2 beats the ... out of my twice-as-expensive work laptop when compiling an Arduino project. Literal jaw drop the first time I compiled on the M2.
Apple M1: 23.3
Apple M4: 28.8
Ryzen 9 7950X3D (from 2023, best x86): 10.6
All other x86 CPUs were less efficient. The Apple CPUs also beat most of the respective same-year x86 CPUs in Cinebench single-thread performance.
[1] https://www.heise.de/tests/Ueber-50-Desktop-CPUs-im-Performa... (paywalled, an older version is at https://www.heise.de/select/ct/2023/14/2307513222218136903#&...)
If you actually benchmark said chips in a computational workload I'd imagine the newer chip should handily beat the old M1.
I find both windows and Linux have questionable power management by default.
On top of that, snappiness/responsiveness has very little to do with the processor and everything to do with the software sitting on top of it.
I'm learning x86 in order to build nice software for the Framework 12 (i3-1315U, Raptor Lake). Going into the optimization manuals for Intel's E-cores (apparently Atom) and AMD's 5c cores. The efficiency cores on the M1 MacBook Pro are awesome. Getting Debian or Ubuntu with KDE to run like that on a FW12 will be mind-boggling.
Note those Docker containers are running in a Linux VM!
Of course they are on Windows (WSL2) as well.
Windows on the other hand is horribly optimized, not only for performance, but also for battery life. You see some better results from Linux, but again it takes a while for all of the optimizations to trickle down.
The tight optimization between the chip, operating system, and targeted compilation all come together to make a tightly integrated product. However comparing raw compute, and efficiency, the AMD products tend to match the capacity of any given node.
However, with AMD Strix Halo aka AMD Ryzen AI Max+ 395 (PRO) there are Notebooks like the ZBook Ultra G1a and Tablets like the Asus ROG Flow Z13, that come close to the MacBook power / performance ratio[2] due to the fact, that they used high bandwidth soldered on memory, which allows for GPUs with shared VRAM similar to Apple's strategy.
Framework has not managed to put this chip in a notebook yet, but shipped a desktop variant. They also pointed out that there was no way to use LPCAMM2 or any other modular RAM tech with that machine, because it would have slowed it down / increased latencies to an unusable degree.
So I'm pretty sure the main reason for Apple's success is the deeply integrated architecture, and I'm hopeful that AMD's next-generation Strix Halo APUs might provide this with higher efficiency and that Framework adopts these chips in their notebooks. Maybe they just did in the 16?! Let's wait for this announcement: https://www.youtube.com/watch?v=OZRG7Og61mw
Regarding the deeply-thought-through integration there is a story I often tell: Apple used to make iPods. These had support for audio playback control with their headphone remotes (e.g. EarPods), which are still available today. These used a proprietary ultrasonic chirp protocol[3] to identify Apple devices and supported volume control and complex playback control actions. You could even navigate through menus via VoiceOver by long-pressing and then using the volume buttons to navigate. Even today, with their USB-C-to-audio-jack adapters, these still work on nearly every Apple device released after 2013, and the wireless earbuds also support parts of this. Android has tried to copy this tiny little engineering wonder, but to this day they have not managed to get it working[4]. They instead focus on their proprietary "long-press should work in our favour and start 'Hey Google'" thing, which is ridiculously hard to intercept / override in officially published Android apps... what a shame ;)
1: https://youtu.be/51W0eq7-xrY?t=773
2: https://youtu.be/oyrAur5yYrA
There are different kinds of transistors that can be used when making chips. There are slow, but efficient transistors and fast, but leaky transistors. Getting an efficient design is a balancing act where you limit use of the fast transistors to only the most performance critical areas. AMD historically has more liberally used these high performance leaky transistors, which enabled it to reach some of the highest clock frequencies in the industry. Apple on the other hand designed for power efficiency first, so its use of such transistors was far more conservative. Rather than use faster transistors, Apple would restrict itself to the slower transistors, but use more of them, resulting in wider core designs that have higher IPC and matched the performance of some of the best AMD designs while using less power. AMD recently adopted some of Apple’s restraint when designing the Zen 5c variant of its architecture, but it is just a modification of a design that was designed for significant use of leaky transistors for high clock speeds:
https://www.tomshardware.com/pc-components/cpus/amd-dishes-m...
The resulting clock speeds of the M4 and the Ryzen AI 340 are surprisingly similar, with the M4 at 4.4GHz and the Ryzen AI 340 at 4.8GHz. That said, the same chip is used in the Ryzen AI 350 that reaches 5.0GHz.
There is also the memory used. Apple uses LPDDR5X on the M4, which runs at lower voltages and has tweaks that sacrifice latency to an extent for a big savings in power. It also is soldered on/close to the CPU/SoC for a reduction needed in power to transmit data to/from the CPU. AMD uses either LPDDR5X or DDR5. I have not kept track of the difference in power usage between DDR versions and their LP variants, but expect the memory to use at least half the power if not less. Memory in many machines can use 5W or more just at idle, so cutting memory power usage can make a big impact.
Additionally, x86 has a decode penalty compared to other architectures. It is often stated that this is negligible, but those statements began during the P4 era when a single core used ~100W where a ~1W power draw for the decoder really was negligible. Fast forward to today where x86 is more complex than ever and people want cores to use 1W or less, the decode penalty is more relevant. ARM, using fixed length instructions and having a fraction of the instructions, uses less power to decode its instructions, since its decoder is simpler. To those who feel compelled to reply to repeat the mantra that this is negligible, please reread what I wrote about it being negligible when cores use 100W each and how the instruction set is more complex now. Let’s say that the instruction decoder uses 250mW for x86 and 50mW for ARM. That 200mW difference is not negligible when you want sub-1W core energy usage. It is at least 20% of the power available to the core. It does become negligible when your cores are each drawing 10W like in AMD’s desktops.
Apple also has taken the design choice of designing its own NAND flash controller and integrating it into its SoC, which provides further power savings by eliminating some of the power overhead associated with an external NAND flash controller. Being integrated into the SoC means that there is no need to waste power on enabling the signals to travel very far, which gives energy savings, versus more standard designs that assume a long distance over a PCB needs to be supported.
Finally, Apple implemented an innovation for timer coalescing in Mavericks that made a fairly big impact:
https://www.imore.com/mavericks-preview-timer-coalescing
On Linux, coalescing is achieved by adding a default slack (50 microseconds) to traditional Unix timers. This can be changed, but I have never seen anyone actually do that:
https://man7.org/linux/man-pages/man2/pr_set_timerslack.2con...
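By way of illustration of the Linux side, a process that can tolerate sloppy timers could widen its own slack like this (a sketch; the 100 ms value is arbitrary, and most software never bothers):

  /* timerslack.c */
  #include <sys/prctl.h>
  #include <stdio.h>

  int main(void)
  {
      /* let the kernel delay our timers by up to 100 ms so wakeups can be
         batched with other timers (the default slack is only 50 us) */
      if (prctl(PR_SET_TIMERSLACK, 100UL * 1000 * 1000, 0, 0, 0) != 0)
          perror("prctl");

      printf("timer slack is now %d ns\n",
             prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0));
      return 0;
  }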
That was done to retroactively support coalescing in UNIX/Linux APIs that did not support it (which were all of them). However, Apple made its own new API for event handling called Grand Central Dispatch that exposed coalescing in a very obvious way via the leeway parameter while leaving the UNIX/BSD APIs untouched, and this is now the preferred way of doing event handling on macOS:
https://developer.apple.com/documentation/dispatch/1385606-d...
Thus, a developer of a background service on macOS that can tolerate long delays could easily set the slack to multiple seconds, which would essentially guarantee it would be coalesced with some other timer, while a developer of a similar service on Linux could, but probably will not, since the scheduler slack is something that the developer would need to go out of his way to modify, rather than something in his face like the leeway parameter is with Apple's API. I did check how this works on Windows. Windows supports a similar per-timer delay via SetCoalescableTimer(), but the developer would need to opt into this by using it in place of SetTimer() and it is not clear there is much incentive to use it. To circle back to Chrome, it uses libevent, which uses the BSD kqueue on macOS. As far as I know, kqueue does not take advantage of timer coalescing on macOS, so the Mavericks changes would not benefit Chrome very much and the improvements that do benefit Chrome are elsewhere. However, I thought that the timer coalescing stuff was worthwhile to mention given that it applies to many other things on macOS.
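For comparison, opting into coalescing with GCD really is just one extra argument. A minimal sketch of a low-priority periodic task (the 60 s interval and 10 s leeway are made-up values):

  /* leeway.c - build on macOS with: clang leeway.c */
  #include <dispatch/dispatch.h>
  #include <stdio.h>

  int main(void)
  {
      dispatch_source_t t = dispatch_source_create(
          DISPATCH_SOURCE_TYPE_TIMER, 0, 0,
          dispatch_get_global_queue(QOS_CLASS_BACKGROUND, 0));

      /* fire every 60 s, but allow up to 10 s of leeway so the wakeup can be
         coalesced with other pending timers */
      dispatch_source_set_timer(t, DISPATCH_TIME_NOW,
                                60ull * NSEC_PER_SEC, 10ull * NSEC_PER_SEC);
      dispatch_source_set_event_handler(t, ^{ puts("periodic background work"); });
      dispatch_resume(t);

      dispatch_main();  /* parks the main thread; never returns */
  }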
Here's a video about it. Skip to 4:55 for battery life benchmarks. https://www.youtube.com/watch?v=ymoiWv9BF7Q
In single threaded CPU performance, M4 Pro is roughly 3.6x more efficient while also being 50% faster.
Is that your metric of performance? If so...
$ sudo cpufreq-set -u 50MHz
done!
It's a design choice.
Also, different Linux distros/DEs prioritize different things. Generally they prioritize performance over battery life.
That being said, I find Debian GNOME to be the best on battery life. I get 6 hours on an MSI laptop that has an 11th gen Intel processor and a battery with only 70% capacity left. It also stays cool most of the time (except gaming while being plugged in) but it does have a fan...
I would try it with Windows for a better comparison, or get into the weeds of getting Linux to handle the ryzen platform power settings better.
With Ubuntu properly managing fans and temps and clocks, I'll take it over the Mac 10/10 times.
Same. I just realized it's three years old; I've used it every day for hours and it still feels like the first day I got it.
They truly redeemed themselves on this, as their laptops had been getting worse and worse (keyboard fiasco, Touch Bar, ...).
Imagine that you made an FPGA do x86 work, and then you wanted to optimize libopenssl, or libgl, or libc. Would you restrict yourself to only modifying the source code of the libraries but not the FPGA, or would you modify the processor to take advantage of new capabilities?
For a made-up example, when the iPhone 27 comes out, it won't support booting iOS 26 or earlier, because the drivers necessary to light it up aren't yet published; and, similarly, it can have 3% less battery weight because they optimized the display controller to DMA more efficiently through changes to its M6 processor and the XNU/Darwin 26 DisplayController dylib.
Neither Linux, Windows, nor Intel have shown any capability to plan and execute such a strategy outside of video codecs and network I/O cards. GPU hardware acceleration is tightly controlled and defended by AMD and Nvidia who want nothing to do with any shared strategy, and neither Microsoft nor Linux generally have shown any interest whatsoever in hardware-accelerating the core system to date — though one could theorize that the Xbox is exempt from that, especially given the Proton chip.
I imagine Valve will eventually do this, most likely working with AMD to get custom silicon that implements custom hardware accelerations inside the Linux kernel that are both open source for anyone to use, and utterly useless since their correct operation hinges on custom silicon. I suspect Microsoft, Nintendo, and Sony already do this with their gaming consoles, but I can’t offer any certainty on this paragraph of speculation.
x86 isn't able to keep up because x86 isn't updated annually across software and hardware alike. M1 is what x86 could have been if it was versioned and updated without backwards compatibility as often as Arm was. It would be like saying "Intel's 2026 processors all ship with AVX-1024 and hardware-accelerated DMA, and the OS kernel (and apps that want the full performance gains) must be compiled for its new ABI to boot on it". The wreckage across the x86 ecosystem would be immense, and Microsoft would boycott them outright to try and protect itself from having to work harder to keep up, just like Adobe did with Apple M1, at least until their userbase started canceling subscriptions en masse.
That's why there are so many Arm Linux architectures: for Arm, this is just a fact of everyday life, and that's what gave the M1 such a leg up on x86: not having to support anything older than your release date means you can focus on the sort of boring incremental optimizations that wouldn't be permissible in the "must run assembly code written twenty years ago" environment assumed by Lin/Win today.
The GPU is significantly different from other desktop GPUs but it's in principle like other mobile GPUs, so not sure how much better Linux could be adapted there.
TIL:
https://en.wikipedia.org/wiki/Monopole_(company)#Racing_cars
I was kind of hoping that there was some little-known x84 standard that never saw the light of day, but instead all I found was classic French racing cars.
The only real annoying thing I've found with the P14s is the Crowdstrike junk killing battery life when it pins several cores at 100% for an hour. That never happened in MacOS. These are corporate managed devices I have no say in, and the Ubuntu flavor of the corporate malware is obviously far worse implemented in terms of efficiency and impact on battery life.
I recently built myself a 7970X Threadripper and it's quite good perf/$ even for a Threadripper. If you build a gaming-oriented 16c ryzen the perf/$ is ridiculously good.
No personal experience here with Frameworks, but I'm pretty sure Jon Blow had a modern Framework laptop he was ranting a bunch about on his coding live streams. I don't have the impression that Framework should be held as the optimal performing x86 laptop vendor.
Oh you've gotten lucky then. Or somehow disabled crowdstrike.
https://www.crowdstrike.com/en-us/blog/crowdstrike-supports-...
They happily implement a userland version on macOS, but then claimed that being in the kernel is absolutely necessary on Windows after they disabled all Windows machines using it.
I've got the Framework 13 with the Ryzen 5 7640U and I routinely have dozens of tabs open, including YouTube videos, docker containers, handful of Neovim instances with LSPs and fans or it getting hot have never been a problem (except when I max out the CPU with heavy compilation).
The issue you're seeing isn't because x86 is lacking but something else in your setup.
Looking beyond Apple/Intel, AMD recently came out with a cpu that shares memory between the GPU and CPU like the M processors.
The Framework is a great laptop - I'd love to drop a mac motherboard into something like that.
In terms of performance though, those N4P Ryzen chips have knocked it out of the park for my use-cases. It's a great architecture for desktop/datacenter applications, still.
Who here would be interested in testing a distro like debian with builds optimized for the Framework devices?
Once you normalize for either efficiency cores or performance cores, you'll quickly realize that the node lead is the largest advantage Apple had. Those guys were right, the writing was on the wall in 2019.
A big thing is storage. Apple uses extremely fast storage directly attached to the SoC and physically very very close. In contrast, most x86 systems use storage that's socketed (which adds physical signal runtime) and that goes via another chip (southbridge). That means, unlike Mac devices that can use storage as swap without much practical impact, x86 devices have a serious performance penalty.
Another part of the issue when it comes to cooling is that Apple is virtually the only laptop manufacturer that makes solid full aluminium frames, whereas most x86 laptops are made out of plastic and, for higher-end ones, magnesium alloy. That gives Apple the advantage of being able to use the entire frame to cool the laptop, allowing far more thermal input before saturation occurs and the fans have to activate.
Why would PCIe SSDs need to go through a southbridge? The CPU itself provides PCIe lanes that can be used directly.
> That means, unlike Mac devices that can use storage as swap without much practical impact, x86 devices have a serious performance penalty.
Swap is slow on all hardware. No SSD comes close to the speed of RAM - not even Apple's. Latency is also significantly worse when you trigger a page fault and then need to wait for the page to load from disk before the thread can resume execution.
It does, but if you look at the mainboard manuals of computers, usually it's 32 lanes of which 16 go to the GPU slot and 16 to the southbridge, so no storage directly attached to the CPU. Laptops are just as bad.
Intel has always done price segmentation with the number of PCIe lanes exposed to the world.
Threadripper AMD CPUs are a different game, but I'm not aware of anyone, even "gamer" laptops, sticking such a beast into a portable device.
> Latency is also significantly worse when you trigger a page fault and then need to wait for the page to load from disk before the thread can resume execution.
Indeed, but the difference in performance between an 8GB Windows laptop and an 8GB M-series Apple laptop is noticeable, even if all it's running is the base OS and Chrome with a few dozen tabs.
Why would the southbridge need a whole 16 lanes? That's 32 GB/s of bandwidth (or 64, if PCIe 5). My (AMD) motherboard has the GPU and two M.2 sockets connected directly to the CPU and it's one of the cheaper ones. No idea about my laptop but I expect it to be similar because it's also AMD. Intel is obviously different here because they're more stingy with PCIe lanes.
There should be no reason for a laptop with only an integrated GPU to dangle storage off the southbridge. They take at most 4 lanes and can work with less.
> Indeed, but the difference in performance between an 8GB Windows laptop and an 8GB M-series Apple laptop is noticeable, even if all it's running is the base OS and Chrome with a few dozen tabs.
Any Windows laptop that comes with 8GB of RAM is going to have a crappy SSD included because those are always built to be cheap, not performant. It could even be a SATA SSD (500MB/s bandwidth max). Most likely they'd come with a processor significantly slower and a decent chance the RAM would also be single channel, too.
AFAIK that's not the case at least on AMD (not Threadripper, but the mainstream AM5 socket). They have 28 lanes of which 16 go to the GPU slot, 4 go to the southbridge, 4 are dedicated to M.2 NVMe storage, and 4 go to either another PCIe slot or another M.2 NVMe storage. See for a random example this motherboard manual https://download.asrock.com/Manual/B650M-HDVM.2.pdf which has a block diagram on page 8 (page 12 of the PDF).
ARM is great. Those M machines are the only thing I could buy used and put Linux on.
This hasn't been true for decades. Mainframes are fast because they have proprietary architectures that are purpose-built for high throughput and redundancy, not because they're RISC. The pre-eminent mainframe architecture these days (z/Architecture) is categorized as CISC.
Processors are insanely complicated these days. Branch prediction, instruction decoding, micro-ops, reordering, speculative execution, cache tiering strategies... I could go on and on but you get the idea. It's no longer as obvious as "RISC -> orthogonal addressing and short instructions -> speed".
Very much so. It's largely a register-memory (and indeed memory-memory) rather than load-store architecture, and a direct descendant of the System/360 from 1964.
From this we can infer that for most normal workloads, almost 22% of the Haswell core power was used in the decoder. As decoders have gotten wider and more complex in recent designs, I see no reason why this wouldn't be just as true for today's CPUs.
[0] https://www.usenix.org/system/files/conference/cooldc16/cool...
Even though this was the case for the most part during the entire history of PPC Macs (I owned two during these years)
Their claim that ARM decoders are just as complex wasn't true then and is even less true now. ARM reduced decoder size 75% from A710 to A715 by dropping legacy 32-bit stuff. Considering that x86 is way more complex than 32-bit ARM, the difference between an x86 and ARM decoder implementation is absolutely massive.
They abuse the decoder power paper (and that paper also draws a conclusion its own data doesn't support). The data shows that some 22% of total core power is used by the decoder for integer/ALU workloads. As 89% of all instructions in the entire Ubuntu repos are just 12 integer/ALU instructions, we can infer that the power cost of the decoder is significant (I'd consider nearly a quarter of the total power budget to be significant anyway).
The x86 decoder situation has gotten worse with Golden Cove (with 6 decoders) being infamous for its power draw and AMD fearing power draw so much that they opted for a super-complex dual 4-wide decoder setup. If the decoder power didn't matter, they'd be doing 10-wide decoders like the ARM designers.
The claim that ARM uses uops too is somewhere between a red herring and false equivalency. ARM uops are certainly less complex to create (otherwise they'd have kept around the uop cache) and ARM instructions being inherently less complex means that uop encoding is also going to be more simple for a given uarch compared to x86.
They then have an argument that proves too much when they say ARM has bloat too. If bloat doesn't matter, why did ARM make an entirely new ISA that ditches backward compatibility? Why take any risk to their ecosystem if there's no reward?
They also skip over the fact that objectively bad design exists. NOBODY out there defends branch delay slots. They are universally considered an active impediment to high-performance designs with ISAs like MIPS going so far as to create duplicate instructions without branch delay slots in order to speed things up. You can't argue that ISA definitely matters here, but also argue that ISA never makes any difference at all.
The "all ISAs get bloated over time" argument is sheer ignorance. x86 has roots going back to the early 1970s, before we'd figured out computing. All the basics of CPU design are now stable and haven't really changed in 30+ years. x86 has x87, which has 80 bits because IEEE 754 didn't exist yet. Modern ISAs aren't repeating that mistake. x86 having 8 registers isn't a mistake they are going to make. Neither is 15 different 128-bit SIMD extensions or any of the many other bloated mess-ups x86 has made over the last 50+ years. There may be mistakes, but they are almost certainly going to be on fringe things. In the meantime, the core instructions will continue to be superior to x86 forever.
They also fail to address implementation complexity. Some of the weirdness of x86 like tighter memory timing gets dragged through the entire system complicating things. If this results in just 10% higher cost and 10% longer development time, that means a RISC company could develop a chip for $5.4B over 4.5 years instead of $6B over 5 years which represents a massive savings and a much lower opportunity cost while giving a compounding head-start on their x86 competitor that can be used to either hit the market sooner or make even larger performance jumps each generation.
Finally, optimizing something like RISC-V code is inherently easier/faster than optimizing x86 code because there is less weirdness to work around. RISC-V basically just has one way to do something and it'll always be optimized while x86 often has different ways to do the same thing and each has different tradeoffs that make sense in various scenarios.
As to PPC, Apple didn't sell enough laptops to pay for Motorola to put enough money into the designs to stay competitive.
Today, Apple macbooks + phones move nearly 220M chips per year. For comparison, total laptop sales last year were around 260M. If Apple had Motorola make a chip today, Motorola would have the money to build a PPC chip that could compete with and surpass what x86 offers.
And don’t forget that Apple can do things like completely remove all of the hardware that supports 32 bit code and tell developers to just deal with it.
https://www.intel.com/content/www/us/en/developer/articles/t...
Change TDP, TDC, etc. and the fan curves if you don't like the thermal behavior. Your Ryzen has a low enough power draw that you could even cool it passively. It has a lower power-draw ceiling than your M1 Pro while exceeding it in raw performance.
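If you want to experiment, one common route on AMD mobile chips is the ryzenadj tool (a sketch; the limits below are illustrative, values are in mW, and not every board exposes every knob):

  sudo ryzenadj --stapm-limit=15000 --fast-limit=25000 --slow-limit=18000
  sudo ryzenadj --tctl-temp=85   # lower the temperature target so the fans stay quiet
  sudo ryzenadj --info           # read back the currently applied limits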
Also comparing chips based on transistor density is mostly pointless if you don't also mention die size (or cost).
I did ask an LLM for some stats about this. According to Claude Sonnet 4 through VS Code (for what that's worth), my MacBook's display can consume the same or even more power than the CPU does for "office work". Yet my M1 Max 16" seems to last a good while longer than whatever it was I got from work this year. I'd like to know how those stats are produced (or are they hallucinated...). There doesn't seem to be a way to get the display's power usage on M-series Macs, so you'd need to devise a testing regime with the display off and at 100% brightness to get some indication of its effect on power use.