Last week, Apple made industry news by announcing new Mac products based upon the company’s new Apple Silicon M1 SoC chip, marking the first move of a planned 2-year roadmap to transition over from Intel-based x86 CPUs to the company’s own in-house designed microprocessors running on the Arm instruction set.

During the launch we had prepared an extensive article based on the company’s closely related Apple A14 chip, found in the new-generation iPhone 12 phones. It includes a rather extensive microarchitectural deep-dive into Apple’s new Firestorm cores, which power both the A14 and the new Apple Silicon M1. I would recommend a read if you haven’t had the opportunity yet.

For the past few days, we’ve had our hands on one of the first Apple Silicon M1 devices: the new Mac mini 2020 edition. While in our analysis article last week we had based our numbers on the A14, this time around we’ve measured the real performance on the actual new higher-power design. We haven’t had much time, but we’ll be bringing you the key datapoints relevant to the new Apple Silicon M1.

Apple Silicon M1: Firestorm cores at 3.2GHz & ~20-24W TDP?

During the launch event, one thing that was, in typical Apple fashion, missing from the presentation was any actual detail on the clock frequencies of the design, as well as the TDP it can sustain at maximum performance.

We can confirm that in single-threaded workloads, Apple’s Firestorm cores now clock in at 3.2GHz, a roughly 6.7% increase over the 3GHz frequency of the Apple A14. As long as there's thermal headroom, this clock also applies to all-core loads, where in addition to the 4x 3.2GHz performance cores we also see the 4x Icestorm efficiency cores at 2064MHz, also quite a lot higher than the 1823MHz on the A14.
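
As a quick sanity check on the clock uplifts quoted above, the relative increases can be recomputed from the stated frequencies. This is a minimal sketch; the helper name is ours, and the 3000MHz input is the rounded A14 figure from the text.

```python
def percent_increase(new_mhz: float, old_mhz: float) -> float:
    """Relative increase of new_mhz over old_mhz, in percent."""
    return (new_mhz / old_mhz - 1.0) * 100.0


# Frequencies in MHz as quoted in the text (3000 is the rounded A14 clock).
performance_uplift = percent_increase(3200, 3000)
efficiency_uplift = percent_increase(2064, 1823)

print(f"Performance cores: +{performance_uplift:.2f}%")
print(f"Efficiency cores:  +{efficiency_uplift:.2f}%")
```

The efficiency cores see the larger relative bump of the two, which matches the "quite a lot higher" remark.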

Alongside the four performance Firestorm cores, the M1 also includes four Icestorm cores which are aimed at low idle power and increased power efficiency for battery-powered operation. Both the 4 performance cores and 4 efficiency cores can be active in tandem, meaning that this is an 8-core SoC, although performance throughput across all the cores isn’t identical.

The biggest question during the announcement event was the power consumption of these designs. Apple had presented several charts including performance and power axes; however, we lacked the comparison data needed to come to any proper conclusion.

As we had access to the Mac mini rather than a MacBook, power measurement was rather simple, as we can just hook up a meter to the AC input of the device. It’s to be noted, with a huge disclaimer, that because we are measuring AC wall power here, the power figures aren’t directly comparable to those of battery-powered devices, as the Mac mini’s power supply will incur an efficiency loss greater than that of mobile devices. Nor are they directly comparable to the TDP figures that contemporary vendors such as Intel or AMD publish.

It’s especially important to keep in mind that the figure of what we usually recall as TDP in processors is actually only a subset of the figures presented here, as beyond just the SoC we’re also measuring DRAM and voltage regulation overhead, something which is not included in TDP figures nor your typical package power readout on a laptop.

Apple Mac mini (Apple Silicon M1) AC Device Power

Starting off with a Mac mini sitting idle in its default state after power-on, connected via HDMI to a 2560p144 monitor, with Wi-Fi 6 and a mouse and keyboard, we’re seeing total device power at 4.2W. Given that we’re measuring AC power into the device, which can be quite inefficient at low loads, this makes quite a lot of sense and represents an excellent figure.

This idle figure also serves as a baseline for the following measurements where we calculate “active power”, meaning our usual methodology of taking the total power measured and subtracting the idle power.

During average single-threaded workloads on the 3.2GHz Firestorm cores, such as GCC code compilation, we’re seeing device power go up to 10.5W with active power at around 6.3W. The active power figure is very much in line with what we would expect from a higher-clocked Firestorm core, and is extremely promising for Apple and the M1.
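
The active-power methodology is simple enough to write down directly. The sketch below uses the 4.2W idle baseline and the 10.5W single-threaded device figure from the text; the function names are illustrative.

```python
IDLE_W = 4.2  # idle AC device power reported above, in watts


def active_power(total_w: float, idle_w: float = IDLE_W) -> float:
    """Active power: total measured device power minus the idle baseline."""
    return total_w - idle_w


def device_power(active_w: float, idle_w: float = IDLE_W) -> float:
    """Inverse helper: total device power implied by an active-power figure."""
    return active_w + idle_w


# Single-threaded GCC compilation: 10.5W at the wall.
print(f"Active power: {active_power(10.5):.1f} W")
```

Going the other way, an active-power figure can be turned back into wall power by adding the same baseline with `device_power`.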

In workloads which are more DRAM heavy and thus incur a larger power penalty on the LPDDR4X-class 128-bit 16GB of DRAM on the Mac mini, we’re seeing active power go up to 10.5W. Already with these figures the new M1 is mighty impressive and draws less than a third of the power of a high-end Intel mobile CPU.

In multi-threaded scenarios, power highly depends on the workload. In memory-heavy workloads where the CPU utilisation isn’t as high, we’re seeing 18W active power, going up to around 22W in average workloads, and peaking around 27W in compute heavy workloads. These figures are generally what you’d like to compare to “TDPs” of other platforms, although again to get an apples-to-apples comparison you’d need to further subtract some of the overhead as measured on the Mac mini here – my best guess would be a 20 to 24W range.
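
To illustrate how the wall-power figures might map onto a TDP-like range, the sketch below discounts each active-power figure by a single overhead fraction. The 10% value is purely an illustrative placeholder and not a measured number; we did not characterise the power supply or regulation losses separately.

```python
# HYPOTHETICAL: fraction of active AC power lost in the power supply and
# other overhead. This is a placeholder for illustration, not a measurement.
ASSUMED_OVERHEAD = 0.10


def soc_estimate(active_w: float, overhead_fraction: float = ASSUMED_OVERHEAD) -> float:
    """Scale an active AC power figure down by an assumed overhead fraction."""
    return active_w * (1.0 - overhead_fraction)


# Multi-threaded active-power figures from the text, in watts.
workloads = [("memory-heavy", 18.0), ("average", 22.0), ("compute-heavy", 27.0)]

for label, watts in workloads:
    print(f"{label:>14}: {watts:.1f} W active -> ~{soc_estimate(watts):.1f} W after overhead")
```

With this placeholder, the average and compute-heavy cases land close to the 20 to 24W guess above; a different overhead assumption shifts the range accordingly.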

Finally, on the GPU side, we’re seeing a lower power consumption figure of 17.3W in GFXBench Aztec High. This figure would include a larger share of DRAM power, so the power consumption of Apple’s GPU itself is extremely low, and far less than the peak power that the CPUs can draw.

Memory Differences

Besides the additional cores on the part of the CPUs and GPU, one main performance factor of the M1 that differs from the A14 is that it’s running on a 128-bit memory bus rather than the mobile 64-bit bus. Across 8x 16-bit memory channels and at LPDDR4X-4266-class memory, this means the M1 hits a peak of 68.25GB/s memory bandwidth.
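
The peak bandwidth figure follows directly from bus width and transfer rate. A minimal sketch of the arithmetic is below; the 64-bit comparison line assumes the same 4266MT/s transfer rate, which is our assumption for illustration.

```python
def peak_bandwidth_gbs(bus_width_bits: int, mega_transfers_per_s: float) -> float:
    """Theoretical peak bandwidth in GB/s (decimal) for a given bus and rate."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s / 1000.0


# M1: 8 channels x 16 bits = 128-bit bus at 4266 MT/s.
m1_peak = peak_bandwidth_gbs(8 * 16, 4266)

# A 64-bit mobile bus at the same transfer rate (assumed), for comparison.
mobile_peak = peak_bandwidth_gbs(64, 4266)

print(f"128-bit bus: {m1_peak:.2f} GB/s")
print(f" 64-bit bus: {mobile_peak:.2f} GB/s")
```

Doubling the bus width at an unchanged transfer rate doubles the theoretical peak, which is the whole of the M1's advantage here.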

In terms of memory latency, we’re seeing a (rather expected) reduction compared to the A14, measuring 96ns at 128MB full random test depth, compared to 102ns on the A14.

Of further note is the 12MB L2 cache of the performance cores, although here it seems that Apple continues to do some partitioning as to how much a single core can use, as we’re still seeing some latency uptick after 8MB.

The M1 also contains a large SLC cache which should be accessible by all IP blocks on the chip. We’re not exactly certain, but the test results do behave a lot like on the A14 and thus we assume this is a similar 16MB chunk of cache on the SoC, as some access patterns extend beyond that of the A14, which makes sense given the larger L2.
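
For readers unfamiliar with how "full random" latency figures are typically obtained, a common approach is pointer chasing: each load depends on the result of the previous one, and the access order is a single random cycle over the whole buffer so that prefetchers cannot help. The sketch below shows only that access pattern and is not the tool used for the numbers above. In Python the timing is dominated by interpreter overhead, so the printed value does not reflect cache or DRAM latency; a real test would be written in native code and swept across buffer sizes.

```python
import random
import time


def build_random_cycle(n: int, seed: int = 0) -> list:
    """Sattolo's algorithm: a random permutation forming one cycle of length n."""
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, which guarantees a single cycle
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt: list, steps: int, start: int = 0) -> int:
    """Follow the chain for `steps` dependent accesses; return the final index."""
    idx = start
    for _ in range(steps):
        idx = nxt[idx]
    return idx


def ns_per_access(n: int, steps: int) -> float:
    """Average wall-clock time per dependent access, in nanoseconds."""
    nxt = build_random_cycle(n)
    t0 = time.perf_counter_ns()
    chase(nxt, steps)
    return (time.perf_counter_ns() - t0) / steps


if __name__ == "__main__":
    # Small size so the demo runs quickly; includes interpreter overhead.
    print(f"{ns_per_access(1 << 16, 1 << 16):.1f} ns per access")
```

Because the permutation is a single cycle, every element in the buffer is touched exactly once before the chain repeats, which is what makes the test depth well defined.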

One aspect we’ve never really had the opportunity to test is exactly how good Apple’s cores are in terms of memory bandwidth. Inside of the M1, the results are ground-breaking: A single Firestorm achieves memory reads up to around 58GB/s, with memory writes coming in at 33-36GB/s. Most importantly, memory copies land in at 60 to 62GB/s depending on whether you’re using scalar or vector instructions. The fact that a single Firestorm core can almost saturate the memory controllers is astounding and something we’ve never seen in a design before.
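
To put "almost saturate" into numbers, the single-core figures can be expressed as a share of the 68.25GB/s theoretical peak. This is a simple sketch using only the values quoted in the text; the dictionary layout and helper name are ours.

```python
PEAK_GBS = 68.25  # theoretical peak bandwidth quoted above

# Single-core results from the text as (low, high) ranges in GB/s.
single_core = {
    "read": (58.0, 58.0),
    "write": (33.0, 36.0),
    "copy": (60.0, 62.0),
}


def share_of_peak(gbs: float, peak: float = PEAK_GBS) -> float:
    """Achieved bandwidth as a percentage of the theoretical peak."""
    return gbs / peak * 100.0


for name, (low, high) in single_core.items():
    print(f"{name:>5}: {share_of_peak(low):.0f}% to {share_of_peak(high):.0f}% of peak")
```

Reads and copies from one core sit well above four-fifths of the theoretical peak, while writes reach roughly half of it.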

Because one core is able to make use of almost the whole memory bandwidth, having multiple cores access memory at the same time doesn’t actually increase the system bandwidth, but instead, due to congestion, lowers the effective aggregate bandwidth achieved. Nevertheless, this 59GB/s peak bandwidth of one core is essentially also the speed at which memory copies happen, no matter the number of active cores in the system; again, a great feat for Apple.

Beyond the clock speed increase and the L2 increase, this memory boost is also very likely to help the M1 differentiate its performance beyond that of the A14, and offer up tough competition against the x86 incumbents.


  • Spunjji - Tuesday, November 17, 2020 - link

    @halo37253 I suspect you're largely correct based on what we're seeing in the benchmarks here.

    Of course, the answer to why Apple would do it is clear: they love vertical integration. They'll eventually be able to translate this into power/performance advantages that will be difficult to assail with apps written specifically for their platform.
  • mdriftmeyer - Friday, November 20, 2020 - link

    Apple will have to modify their future M1s to accomodate PCIe because a large portion of the Audio Video Professional world needs it--in fact we all rely on DMA over PCI for Thunderbolt to reduce latency, and nothing like throwing away a $5k-$25k stack of Audio Interface, Mic Pres and more just because Apple wants to drop that, or just simply dump Apple and move back to Windows and deal with DLLs. I hate Windows but I sure as hell won't drop expensive gear tied with Dante Ethernet and TB3 interfacing with various Audio Interfaces and rack mount hardware because Apple thinks the Pro market only needed the Mac Pro one off before dropping us off a cliff.

    No one in the world of Professional Music uses Logic Pro stock plugins and the average track has any where between 80-200 channel strips to manage one mix. If you think the M1 or its predecessors with this type of tightly joined unified memory system will satisfy people are just not familiar to how many resources making professional music or film production require.

    Let's not even talk about 3D Modeling for F/X in Films or full blown PIXAR style film shorts, never mind full length motion pictures. Working in 8k and soon 16k film to have real-time scrubbing will demand new versions of the Mac Pro's Afterburner and upgraded Xeons [or if they were smart, Zens] but definitely not M series SoCs.
  • Spunjji - Monday, November 23, 2020 - link

    @mrdriftmeyer - I don't see that any of the requirements you've mentioned here would preclude Apple producing an M1 successor that would be capable of fulfilling them. In particular you mentioned 8K video scrubbing, which the M1 can already do better than the average Xeon. I doubt they'd throw away the audio market entirely over this switch - I guess we'll just have to wait and see what the next chips look like.
  • varase - Wednesday, November 25, 2020 - link

    Most people are looking at these first Apple Silicon Macs wrong - these aren't Apple's powerhouse machines: they're simply the annual spec bump of the lowest end Apple computers with DCI-P3 displays, Wifi 6, and the new Apple Silicon M1 SoC.

    They have the same limitations as the machines they replace - 16 GB RAM and two Thunderbolt ports.

    These are the machines you give to a student or teacher or a lawyer or an accountant or a work-at-home information worker - folks who need a decently performing machine who don't want to lug around a huge powerhouse machine (or pay for one for that matter). They're still marketed at the same market segment, though they now have a vastly expanded compute power envelope.

    The real powerhouses will probably come next year with the M1x (or whatever), rumored to have eight Firestorm and four Icestorm cores. Apple has yet to decide on an external memory interconnect and multichannel PCIe scheme, if they decide to move in that direction.

    Other CPU and GPU vendors and OEM computer makers take notice - your businesses are now on limited life support. These new Apple Silicon models can compete up through the mid-high tier of computer purchases, and if as I expect Apple sells a ton of these many will be to your bread and butter customers.

    In fact, I suspect that Apple - once they recover their R&D costs - will be pushing the prices of these machines lower while still maintaining their margins - while competing computer makers will still have to pay Intel, AMD, Qualcomm, and Nvidia for their expensive processors, whereas Apple's cost per SoC goes down the more they manufacture. Competing computer makers may soon be squeezed by Apple Silicon price/performance on one side and high component prices on the other. Expect them to be demanding lower processor prices from the above manufacturers so they can more readily compete, and processor manufacturers may have to comply because if OEM computer manufacturers go under or stop making competing models, the processor makers will see a diminishing customer base.

    I believe the biggest costs for a chip fab are startup costs - no matter what processor vendors would like you to believe. Design and fab startup are _expensive_ - but once you start getting decent yields, the additional costs are silicon wafers and QA. The more of these units Apple can move, the lower the per unit cost and the better the profits.

    The real threat to OEM computer and processor makers is economic - and the fact that consumer publications like Consumer Reports will probably _gush_ over the improvements in battery life and performance.

    Most consumers are not Windows or macOS or ChromeOS fanboys - they just want a computer which is affordable and has decent build quality and gets the job done. There are aspirational aspects of computer purchases, and M1 computers shoot waaayyy above their peers. This can mean a potential buyer _doesn't_ have to buy way up the line for capabilities he or she may want sometime during their ownership window, and these computers will last a long long time and will not suffer slowdowns due to software feature creep.
  • Eric S - Tuesday, November 17, 2020 - link

    Remember that this is designed to be Apple’s lowest end Mac chip. Their Intel i3. Wait until the big chips come out next year.
  • BushLin - Wednesday, November 18, 2020 - link

    ... Your speculation may or may not be correct but next year will see 5nm zen 4 which is actually announced rather than rumors.
  • jospoortvliet - Wednesday, November 18, 2020 - link

    Sure, and 3nm m2. Different generation with different processes etc. But today, M1 has the best single core and at lower power comes close to octacores despite only 4 fast and 4 slow cores. I wish I could buy it with Linux on it...
  • dysonlu - Sunday, February 21, 2021 - link

    "makes we wonder why Apple is so willing to fracture their already pretty small Mac OS fanbase"

    You have it upside down. It is exactly BECAUSE it has a small fanbase that it can afford to do this kind of migration. (The large and heterogenous "fanbase" in Windows is the big achilles' heel for Microsoft when it comes to making any significant change.) There will be very little "fracture" of Apple's fanbase, if any at all. The fans will gladly move to Mx CPUs given the advantages over Intel.
  • adriaaaaan - Thursday, November 19, 2020 - link

    People are giving apple too much credit here, this is only impressive because of the process advantage which has nothing to do with apple.

    People are forgetting that Macs have a tiny market share and that's not likely to change any time soon. You wouldn't know it because journos tend to use Macs, therefore they think everyone does.

    If anything I hope this kicks AMD into gear they are still releasing gcn designs. Let's see who's boss when they release 5nm rDNA 2
  • Spunjji - Thursday, November 19, 2020 - link

    "this is only impressive because of the process advantage"

    False. A crap core on a high-tech process will still produce bad results; you only have to look at the last bunch of Zhaoxin CPUs based on the old Via tech.

    If this were just about process node you'd expect to see lower power but with limited performance. As it is, they manage both extremely low power *and* very competitive performance. Beating Intel is no small feat, even in their current incarnation.