In November 2019, the company NUVIA broke out of stealth mode. Founded by former senior Apple and Google processor architects John Bruno, Manu Gulati, and Gerard Williams III, the company came out of the gate with the considerable goal of revamping the server market with an SoC that would provide ‘A step-function increase in compute performance and power efficiency’. Today NUVIA is putting more data behind those goals.

The press release we received from NUVIA takes some time to cover the basics of the modern-day server market, and it initially read almost like an AnandTech article, which is eerily scary. Suffice to say, NUVIA understands the current state of play in the server market, including where Intel and AMD stand with respect to each other, and how x86 offerings are squaring up against the other options on the market. As with most elements of the server market, different verticals often have different requirements for compute, memory, IO, power, and physical constraints, as well as the initial cost of hardware alongside the total cost of ownership. To that end, NUVIA’s processor design is, according to the company, ‘an SoC that will deliver industry-leading performance with the highest levels of efficiency, at the same time’.

With that, NUVIA is announcing that its first generation CPU core will be called Phoenix and will be built upon the Arm architecture (likely Armv9) with an architecture license. Phoenix will be part of the Orion SoC, with NUVIA stating that they are implementing ‘a complete overhaul of the CPU pipeline’. Gerard Williams’ designs from Apple are known to be considerably different from what we’ve seen elsewhere in the market, so we suspect that this is going to be a big part of the secret sauce behind Orion and its Phoenix cores.

NUVIA goes on to say that Phoenix is ‘a clean sheet design’, focusing on single core performance leadership and maximizing memory bandwidth and utilization. The Orion SoC will be built to focus on high utilization and sustained frequencies, without having to rely on high-turbo marketing numbers, to allow customers to make the best use of the hardware within allocated power and cooling budgets. Alongside this, NUVIA is stating that there will be hardware infrastructure built to specification ‘to support peak performance on real cloud workloads’.

NUVIA’s Numbers

The big part of the press release is NUVIA’s performance-per-watt claims. To make them, NUVIA is using Geekbench 5 as a performance indicator, along with direct power measurements, of current in-market x86 and Arm offerings. NUVIA is taking smartphone and mobile-based cores, such as Intel Ice Lake, the Qualcomm SD865, and the AMD Ryzen 4700U, as well as Apple’s A12Z Vortex and A13 Lightning, as starting points. The reason for this is that NUVIA believes there is increasingly little meaningful difference between smartphone/mobile cores and server cores when extrapolated; the distinction only becomes relevant if you start adding massive vector engines for specific customers.

According to NUVIA’s numbers, this is where the current market stands with respect to Geekbench 5. At every point, the Arm results are more power efficient and higher performing than anything available on x86, even though at the high end Apple and Intel are almost equal in performance (with Intel drawing around 4x the power).

NUVIA notes that the power of the x86 cores can vary from 3 W to 20 W per core depending on the workload; however, in the sub-5 W bracket, nothing from x86 comes close to the power efficiency of high-performance Arm designs. This is where Phoenix comes in.

NUVIA’s claim is that the Phoenix core is set to offer +50% to +100% over the peak performance of the other cores, either at the same power as other Arm cores or at a third of the power of the x86 cores. NUVIA’s wording for this graph includes the phrase ‘we have left the upper part of the curve out to fully disclose at a later date’, indicating that they likely intend for Phoenix cores to go beyond 5 W per core.
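
To put these claims in concrete terms, here is a minimal perf-per-watt sketch in Python. The Geekbench 5 scores and per-core power figures below are illustrative placeholders rather than NUVIA’s measured data, and the Phoenix lines simply apply the company’s stated +50%/+100% uplift at face value.

```python
# Illustrative sketch only: the GB5 scores and per-core watt figures are
# placeholder values, not NUVIA's measurements. It just shows the arithmetic
# behind the kind of perf-per-watt comparison NUVIA describes.

cores = {
    # name: (hypothetical GB5 single-core score, per-core watts)
    "x86 mobile core":   (1250, 15.0),
    "Arm flagship core": (1100, 4.0),
}

def perf_per_watt(score, watts):
    return score / watts

for name, (score, watts) in cores.items():
    print(f"{name:18s} {score:5d} pts / {watts:4.1f} W = "
          f"{perf_per_watt(score, watts):6.1f} pts/W")

# NUVIA's claim taken at face value: Phoenix at +50% to +100% over the peak
# performance above, at Arm-class power or at one third of the x86 power.
x86_score, x86_watts = cores["x86 mobile core"]
arm_score, arm_watts = cores["Arm flagship core"]
for uplift in (1.5, 2.0):
    phoenix_score = uplift * max(x86_score, arm_score)
    print(f"Phoenix @ +{int((uplift - 1) * 100):3d}%: "
          f"{phoenix_score / arm_watts:6.1f} pts/W at Arm-class power, "
          f"{phoenix_score / (x86_watts / 3):6.1f} pts/W at 1/3 x86 power")
```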

At this point, NUVIA is running simulations of its core designs in-house to get these numbers. This is standard practice for any company developing a new SoC or a new core before actually going to the fab to get it made. It also helps investors gauge where things stand.

What gives credibility to the new company’s lofty goals is the founders’ track record with their past designs. Apple’s silicon success over the last half decade has been one of the most impressive developments in the industry, and it seems NUVIA has been able to recruit top talent with the aim of reproducing such success in the datacentre market.

Some users might argue that SPEC should have been used, given its relevance to NUVIA’s initial target markets in servers, and I am inclined to agree. I suspect that NUVIA believed GB5 would be more accessible to a wider audience for core-to-core comparisons.

The Future

NUVIA states with this press release that it aims to have some of the highest-performance and most efficient CPU/SoC products on the market. The company reiterates that even if other vendors suddenly see a 20% year-over-year gain in raw performance, NUVIA still expects to be ahead of its main competitors. We shall have to wait and see what magic NUVIA has that others do not.

Update: Initially this article said that NUVIA will have new products in the next 18 months. This was a simple misreading of NUVIA's press release and the relevant sentence has been removed.

Comments

  • FreckledTrout - Tuesday, August 11, 2020 - link

    Those are some rather bold claims and frankly I am very skeptical. I am skeptical about comparing to low power cores. I would gladly be surprised but I think we are looking at marketing more than anything. Feels like someone needs additional funding.
  • Quantumz0d - Tuesday, August 11, 2020 - link

    What is that graph lol. Put proper HW silicon in the hands of Phoronix and STH and then we can talk; until then it's all BS. Also GB5? Double lol.
  • Quantumz0d - Tuesday, August 11, 2020 - link

    Headline is also misleading. It says "Zen 2" but only lists the 4700U in the article; it looks as if this magical unicorn is going to destroy the AMD Zen 2 uArch.
  • name99 - Tuesday, August 11, 2020 - link

    There you are:
    https://browser.geekbench.com/processor-benchmarks

    Doesn't change the overall point!
    You do realize Nuvia considers this Intel/AMD rivalry to be silly nonsense, two dinosaurs fighting while they both ignore the asteroid heading towards them!?!

    The only curve that matters as far as Nuvia is concerned is Apple's curve and Apple's business plans. Apple pros:
    - infrastructure: (testing, engineers, money)
    - existing customers
    But Apple cons:
    - now locked into a design flow, and it's always hard to throw that all away and say "let's try something completely different"
    - it's harder for Apple than Nuvia (though not impossibly hard) to move to a new instruction set.

    If we assume Nuvia start at Apple performance levels they can get a win of ~20% just by adding SVE2. Then another, what, 10%? by using ARMv9 rather than ARMv8. Then they get whatever you might expect Apple to get from a generational shift (20..25%), aided this year by the 5nm transition so an easy 10% speed boost and a whole lot of density boost.

    Meanwhile Nuvia have the flexibility that's harder for Apple of going to a completely new pipeline. In particular Nuvia may feel that the time has finally arrived for implementing some sort of KIP (kilo-instruction pipeline). A number of these have been proposed over the past 20 years, and the idea has been successively refined. Apple COULD retrofit various pieces of such a pipeline to what they have today (for all I know they've already started doing this), but Nuvia can get there right away without bothering to retrofit.

    Ultimately I think it's good news for Apple fans in that it will probably persuade Apple to be a little more daring than their natural inclinations (eg maybe set up an alternative team working on "aggressive core ideas"). Which benefits ARM, whose goal seems to be "always two years behind Apple, no better but also no worse".

    x86? Well, they made their choice of perpetual compatibility over performance. Now they live with the consequences.
  • anonomouse - Tuesday, August 11, 2020 - link

    "NUVIA’s wording for this graph includes the phrase ‘we have left the upper part of the curve out to fully disclose at a later date’, indicating that they likely intend for Phoenix cores to go beyond 5W per core."

    I think this just means they obfuscated the crap out of the curve, not that they intend to go beyond 5W per core. Their blog post pretty directly states why it didn't make sense for them to really target pushing beyond the realistic use-case power budgets.
  • Colin1497 - Tuesday, August 11, 2020 - link

    Comparing your future product against a competitor's past product will tend to work out in your favor if you're competent, but the target is moving.

    This should be interesting.
  • webdoctors - Tuesday, August 11, 2020 - link

    I'm curious whether they're leveraging any binary translation for their benchmark results. There are many processors floating around that execute ARM code but translate it into their own proprietary mess internally, and they only see speedups once the system is warmed up and do poorly outside of the main loops when there are system calls involved.

    Also the belief that something that's perf/watt efficient under 1 W or under 5 W scales to the 10-20+ W range is ludicrous. The tradeoffs for scaling in such different domains are enormous when it comes to actual achieved perf.

    I'm skeptical there's much room for innovation in the ARM server front but it'll be great to be proven wrong.
  • anonomouse - Tuesday, August 11, 2020 - link

    I think their point is that (per-core) they are not trying to scale to 10-20+ W.
  • Wilco1 - Tuesday, August 11, 2020 - link

    Only Denver uses binary translation and that's not used in many products, certainly not any Arm servers.

    Graviton 2 and Ampere Altra are certainly proof of innovation - a relatively small team can make a high-end Arm server chip which uses 1-2W per core and outperforms EPYC.
  • abufrejoval - Tuesday, August 11, 2020 - link

    I keep wondering what their secret sauce is...

    With something like Ivan Godard's Mill architecture, I understand how they achieve an order of magnitude more compute performance out of the same number of transistors and energy budget: it's quite simply a very clever way of doing things with a DSP-inspired ISA that manages to remain general purpose, and it's still my personal favorite, though I'll concede that general purpose has diminishing returns and RISC-V may be better.

    But with a given architecture like ARM, just how much can you do?

    The last architectural doubling of IPC performance I could sort of understand was the VISC design presented here four years ago. That was just a factor of 2 and it came with a very high effort in an area likely more prone than ever to side channel issues.

    But how can these new cores deliver the same general purpose compute power at a fraction of the energy cost on an existing ISA?

    There are really only two avenues that I can see:
    1. use fewer transistors: To my taste that's too much magic and I don't see Apple chips being small
    2. use more transistors but switch them much more slowly (and more aggressively off): At least that seems more likely than 1.

    In any case their approach can't be unique to ARM as an ISA, so I guess we won't know, because once that secret got out, everyone would copy their approach.

    Probably with less success on x86, because the inherent overhead and complexity of the translation layer isn't going away, while its benefits become ever less important.

    But RISC-V or Mill would profit, as would any other ARM if that technology became generalized.

    And I can see how and why they got out of Apple: There is really very little sellable benefit for the additional power on the smartphone.

    On the laptop or workstation, much more so, but on the server, energy consumption is king.

    Easy to understand why Tim Cook doesn't like them doing a Jim Keller or going independent. But personally I'd be more interested in a 20GB leak from these guys than from Intel.
