Around 15 months ago, AMD announced that it would be building 64-bit ARM based SoCs for servers in 2014. Less than a month into 2014, AMD made good on its promise and officially announced the Opteron A1100: a 64-bit ARM Cortex A57 based SoC.

The Opteron A1100 features either 4 or 8 ARM Cortex A57 cores. There's only a single die mask, so we're talking about harvested dies to make up the quad-core configuration. My guess is over time we'll see that go away entirely, but since we're at very early stages of talking about the A1100 there's likely some hedging of bets going on. Each core will run at a frequency somewhere north of 2GHz. The SoC is built on a 28nm process at GlobalFoundries.

Each pair of cores shares a 1MB L2 cache, for a total of up to 4MB of L2 cache for the chip. All cores share a unified L3 cache of up to 8MB in size. AMD designed a new memory controller for the Opteron A1100 that's capable of supporting either DDR3 or DDR4. The memory interface is 128 bits wide and supports up to 4 SODIMMs, UDIMMs or RDIMMs. AMD will be shipping a reference platform capable of supporting up to 128GB of Registered DDR3 DIMMs off of a single SoC.
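As a rough sanity check on those figures, here is a minimal Python sketch of the arithmetic. The L2 totals and the per-DIMM capacity follow directly from the numbers above (assuming all four slots are populated equally). The memory speed is an assumption: AMD hasn't quoted one, so DDR3-1600 is used purely for illustration, and the function names are made up for this example.

```python
def l2_total_mb(cores, mb_per_pair=1):
    """Total L2 in MB when every pair of cores shares one cache."""
    return (cores // 2) * mb_per_pair


def peak_bandwidth_gbs(bus_bits, mega_transfers_per_s):
    """Theoretical peak in GB/s: bytes per transfer times transfers per second."""
    return bus_bits / 8 * mega_transfers_per_s / 1000


# L2 totals for the two announced core counts.
print("8-core L2 (MB):", l2_total_mb(8))
print("4-core L2 (MB):", l2_total_mb(4))

# 128GB reference platform spread evenly across 4 DIMM slots.
gb_per_dimm = 128 / 4
print("GB per DIMM:", gb_per_dimm)

# ASSUMPTION: DDR3-1600 (1600 MT/s). The article does not state a memory speed.
print("Peak GB/s at DDR3-1600:", peak_bandwidth_gbs(128, 1600))
```

The bandwidth figure is a theoretical ceiling for the assumed speed grade, not a measured or announced number.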

Also on-die is an 8-lane PCIe 3.0 controller (1 x8 or 2 x4 slot configurations supported) and an 8-port 6Gbps SATA controller. AMD assured me that the on-chip fabric is capable of sustaining full bandwidth to all 8 SATA ports. The SoC features support for 2 x 10GbE ports and ARM's TrustZone technology. 
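To put "full bandwidth to all 8 SATA ports" in perspective, the sketch below estimates the aggregate payload bandwidth of each I/O block from its line rate and encoding overhead (8b/10b for 6Gbps SATA, 128b/130b for PCIe 3.0). These are theoretical interface ceilings rather than anything AMD has quoted, and the helper name is hypothetical.

```python
def payload_mb_s(line_rate_gbps, encoding_efficiency, lanes=1):
    """Payload bandwidth in MB/s after line-encoding overhead, summed over lanes."""
    return line_rate_gbps * 1e9 * encoding_efficiency / 8 / 1e6 * lanes


# 8 SATA ports at 6 Gb/s with 8b/10b encoding.
sata_total = payload_mb_s(6, 8 / 10, lanes=8)

# 8 PCIe 3.0 lanes at 8 GT/s with 128b/130b encoding.
pcie_total = payload_mb_s(8, 128 / 130, lanes=8)

# 2 x 10GbE; 10 Gb/s is already the data rate, so no encoding factor is applied.
enet_total = payload_mb_s(10, 1.0, lanes=2)

print(f"SATA:  {sata_total:.0f} MB/s")
print(f"PCIe:  {pcie_total:.0f} MB/s")
print(f"10GbE: {enet_total:.0f} MB/s")
```

Protocol overhead above the line encoding (packet headers, framing) is ignored, so real-world throughput will land somewhat below these numbers.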

AMD will be making a reference board available to interested parties starting in March, with server and OEM announcements to come in Q4 of this year. 

It's still too early to talk about performance or TDPs, but AMD did indicate better overall performance than its Opteron X2150 (4-core 1.9GHz Jaguar) at a comparable TDP:

AMD Opteron A1100 vs. X2150

                  | CPU Core Configuration | CPU Frequency | SPECint_rate Estimate | SPECint per Core | Estimated TDP
AMD Opteron A1100 | 8 x ARM Cortex A57     | >= 2GHz       | 80                    | 10               | 25W
AMD Opteron X2150 | 4 x AMD Jaguar         | 1.9GHz        | 28.1                  | 7                | 22W
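The per-core column and the overall gap between the two chips can be reproduced from the table with a few lines of Python. The inputs are AMD's estimates as listed above; the dictionary layout and function names are only for illustration.

```python
# Figures copied from the table above (AMD estimates, not measurements).
chips = {
    "Opteron A1100": {"cores": 8, "specint_rate": 80.0, "tdp_w": 25.0},
    "Opteron X2150": {"cores": 4, "specint_rate": 28.1, "tdp_w": 22.0},
}


def per_core(chip):
    """SPECint_rate divided by core count."""
    return chip["specint_rate"] / chip["cores"]


def per_watt(chip):
    """SPECint_rate divided by estimated TDP."""
    return chip["specint_rate"] / chip["tdp_w"]


for name, chip in chips.items():
    print(f"{name}: {per_core(chip):.2f} per core, {per_watt(chip):.2f} per watt")

a1100 = chips["Opteron A1100"]
x2150 = chips["Opteron X2150"]

# How far ahead the A1100 is in total throughput and in throughput per watt.
throughput_ratio = a1100["specint_rate"] / x2150["specint_rate"]
efficiency_ratio = per_watt(a1100) / per_watt(x2150)
print(f"Throughput ratio: {throughput_ratio:.2f}x")
print(f"Per-watt ratio:   {efficiency_ratio:.2f}x")
```

By these estimates the A1100 delivers a little under 3x the SPECint_rate of the X2150, and roughly 2.5x once the slightly higher TDP is factored in.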

AMD alluded to substantial cost savings over competing Intel solutions with support for similar memory capacities. AMD tells me we should expect a total "solution" price somewhere around 1/10th that of a competing high-end Xeon box, but it isn't offering specifics beyond that just yet. Given the Opteron X2150 performance/TDP comparison, I'm guessing we're looking at a similar ~$100 price point for the SoC. There's also no word on whether or not the SoC will leverage any of AMD's graphics IP.

The Opteron A1100 is aimed squarely at those applications that either need a lot of low power compute or tons of memory/storage. AMD sees huge demand in the memcached space, cold storage servers and Apache web front ends. The offer is pretty simple: take the cost savings on the CPU front and pour them into more DRAM.

Early attempts at ARM based server designs were problematic given the lack of a 64-bit ARM ISA. With ARMv8 and the Cortex A53/A57 CPUs, that's all changed. I don't expect solutions like the Opteron A1100 to be a knockout success immediately, but this is definitely the beginning of something very new. Of all of the players in the ARM enterprise space, AMD looks like one of the most credible threats. It's also a great way for AMD to rebuild its enterprise market share with a targeted strike in new/growing segments.

AMD's Andrew Feldman included one of his trademark reality check slides in his Opteron A1100 presentation today:

Lower cost, high volume CPUs have always won. That's how Intel took the server market to begin with. The implication here is that ARM will do the same to Intel. Predicting 25% of the server market by 2019 may be feasible, but I'm not fond of making predictions for what the world will look like 5 years from now. 

The real question is what architecture(s) AMD plans to use to get to a leadership position among ARM CPUs and a substantial share of the x86 CPU market. We get the first hint with the third bullet above: "smaller more efficient x86 CPUs will be dominant in the x86 segment".

124 Comments

  • name99 - Wednesday, January 29, 2014

    How good is ARM's coherent SMP cluster technology (the AMBA stuff)?
    Does it compare favorably with Intel and IBM's equivalents, or is it rather more amateurish (i.e. a lot more coherency traffic, lower frequency, more verbose transactions, not as scalable, etc etc)?
  • mczak - Wednesday, January 29, 2014

    Possible, certainly. But that will require a better interconnect between the clusters (as you have more clusters), and I don't think there's any advantage to a 2-core cluster itself (since it's designed to scale up to 4 cores in the first place).
    Looking through all the news though, I haven't seen any mention of the L2 being shared by just 2 cores elsewhere, and it's not in the presentation itself. I guess we'll see.
  • extide - Wednesday, January 29, 2014

    Yeah, I thought that was an interesting configuration, however if you think about it, it seems like a good config for a server. Especially with the limited external memory bandwidth, this configuration, which allows for more L2 overall, may be superior.
  • futrtrubl - Tuesday, January 28, 2014

    "AMD ensured me that the on-chip fabric" "Ensured" should be "assured".
  • Krysto - Wednesday, January 29, 2014

    It seems like MUCH better performance (3x more) than its Jaguar ones at the same TDP.

    That being said, I wish they made them at 20nm, especially since they're coming out in Q4 this year. Sometimes I feel like AMD has some kind of weird fetish about using old weak-sauce process nodes. It's like even when they COULD hit it out of the park, they're too afraid to do it, and play the "conservative/cheap" card. It's a shame. I guess they don't feel like they have much greatness in them anymore, and they project that in the market, too.
  • gruffi - Wednesday, January 29, 2014

    Fetish? How can you use 20nm when it's not ready yet? You won't see 20nm from Glofo before 2015. New processes are always very expensive. It's better to use known processes for a new design. It minimizes costs and risks. By the way, 28nm from Glofo is quite good. It has a density similar to Intel's 22nm process. And Kaveri shows that power consumption <3 GHz is quite good as well.
  • MrSpadge - Wednesday, January 29, 2014

    AMD has volume contracts with GloFo, and they've just switched to 28 nm. Their 20 nm will take quite some more time.
  • iwod - Wednesday, January 29, 2014

    GF won't have 20nm, it will be straight to 16nm
  • nutjob2 - Friday, January 31, 2014

    I think the fetish is on the Intel side, because they have billions to spend on each node. Just keep in mind who pays for those billions.
  • geoxile - Monday, August 25, 2014

    It's closer to 4x the performance of Jaguar, but how? How are 2x Silvermont cores achieving 4x the score? Avoton chips are clocked higher, sure, but they lose some of that advantage due to lower IPC.
