Intel Haswell-EP Xeon 14 Core Review: E5-2695 V3 and E5-2697 V3
by Ian Cutress on November 20, 2014 10:00 AM EST
Moving up the Xeon product stack, the larger and more complicated the die, the lower the yield. Intel sells its 14-18 core Xeons from a top end design that weighs in at over five billion transistors, and we have had two of the 14C models in for review: the E5-2695 V3 (2.3 GHz, 3.3 GHz turbo) and E5-2697 V3 (2.6 GHz, 3.6 GHz turbo).
The Information
It can take only one failed transistor to break a whole CPU. If the failure happens in a core, as part of the logic or caches, that core can be fused off and the die can be sold as a lower core count part. This is how yields are improved: by reusing the dies that have errors in removable sections. Ultimately this reduces the maximum amount of profit on offer, but it enables CPU manufacturers like Intel and AMD to sell a range of products from a single design, rather than just one. The way Intel designs its high end E5 V3 Xeons, from an 18-core die, means that its 14 core parts either have at least two defects, or are perfectly fine 18 core dies that are needed to fill demand.
CPU Specification Comparison
Manufacturer | CPU | Node | Cores | GPU | Transistor Count (Schematic) | Die Size
Server CPUs
Intel | Haswell-EP 14-18C | 22nm | 14-18 | N/A | 5.69B | 662mm²
Intel | Haswell-EP 10C-12C | 22nm | 6-12 | N/A | 3.84B | 492mm²
Intel | Haswell-EP 6C-8C | 22nm | 4-8 | N/A | 2.6B | 354mm²
Intel | Ivy Bridge-EP 12C-15C | 22nm | 10-15 | N/A | 4.31B | 541mm²
Intel | Ivy Bridge-EP 10C | 22nm | 6-10 | N/A | 2.89B | 341mm²
Consumer CPUs
Intel | Haswell-E 8C | 22nm | 8 | N/A | 2.6B | 356mm²
Intel | Haswell GT2 4C | 22nm | 4 | GT2 | 1.4B | 177mm²
Intel | Haswell ULT GT3 2C | 22nm | 2 | GT3 | 1.3B | 181mm²
Intel | Ivy Bridge-E 6C | 22nm | 6 | N/A | 1.86B | 257mm²
Intel | Ivy Bridge 4C | 22nm | 4 | GT2 | 1.2B | 160mm²
Intel | Sandy Bridge-E 6C | 32nm | 6 | N/A | 2.27B | 435mm²
Intel | Sandy Bridge 4C | 32nm | 4 | GT2 | 995M | 216mm²
Intel | Lynnfield 4C | 45nm | 4 | N/A | 774M | 296mm²
AMD | Trinity 4C | 32nm | 4 | 7660D | 1.303B | 246mm²
AMD | Vishera 8C | 32nm | 8 | N/A | 1.2B | 315mm²
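To put a rough shape on the link between die size and yield, here is a minimal sketch using the textbook Poisson yield model, Y = exp(-A × D0). This is not Intel's method and nothing here is measured: the defect density `ASSUMED_D0` is a made-up placeholder, and the only real inputs are the three Haswell-EP die areas from the table above.

```python
import math

def poisson_yield(die_area_mm2, defects_per_cm2):
    """Fraction of dies expected to have zero defects under a simple Poisson model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-area_cm2 * defects_per_cm2)

# Die areas taken from the specification table (mm^2).
HASWELL_EP_DIES_MM2 = {"14-18C": 662, "10C-12C": 492, "6C-8C": 354}

# Hypothetical defect density in defects per cm^2; an assumption, not an Intel figure.
ASSUMED_D0 = 0.1

yields = {name: poisson_yield(area, ASSUMED_D0)
          for name, area in HASWELL_EP_DIES_MM2.items()}

for name, y in yields.items():
    print(f"{name}: {y:.1%} of dies defect-free (illustrative only)")
```

Whatever defect density is plugged in, the largest die always comes out with the lowest share of fully working parts, which is why fusing off cores to recover partially defective dies matters most at the top of the stack.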
I mentioned in the 12 core review that Intel can play fast and loose with its binning process, giving customers almost what they desire in terms of performance and power, as long as they are willing to pay the price. The same could be said for the 14-18 core market, but rather than offer a swathe of units, Intel offers around half a dozen, ranging from a 2.0 GHz 14-core to the E5-2699 V3 2.3 GHz 18-core. Intel could release a 65W, 18 core monster, and even though it might come through at 1.2 GHz, this type of SKU is not on the roadmap (unless, perhaps, you meet the high minimum order quantity). If given the opportunity, I would like to see the process by which Intel decides which SKUs to bin for retail vs. OEM and custom parts. I suspect it is a combination of part market demand, part yield, part wishful thinking, but I would hope it is at least systematic. Based on the core orientation image below, there might be complications dealing with that final column of six cores, against the other columns of four, either in voltage response characteristics or in discrete production errors, which might have knock-on effects of their own.
Our samples today are the E5-2695 V3 at 2.3 GHz base frequency (3.3 GHz turbo) and the E5-2697 V3 at 2.6 GHz (3.6 GHz turbo). When considering the Xeon naming stack, each number from 2695 to 2699 is taken except 2696, and as such one might humorously postulate that Intel is merely running out of SKU names, though an added L or W might find its way in if more models joined the list.
In our last test, as well as in previous reviews, 2P systems such as the dual E5-2650L V3s performed poorly in most of our testing software compared to one big CPU in a single socket. The 1P arrangement tends to outperform a 2P system when the software is not built to take advantage of the NUMA arrangement. Intel does sell CPUs like the E5-1691 V3, a 14 core chip for 1P systems, or we can go straight to the E5-2699 V3 for 18 cores, but there will always be a market for 2P players who need the large memory capacity or who use NUMA-aware software similar to Cinema 4D.
Intel Xeon E5 2600 v3 SKU Comparison
Xeon E5 | Cores/Threads | TDP | Clock Speed (GHz) Base - Turbo | Price
High Performance (35-45MB LLC)
2699 v3 | 18/36 | 145W | 2.3-3.6 | $4115
2698 v3 | 16/32 | 135W | 2.3-3.6 | $3226
2697 v3 | 14/28 | 145W | 2.6-3.6 | $2702
2695 v3 | 14/28 | 120W | 2.3-3.3 | $2424
"Advanced" (20-30MB LLC)
2690 v3 | 12/24 | 135W | 2.6-3.5 | $2090
2685 v3 | 12/24 | 120W | 2.6-3.5 | $2090
2680 v3 | 12/24 | 120W | 2.5-3.3 | $1745
2660 v3 | 10/20 | 105W | 2.6-3.3 | $1445
2658 v3 (E) | 12/24 | 105W | 2.2-2.9 | $1832
2650 v3 | 10/20 | 105W | 2.3-3.0 | $1167
Midrange (15-25MB LLC)
2640 v3 | 8/16 | 90W | 2.6-3.4 | $939
2630 v3 | 8/16 | 85W | 2.4-3.2 | $667
2620 v3 | 6/12 | 85W | 2.4-3.2 | $422
Frequency optimized (10-20MB LLC)
2687W v3 | 10/20 | 160W | 3.1-3.5 | $2141
2667 v3 | 8/16 | 135W | 3.2-3.6 | $2057
2643 v3 | 6/12 | 135W | 3.4-3.7 | $1552
2637 v3 | 4/8 | 135W | 3.5-3.7 | $996
Budget (15MB LLC)
2609 v3 | 6/6 | 85W | 1.9 | $306
2603 v3 | 6/6 | 85W | 1.6 | $213
Power Optimized (20-30MB LLC)
2650L v3 | 12/24 | 65W | 1.8-2.5 | $1329
2648L v3 (E) | 12/24 | 75W | 1.8-2.5 | $1544
2630L v3 | 8/16 | 55W | 1.8-2.9 | $612
The big cores get a big power budget and a big price to match. The move from the 2695 to the 2697 is only a few hundred MHz, but Intel charges an additional $278 for the privilege, along with a 25W rise in TDP. In terms of frequency response both CPUs follow the same path, with the 2697 V3 gaining an extra 300 MHz for the difference in power and price.
If we do some basic 24/365 100% use calculations, using the TDP and $0.10/kWh, the 2697 V3 would consume 1270 kWh and cost $127/yr, compared to the 2695 V3, which would consume 1050 kWh and cost $105/yr. This is obviously not including any additional cooling needed, but the $22 difference in power per year against the $278 difference in CPU price means it would take roughly 13 years of running for the power bill gap to match the price gap. Clearly the cost per CPU matters more in terms of how much work is going to be done per unit time. If the contract takes less time to complete, that can sway the preference towards the faster CPU when the contract is CPU compute or response bound.
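For anyone who wants to rerun the running-cost arithmetic with their own electricity price, here is a minimal sketch. It uses the same simplifications as the paragraph above: the CPU is assumed to draw exactly its TDP around the clock, cooling is ignored, and the prices are the list prices from the SKU table. The function and variable names are purely illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years

def annual_energy_cost(tdp_watts, price_per_kwh=0.10):
    """Return (kWh per year, cost per year), assuming constant draw at TDP."""
    kwh = tdp_watts * HOURS_PER_YEAR / 1000
    return kwh, kwh * price_per_kwh

kwh_2697, cost_2697 = annual_energy_cost(145)  # E5-2697 V3, 145W TDP
kwh_2695, cost_2695 = annual_energy_cost(120)  # E5-2695 V3, 120W TDP

price_gap = 2702 - 2424              # list price difference in USD
power_gap = cost_2697 - cost_2695    # extra running cost per year in USD
years_to_match = price_gap / power_gap

print(f"E5-2697 V3: {kwh_2697:.0f} kWh/yr, ${cost_2697:.2f}/yr")
print(f"E5-2695 V3: {kwh_2695:.0f} kWh/yr, ${cost_2695:.2f}/yr")
print(f"Years for the power gap to equal the price gap: {years_to_match:.1f}")
```

Passing a different `price_per_kwh` scales both annual costs linearly, so a region with pricier electricity shortens the break-even period by the same factor.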
As this is the third in our recent series of Xeon E5-2600 v3 performance coverage, we have covered most of the technical data in our previous two installments regarding 10 core and 12 core performance. We carry over the data from those tests, but refer back to them for details regarding chipset and DRAM information, as well as to Johan’s extensive review, which covers the server-focused aspects of the Xeon E5 v3 design in more depth.
Test Setup
As with the previous reviews, due to the limited time available to test each of our CPU samples, we were only able to get a limited range of E5-2695 V3 benchmark results. However, we were able to source two E5-2697 V3 CPUs for dual 14-core analysis, leading to a 56-thread behemoth.
Test Setup
Processor | Intel Xeon E5-2695 V3 (120W), 14C/28T, 2.3 GHz (3.3 GHz Turbo); Intel Xeon E5-2697 V3 (145W), 14C/28T, 2.6 GHz (3.6 GHz Turbo)
Motherboards | ASUS X99-Deluxe; ASRock X99 Extreme6; GIGABYTE MD60-SC0
Cooling | Cooler Master Nepton 140XL; Dynatron R14
Power Supply | OCZ 1250W Gold ZX Series; Corsair AX1200i Platinum PSU
Memory | ADATA XPG Z1 DDR4-2400 8x8 GB 1.2V; Corsair DDR4-2133 C15 4x8 GB 1.2V; G.Skill Ripjaws 4 DDR4-2133 C15 4x8 GB 1.2V
Memory Settings | JEDEC @ 2133
Video Cards | AMD R7 240 DDR3
Video Drivers | AMD Catalyst 13.11
Hard Drive | OCZ Vertex 3 256GB
Optical Drive | LG GH22NS50
Case | Open Test Bed
Operating System | Windows 7 64-bit SP1
Many thanks to...
We must thank the following companies for kindly providing hardware for our test bed:
Thank you to OCZ for providing us with PSUs and SSDs.
Thank you to G.Skill for providing us with memory.
Thank you to Corsair for providing us with an AX1200i PSU.
Thank you to MSI for providing us with the NVIDIA GTX 770 Lightning GPUs.
Thank you to Rosewill for providing us with PSUs and RK-9100 keyboards.
Thank you to ASRock for providing us with some IO testing kit.
Thank you to Cooler Master for providing us with Nepton 140XL CLCs.
Thank you to GIGABYTE Server for loaning us some CPUs and Dynatron CPU coolers.
Load Delta Power Consumption
Power consumption was tested on the system while in a single MSI GTX 770 Lightning GPU configuration, with a wall meter connected to the OCZ 1250W power supply. This power supply is Gold rated, and as I am in the UK on a 230-240 V supply, this leads to ~75% efficiency above 50W and 90%+ efficiency at 250W, suitable for both idle and multi-GPU loading. This method of power reading allows us to compare the power management of the UEFI and the board's ability to supply components with power under load, and includes typical PSU losses due to efficiency.
We take the power delta between idle and load as our tested value, giving an indication of the power increase from the CPU when placed under stress.
44 Comments
martinpw - Thursday, November 20, 2014
Of interest - Intel did a custom version of the 2698 called the 2698A for Lenovo which pushes the base clock from 2.3 to 2.8GHz. It is water cooled.
aka_Warlock - Thursday, November 20, 2014
14 core gaming benchmarks!???!! Wtf?? Who the f*** cares?? Where are the virtualization benches??? Ffs... This is just too god damn poor.
wallysb01 - Thursday, November 20, 2014
And I’m pretty sure this has been a main complaint throughout this whole E5-2600v3 review, but no action to fix it.
SanX - Thursday, November 20, 2014
You take 4790k overclocked to 5GHz and it will win most of single chip benchmarks lol
name99 - Thursday, November 20, 2014
"If it happens in a core as part of the logic or caches, that core can be fused off and the die can be sold as a lower core part. This is how yields are improved, by reusing the dies that have errors in removable sections."
Is this REALLY the case? We've heard this claim for years, but the only case I know of where it's actually been verified is when AMD was selling 3-core CPUs harvested from 4-core dies.
The reason I am dubious is that, if this is such a great system for increasing profits, how come the ONLY place it ever seems to be used is in Intel's server models?
POWER does not sell dies with random numbers of cores. (Or more precisely they talk about 6, 8, 10 and 12 processor variants but, as far as I can tell, each one of these is a separate die. Certainly they have showed the die shots for the 6 processor version.)
ARM does not do this. There are, as far as I know, no ARM phones that save a buck by using three cores harvested from a four core die, even though that would make sense for many markets and many use cases.
Even Intel does not do this outside Xeons. It's the same story as ARM --- given Intel's crazy market segmentation, again there'd certainly be a market for 3-core i7's harvested from 4-core i7's and priced appropriately.
looncraz - Friday, November 21, 2014
From my understanding, some higher models with defects are binned to lower models even if specific dies exist for those lower models.
For instance, it could be the case that some i3s are binned i5s, but it would probably be only a few percent of the total number of chips.
chekk - Thursday, November 20, 2014
Power consumption delta is still undesirable. Please Ian, Ryan et al, go back to separate idle and load measurements.
kordian - Friday, November 21, 2014
Again, why are there no database scalability benchmarks or virtualization benchmarks? Redis is single-threaded, so of course a high-clocked desktop processor will win - it doesn't take advantage of the main strength of these processors - their huge number of cores. Can we please drop the gaming benchmarks on these and get some benchmarks that reflect the use of a 36 core dual socket box.
antialienado - Friday, November 21, 2014
This is an important idea that Anandtech should implement to differentiate itself and add value: Create an Anandtech designation for processors.
It could be based on performance, and some common metrics, number of cores, and many features, but it should be consistent between generations.
Today it is easy to get lost. I lost track of many, many meaningless names. Haswell, Ivy Bridge, Sandy Bridge, Vishera, E5-2695 V3 mean nothing to me. Too many generations, too many names, conveying too little information.
When I see an article title saying E5-2697 V3, I don’t get a clue about whether this is a portable processor, a desktop, bang for the buck, server, a top processor, a review worth my time, or just a random processor being reviewed for the sake of making content.
Ok, the title on THIS article says at least that it is a 14 cores Xeon processor, but that is not the point. The point is that ANY article should be clear about what it does review. It does not matter if the title spoils the conclusion by telling the processor performance right on the title. I want information, not a mystery story. The article will explain why it got the designation it got.
There are many dimensions explaining the performance of each processor. Single threaded performance, multi threaded, energy efficient, portable at notebook level, or at tablet/phone level, locked/unlocked, gaming, office, server, workstation, socket compatibility, GPU integrated, etc.
But processor naming make little or no attempt to clarify any of those variables.
I would prefer sockets being referred to by year of introduction, features, type of memory, etc; not by another meaningless jungle of naming. I already lost track of so many designations, and each year it gets worse, more varied and confusing.
Anandtech already has some classifications of GPUs and processors based on similar performance. They should be converted into an Anandtech rating, for clarity.
Anandtech should create a naming scheme that makes no distinction between AMD and Intel when it does not matter (like on performance).
Of course, it would necessarily have unavoidable flaws, but any system would be better than the official naming, with its lack of coherency between generations and manufacturers.
Maybe other sites start adopting Anandtech designation.
dave1231 - Friday, November 21, 2014
$0.10 per kWh? Try $0.28 per kWh in the UK, then nearly triple the running costs. This is why we are so much better at not global warming.