The new methodology

At AnandTech, giving you real-world measurements has always been the goal of this site. Contrary to the vast majority of IT sites out there, we don’t believe in letting some consultant or analyst spell it out for you. We give you our measurements, as close to the real world as possible. We give you our opinion based on those measurements, but ultimately it is up to you to decide how to interpret the numbers. If we make a mistake in our reasoning somewhere, you tell us in the comments; we investigate it and get back to you. It is a slow process, but we firmly believe in it. And that is what happened with our articles about “dynamic power management” and “testing low power CPUs”.

The former article was written to understand how the current power management techniques work. We needed a very simple, well-understood benchmark to keep the complexity down, and it allowed us to learn a lot about the current Dynamic Voltage and Frequency Scaling (DVFS) techniques that AMD and Intel use. But as we admitted, our Fritz Chess benchmark was and is not a good choice if you want to apply these new insights to your own datacenter.
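For readers who want to see DVFS at work on their own hardware, here is a minimal sketch, assuming a Linux host that exposes the standard cpufreq sysfs interface; it simply reads out the active governor and the current clock of each core, and is not the tooling we used in those articles.

```python
from pathlib import Path

# Read the Linux cpufreq sysfs entries: which DVFS governor is active and what
# frequency each core is currently running at (requires a kernel with cpufreq enabled).
for cpu in sorted(Path("/sys/devices/system/cpu").glob("cpu[0-9]*")):
    cpufreq = cpu / "cpufreq"
    if not cpufreq.is_dir():
        continue
    governor = (cpufreq / "scaling_governor").read_text().strip()
    cur_khz = int((cpufreq / "scaling_cur_freq").read_text())
    print(f"{cpu.name}: governor={governor}, current clock={cur_khz / 1000:.0f} MHz")
```

Watching these values while a benchmark ramps up and down makes the frequency and voltage transitions visible in a way that a throughput score alone does not.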

“Testing low power CPUs” went into much less depth, but used a real-world benchmark: our vApus Mark I, which simulates a heavy consolidated virtualization load. The numbers were very interesting, but the article had one big shortcoming: it only measured at 90-100% load or at idle. The reason is that the vApus benchmark score is based upon throughput, and to measure the throughput of a system, you have to stress it close to its maximum. So we could not measure performance accurately unless we pushed the system to its top performance. That is fine for an HPC workload, but not for a commercial virtualization/database/web workload.

Therefore we went for a different approach, based upon our readers' feedback. We launched “one tile” of the vApus benchmark on each of the tested servers. Such a tile consists of an OLAP database (4 vCPUs), an OLTP database (4 vCPUs) and two web VMs (2 vCPUs each), so in total we have 12 virtual CPUs. These 12 virtual CPUs are far fewer than what a typical high-end dual-CPU server can offer. From the point of view of the Windows 2008, Linux or VMware ESX scheduler, the best Xeon 5600 (“Westmere”) and Opteron 6100 (“Magny-Cours”) offer 24 logical or physical cores. To the hypervisor, those logical or physical cores are Hardware Execution Contexts (HECs), and the hypervisor schedules VMs onto these HECs. Typically, each of the 12 virtual CPUs needs somewhere between 50 and 90% of one core. Since we have twice as many cores or HECs as required, we expect the typical load on the complete system to hover between 25 and 45%. And although it is not perfect, this is much closer to the real world. Most virtualized servers never sit idle for long: with so many VMs, there is always something to do. System administrators also want to avoid CPU loads over 60-70%, as that can make response times go up exponentially.
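As a quick sanity check of that estimate, the sketch below simply works out the arithmetic from the numbers above (12 vCPUs, each needing 50 to 90% of one core, spread over 24 HECs); it is an illustration of the reasoning, not part of the benchmark itself.

```python
# Back-of-the-envelope check of the expected system load, using the numbers above:
# 12 virtual CPUs, each needing 50-90% of one core, scheduled over 24 HECs.
VCPUS = 12
HECS = 24

for per_vcpu_load in (0.50, 0.90):
    system_load = VCPUS * per_vcpu_load / HECS
    print(f"{per_vcpu_load:.0%} per vCPU -> {system_load:.0%} average system load")

# prints:
# 50% per vCPU -> 25% average system load
# 90% per vCPU -> 45% average system load
```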

There is more. Instead of measuring throughput, we focus on response time. At the end of the day, the maximum number of pages your server can serve is nice to know, but not all that important. The response time your system offers at a certain load matters much more: users appreciate low response times. Nobody is going to be happy about the fact that your server can serve up to 10,000 requests per second if each page takes 10 seconds to load.
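To make the idea concrete, here is a minimal sketch of measuring response time at a fixed, moderate request rate instead of hammering the server for maximum throughput. This is not the vApus stress-testing code; the URL, target rate and duration are placeholders you would replace with your own.

```python
import statistics
import time
import urllib.request

URL = "http://testserver.example/page"  # hypothetical endpoint, replace with your own
TARGET_RATE = 20    # requests per second to hold steady (assumed value)
DURATION = 30       # seconds to measure (assumed value)

interval = 1.0 / TARGET_RATE
latencies = []
deadline = time.perf_counter() + DURATION

while time.perf_counter() < deadline:
    start = time.perf_counter()
    with urllib.request.urlopen(URL) as response:  # one request, measured end to end
        response.read()
    latency = time.perf_counter() - start
    latencies.append(latency)
    # sleep off the remainder of the interval so the offered load stays constant
    time.sleep(max(0.0, interval - latency))

latencies.sort()
print(f"requests: {len(latencies)}")
print(f"average : {statistics.mean(latencies) * 1000:.1f} ms")
print(f"95th pct: {latencies[int(0.95 * len(latencies))] * 1000:.1f} ms")
```

Holding the request rate constant and reporting the average and 95th percentile latency tells you what users actually experience at that load, which is exactly the perspective the throughput-only approach misses.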

Comments

  • duploxxx - Friday, July 16, 2010 - link

    It is a very nice test, but these tests are always done with extremely low or extremely high bins, while the mass market isn't buying those parts. In the field you rather see huge piles of, for example, the E5620-30 series and the X5650; the result differences would be interesting to see, or at least for once provide an idea about the scaling within the portfolio. (I know this is a difficult one.)

    Although it is very interesting to see the differences within or between vendors, there is one major thing missing in all these tests. This review will give any interested IT person a good overview of what he could do about power consumption and performance, but what is left out here is the big influence of the OEM.

    OEMs have their own power savings/regulators/dynamics that influence the server a lot, both in the OS and the BIOS, often in a very bad way. So while it is an interesting article, most IT departments will never get the result they wanted due to the OEM implementation.
  • indiamap - Friday, July 16, 2010 - link

    I do agree 100% with the author. Johan is very clear on what he is most concerned about: the electricity bill. It’s very much true that electricity bills eat up a large amount of the income of Internet Marketing India (http://www.indiamaphosting.com/) companies, and it’s a huge burden from a business point of view. This can be tackled by opting for better hardware which is compatible with green energy and helps in cutting the electricity bill extensively, saving the organization a huge sum of money in the long run.
  • indiamap - Friday, July 16, 2010 - link

    This is what the internet companies were looking for: cost cutting. It is very essential from a business POV. The less the energy consumption, the more the investment in product research. Great numbers exposed. Thanks a lot, Johan. I wonder if this is why Google started electricity generating stations to power its massive data centers.

    For more information, please visit: http://www.indiamaphosting.com/
  • mino - Friday, July 16, 2010 - link

    Johan - great article.

    Keep it up !
  • Whizzard9992 - Monday, July 19, 2010 - link

    Can we please clean up the spam here? Where's the "REPORT" button?
  • Toadster - Friday, July 16, 2010 - link

    Given the results you've found, it would be great to see how power capping can influence the workloads as well. The latest Dell PowerEdge C platforms support Intel Intelligent Power Node Manager technology - would be fantastic to have a look!

    Keep up the great work - excellent article!
  • Casper42 - Saturday, July 17, 2010 - link

    So first off I have to say that using a home-built machine with an Asus mobo and trying to talk intelligently about datacenter power is not really a fair comparison. Stick to the big 3 (HP/Dell/IBM) when doing these kinds of comparisons.

    Now the real reason I posted was because you mentioned the speed of the memory you used, but made NO mention of the speed the memory was actually running at.

    With the Nehalem and Westmere Xeons, only the X series can run the memory at 1333 while the others (E and L series) start at 1066. When you run more than one bank of memory you can also see your memory frequency decline depending on the server vendor you are using. I think HP has a bit you can flip in their machines that will allow you to run 2 banks @ 1333 (again, assuming X Series proc) but if you don't turn that on, you step down from 1 bank @ 1333 to 2 @ 1066 and even 3 @ 800.

    The reason I bring this up is because you said yourself your machine was NOT CPU bound, and you weren't entirely sure why the tests completed with such different times. Well memory performance could be part of that equation.

    Lastly, you have to remember that not every server in a DC is running VMware/Hyper-V, and there are still tons of servers with basic Windows or Linux app/web workloads running right on the hardware. These kinds of servers on average run at less than 10% of the CPU max in a given day (there might be spikes for backups and other jobs, but the average is <10%).
    So if you had a rack with 20 2U servers and you didn't need VMware/SQL/Oracle levels of performance in those racks, why not run them with L series processors? Across an entire rack you would save a decent amount of power.

    PS: Where are you guys at AT located? Your "About Us" button up top has been useless for quite some time now. Not sure it could be pulled off, but you should really look into asking the big 3 for demo gear. Getting a Nehalem EX right now is damn near impossible but a Westmere EP would be doable. The problem here is they do loaners to get sales, not to get reviews, so what you really need to do is find some friends who work in IT at very large companies in your area who would be willing to let you get some wrench time on their demo equipment. 60-90 day loans are quite common.

    -Casper42
  • tjohn46 - Tuesday, July 20, 2010 - link

    I'm surprised I haven't seen anyone else make a similar comment yet: I've been curious about this for a long time, but I would rather see a comparison between 2 CPUs that are intended to be competitive.

    It looks like Intel changed things a bit with the 5600 series Xeons, but previously (including with the 5500s) Intel would match up model numbers, cores, and clock speeds; the model with the 'L' prefix would just have a lower TDP. I was always curious whether those performed just as well or not.

    For example:

    E5520 vs L5520
    E5530 vs L5530

    I see AMD also has some 80W and 65W comparable models if you do end up testing opterons.

    That would be the real "is it worth the processor price premium?" question in my opinion. Of course the high-end part, which doesn't have a comparable "low power" model, is going to perform better, but like someone else said, a typical data center tends to have many more of the midrange parts (like an E5530) installed, which also have lower-power counterparts at a $200-ish premium.
  • eva2000 - Saturday, July 31, 2010 - link

    Interesting to see how they compare when it comes to a Linux OS, i.e. CentOS 5.5 or Red Hat 5.5 :)
