Last month I toured Oak Ridge National Laboratory and saw the final stages of assembly of the Titan supercomputer. Titan brings together 18,688 compute nodes, each complete with a 16-core AMD Opteron 6274 CPU and an NVIDIA Tesla K20X GPU. The total core count ends up at 299,008 AMD x86 CPU cores and 50,233,344 NVIDIA GPU cores. Each CPU gets 32GB of DDR3 while each GPU is paired with 6GB of GDDR5 for a total of 710TB of memory in Titan altogether. The entire machine will use as much as 9 megawatts of power under full load.
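
For readers who want to check the totals, here is a minimal sketch that rebuilds them from the per-node figures quoted above. The per-GPU core count of 2,688 is not stated in the text; it is an assumption obtained by dividing the quoted GPU core total by the node count. The memory total is taken in decimal terabytes, which is what matches the 710TB figure.

```python
# Per-node figures as quoted in the article.
NODES = 18_688
CPU_CORES_PER_NODE = 16      # AMD Opteron 6274
CPU_MEM_GB_PER_NODE = 32     # DDR3
GPU_MEM_GB_PER_NODE = 6      # GDDR5

# Assumption: not stated in the article, derived as 50,233,344 / 18,688.
GPU_CORES_PER_NODE = 2_688   # CUDA cores per Tesla K20X

cpu_cores = NODES * CPU_CORES_PER_NODE
gpu_cores = NODES * GPU_CORES_PER_NODE
total_mem_gb = NODES * (CPU_MEM_GB_PER_NODE + GPU_MEM_GB_PER_NODE)
total_mem_tb = total_mem_gb / 1000  # decimal terabytes

print(f"CPU cores: {cpu_cores:,}")
print(f"GPU cores: {gpu_cores:,}")
print(f"Memory:    {total_mem_gb:,} GB (about {total_mem_tb:.0f} TB)")
```

The CPU and GPU core totals come out exactly as quoted, and the memory total is 710,144GB, which rounds to the 710TB figure.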

I described some of the types of applications that will run on Titan in our earlier article, but one of the first applications that Titan was tuned for was the LINPACK benchmark. The Top500 list of the world's fastest supercomputers, as measured by LINPACK, is updated twice a year: in June and November. The Titan upgrade was completed just in time to tune and run LINPACK on the machine and submit an official score. As with any new system there's always the threat of hardware or software issues, but luckily the teams working on Titan were able to upgrade all 18,688 compute nodes and get the system stable and running in time to meet the November deadline for submissions.

Rows of Titan cabinets

The result is very impressive: a score of 17.59 petaflops in the LINPACK benchmark while drawing 8.21MW of power. Titan's first LINPACK score puts it in first place on the Top500 list of supercomputing sites. Number two on the list is IBM's BlueGene/Q system, which uses PowerPC A2 CPUs and delivers 16.33 petaflops.
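
To put those scores in perf-per-watt terms, here is a rough sketch of the conversion. The Titan figures are the ones quoted above. The BlueGene/Q power draw of 7.89MW is not from the article body; it is the Top500 figure (7890.00 kW) that a reader quotes in the comments. The helper name `gflops_per_watt` is purely illustrative.

```python
def gflops_per_watt(petaflops: float, megawatts: float) -> float:
    """Convert a LINPACK score and a power draw into GFLOPS per watt."""
    flops = petaflops * 1e15
    watts = megawatts * 1e6
    return flops / watts / 1e9


titan = gflops_per_watt(17.59, 8.21)
# Assumption: 7.89 MW is the Top500 power figure cited in the comments.
bluegene_q = gflops_per_watt(16.33, 7.89)

print(f"Titan:      {titan:.2f} GFLOPS/W")
print(f"BlueGene/Q: {bluegene_q:.2f} GFLOPS/W")
print(f"Titan lead: {(titan / bluegene_q - 1) * 100:.1f}%")
```

On these inputs Titan lands at roughly 2.14 GFLOPS/W against roughly 2.07 GFLOPS/W for BlueGene/Q, so the efficiency gap is a few percent rather than a generational leap.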

20 Comments

  • Jorange - Monday, November 12, 2012 - link

    I need 17 petaflops in my PC, 2025 I'm waiting!
  • tviceman - Monday, November 12, 2012 - link

    Does anyone know what the BlueGene/Q power draw is when performing the same LINPACK benchmark? Just curious to see what / if any perf/watt improvements Titan makes over BlueGene/Q
  • Jorange - Monday, November 12, 2012 - link

    http://www.top500.org/system/177556

    7890.00 kW
  • Gnarr - Monday, November 12, 2012 - link

    BlueGene/Q 16325TFlop / 7890kW = 2.069kW
    Titan 17590TFlop / 8209kW = 2.143kW
  • Gnarr - Monday, November 12, 2012 - link

    that was supposed to be
    BlueGene/Q 16325TFlop / 7890kW = 2.069 TFlop/kW
    Titan 17590TFlop / 8209kW = 2.143 TFlop/kW
  • tviceman - Monday, November 12, 2012 - link

    Right, thanks. So Titan was barely able to edge out BlueGene/Q. IBM seems like more of a competitor to Nvidia in HPC than Intel or AMD!
  • Ktracho - Monday, November 12, 2012 - link

    That has been the case for the last several years, but the evidence is not the performance, but rather the number of systems IBM has in the top 10. It will be interesting to see if IBM can keep up. I understand they abandoned a contract they had with Univ. of IL, to Cray's benefit. Also, I wonder how easy it is to program IBM's BlueGene/Q. Anyone know?
  • Death666Angel - Monday, November 12, 2012 - link

    It's still "Flops" though, the s is part of the acronym, otherwise the unit doesn't make any sense. :D
  • Jorange - Monday, November 12, 2012 - link

    According to TOP 500 list BlueGene/Q has a theoretical max of 20 petaflops, but achieves 16 Pflops in Linpack. Titan has a max of 27 Pflops, yet 'only' achieves 17 Pflops in Linpack. It seems much harder to fully utilize the full power of CPU / GPU systems, maybe Intel is right about Xeon Phi.
  • Khato - Monday, November 12, 2012 - link

    Not necessarily - the systems with Xeon Phi are only averaging a similar efficiency of around 65% theoretical. The 'advantage' of Xeon Phi has always been comparative ease of programming, so we'll have to see if that actually plays out or not. The fact that there are already 6 entries in the TOP500 list using it, with Stampede at the #7 spot despite only being ~1/3 population is certainly a good start for a new product line.
