NVIDIA Ships First Volta-based DGX Systems
by Nate Oh on September 7, 2017 10:00 AM EST
On Wednesday, NVIDIA announced that it has shipped its first commercial Volta-based DGX-1 system to the MGH & BWH Center for Clinical Data Science (CCDS), a Massachusetts-based research group focused on AI and machine learning applications in healthcare. In a sense, this serves as a generational upgrade, as CCDS was one of the first research institutions to receive a Pascal-based first-generation DGX-1 last December. In addition, NVIDIA is shipping a DGX Station to CCDS later this month.
At CCDS, these AI supercomputers will continue to be used to train deep neural networks for evaluating medical images and scans, drawing on Massachusetts General Hospital's collection of phenotypic, genetic, and imaging data. In turn, this can assist doctors and medical practitioners in making faster and more accurate diagnoses and treatment plans.
First announced at GTC 2017, the DGX-1V server is powered by 8 Tesla V100s and priced at $149,000. The original iteration of the DGX-1 was priced at $129,000 with a 2P 16-core Haswell-EP configuration, but has since been updated to the same 20-core Broadwell-EP CPUs found in the DGX-1V, allowing for easy P100-to-V100 drop-in upgrades. As for the DGX Station, also unveiled at GTC 2017, it is essentially a full-tower, 1P workstation version of the DGX-1 with 4 Tesla V100s. The water-cooled DGX Station is priced at $69,000.
Selected NVIDIA DGX Systems Specifications

| | DGX-1 (Volta) | DGX-1 (Pascal) | DGX-1 (Pascal, Original) | DGX Station |
|---|---|---|---|---|
| GPU Configuration | 8x Tesla V100 | 8x Tesla P100 | 8x Tesla P100 | 4x Tesla V100 |
| GPU FP16 Compute (General Purpose) | 240 TFLOPS | 170 TFLOPS | 170 TFLOPS | 120 TFLOPS |
| GPU FP16 Compute (Deep Learning) | 960 TFLOPS | N/A | N/A | 480 TFLOPS |
| CPU Configuration | 2x Intel Xeon E5-2698 v4 (20-core, Broadwell-EP) | 2x Intel Xeon E5-2698 v4 (20-core, Broadwell-EP) | 2x Intel Xeon E5-2698 v3 (16-core, Haswell-EP) | 1x Intel Xeon E5-2698 v4 (20-core, Broadwell-EP) |
| System Memory | 512 GB DDR4-2133 (LRDIMM) | 512 GB DDR4-2133 (LRDIMM) | 512 GB DDR4-2133 (LRDIMM) | 256 GB DDR4 (LRDIMM) |
| Total GPU Memory | 128 GB HBM2 (8x 16 GB) | 128 GB HBM2 (8x 16 GB) | 128 GB HBM2 (8x 16 GB) | 64 GB HBM2 (4x 16 GB) |
| Storage | 4x 1.92 TB SSD (RAID 0) | 4x 1.92 TB SSD (RAID 0) | 4x 1.92 TB SSD (RAID 0) | OS: 1x 1.92 TB SSD; Data: 3x 1.92 TB SSD (RAID 0) |
| Networking | Dual 10GbE, 4x InfiniBand EDR | Dual 10GbE, 4x InfiniBand EDR | Dual 10GbE, 4x InfiniBand EDR | Dual 10Gb LAN |
| Max Power | 3200W | 3200W | 3200W | 1500W |
| Dimensions | 866mm x 444mm x 131mm (3U Rackmount) | 866mm x 444mm x 131mm (3U Rackmount) | 866mm x 444mm x 131mm (3U Rackmount) | 518mm x 256mm x 639mm (Tower) |
| Other Features | Ubuntu Linux Host OS, DGX Software Stack (see Datasheet) | Ubuntu Linux Host OS, DGX Software Stack (see Datasheet) | Ubuntu Linux Host OS, DGX Software Stack (see Datasheet) | Ubuntu Desktop Linux OS, DGX Software Stack (see Datasheet), 3x DisplayPort |
| Price | $149,000 | Varies | $129,000 | $69,000 |
Taking a step back, this shipment continues NVIDIA's rollout of Volta-based professional/server products, with the DGX systems meeting their Q3 launch date and OEM Volta targeted at Q4. In recent months, the first Tesla V100 GPU accelerators were given out to researchers at the 2017 Conference on Computer Vision and Pattern Recognition (CVPR) in July, while a PCIe version of the Tesla V100 was formally announced during ISC 2017 in June.
Source: NVIDIA
48 Comments
bug77 - Thursday, September 7, 2017
In the light of the Vega release, Nvidia has postponed all consumer Volta products to 2018. And even then, I expect we'll see high-end around March and mid-range God knows when. Could be anything between June and the holiday season.
Drumsticks - Thursday, September 7, 2017
When was it ever postponed? Who says consumer Volta (presumably GV104) was ever intended for 2017?
Nvidia has never (in recent history at least, to my knowledge) launched two flagship parts that "replace" each other in the same year. The Titans get updated about once a year, and a G"X" 104/102 generally comes out once a year. The ONLY exception was the GTX 780 and 780 Ti, which both came out in 2013, but that doesn't really count since it was the same chip (GK110); the GTX 780 was just very disabled. I don't think we've ever seen a flagship-level x80 or x80 Ti replaced in the same year it launched, so it seems kind of strange to expect Nvidia to do it now with a brand new architecture.
vladx - Thursday, September 7, 2017
Exactly, people claiming that stuff above about Volta are mostly AMD fans looking to divert attention from AMD's failures to Nvidia.
Hurr Durr - Thursday, September 7, 2017
Weird strategy, considering how much nV is steamrolling them.
Zingam - Thursday, September 14, 2017
After years of AMD I got NVIDIA once again, and I was greeted by critical driver bugs immediately. Whose fans and whose failures?
DanNeely - Thursday, September 7, 2017
Where're the CPUs on that board? The 8 copper heat sinks are presumably the 8x V100 chips. The only other sinks I see are the 4 at the front; and those look way too small to cool a CPU even aside from the spec table saying there're only 2 CPUs not 4 in the system.
Nate Oh - Thursday, September 7, 2017
Like the Pascal-based DGX-1s, the CPUs are on their own board, connected by IB EDR.[1][2] Photos of other components were not shown in the CCDS PR photos.
[1] http://www.anandtech.com/show/10229/nvidia-announc...
[2] http://images.anandtech.com/doci/10229/DGX1Parts.j...
Nate Oh - Thursday, September 7, 2017
Whoops, I didn't mean to say IB EDR connects the CPU board to the GPU board.
Arbie - Friday, September 8, 2017
Bet you wish this forum had "edit" buttons...
Bullwinkle J Moose - Thursday, September 7, 2017
Is there a disconnect in the general public understanding of this tech? For example....
Quote
"At CCDS, these AI supercomputers will continue to be used in training deep neural networks for the purpose of evaluating medical images and scans"
O.K. but once the data is in hand, will doctors be able to access the power of all that training with general purpose software on any standard PC?
I was very impressed with the power of A.I. that can be had for free recently when I downloaded a trial copy of iZotope RX6
An offline dual-core 35 Watt Sandy Bridge can do audio processing now that was impossible just a year ago (go watch some YouTube videos)
Once the training data is in hand, end users can access that data with properly coded software on any PC without any need for these new systems
The initial cost for these systems (as I understand it) is to get the training data needed
But once the data is available, a general purpose PC can access that training data in the middle of nowhere and without an Internet connection as long as they have the software capable of properly using that data
Is that correct?
In other words, will the true power of all this training data be available to the masses with properly coded software to make a better World, or will greedy Corporations hoard the data so they can play God and profit at the expense of everyone else ?
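The workflow the commenter describes — train once on expensive GPU hardware, then run inference anywhere on the saved weights — is broadly how it works, with the caveat that what gets shipped is the trained model (the learned weights), not the raw training data. A minimal sketch of the pattern, using NumPy and a toy logistic-regression "network" with made-up data (hypothetical stand-ins for a real trained network), might look like this:

```python
import io
import numpy as np

# --- "Training" phase: in practice this is the expensive step that runs on
# GPU systems like the DGX-1; here it is a toy logistic-regression model
# fitted to synthetic, linearly separable data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # toy labels: sign of x1 + x2

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(500):  # plain gradient descent on the logistic loss
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * float(np.mean(p - y))

# Serialize the learned weights -- this small artifact is all a deployed
# application needs; the training hardware is no longer involved.
buf = io.BytesIO()
np.savez(buf, w=w, b=b)
buf.seek(0)

# --- "Inference" phase: any offline PC can load the weights and predict. ---
params = np.load(buf)

def predict(x):
    """Classify a sample using only the saved weights (no training code)."""
    return 1.0 / (1.0 + np.exp(-(x @ params["w"] + params["b"]))) > 0.5
```

Inference is just a handful of multiply-adds per sample, which is why a modest CPU handles it fine; it is the training loop (run millions of times over huge datasets in real systems) that justifies hardware like the DGX-1.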