In 2014/2015, it took NVIDIA six months from the launch of the Maxwell 2 architecture to get GTX Titan X out the door. All things considered, that was a fast turnaround for a new architecture. However, now that we're in the Pascal generation, it turns out NVIDIA is in the mood to set a speed record, and in more ways than one.

Announced this evening by Jen-Hsun Huang at an engagement at Stanford University is the NVIDIA Titan X, NVIDIA’s new flagship video card. Based on the company’s new GP102 GPU, it’s launching in less than two weeks, on August 2nd.

NVIDIA GPU Specification Comparison

| | NVIDIA Titan X | GTX 1080 | GTX Titan X | GTX Titan |
|---|---|---|---|---|
| CUDA Cores | 3584 | 2560 | 3072 | 2688 |
| Texture Units | 224? | 160 | 192 | 224 |
| ROPs | 96? | 64 | 96 | 48 |
| Core Clock | 1417MHz | 1607MHz | 1000MHz | 837MHz |
| Boost Clock | 1531MHz | 1733MHz | 1075MHz | 876MHz |
| TFLOPs (FMA) | 11 TFLOPs | 9 TFLOPs | 6.6 TFLOPs | 4.7 TFLOPs |
| Memory Clock | 10Gbps GDDR5X | 10Gbps GDDR5X | 7Gbps GDDR5 | 6Gbps GDDR5 |
| Memory Bus Width | 384-bit | 256-bit | 384-bit | 384-bit |
| VRAM | 12GB | 8GB | 12GB | 6GB |
| FP64 | 1/32 | 1/32 | 1/32 | 1/3 |
| FP16 (Native) | 1/64 | 1/64 | N/A | N/A |
| INT8 | 4:1 | ? | ? | ? |
| TDP | 250W | 180W | 250W | 250W |
| GPU | GP102 | GP104 | GM200 | GK110 |
| Transistor Count | 12B | 7.2B | 8B | 7.1B |
| Die Size | 471mm2 | 314mm2 | 601mm2 | 551mm2 |
| Manufacturing Process | TSMC 16nm | TSMC 16nm | TSMC 28nm | TSMC 28nm |
| Launch Date | 08/02/2016 | 05/27/2016 | 03/17/2015 | 02/21/2013 |
| Launch Price | $1200 | MSRP: $599, Founders Edition: $699 | $999 | $999 |

Let’s dive right into the numbers, shall we? The NVIDIA Titan X will be shipping with 3584 CUDA cores. Assuming that NVIDIA retains their GP104-style consumer architecture here – and there’s every reason to expect they will – then we’re looking at 28 SMs, or 40% more than GP104 and the GTX 1080.
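
For those who want to follow along with the math, here's a quick back-of-the-envelope check of that SM count, assuming the GP104-style consumer layout of 128 CUDA cores per SM (which NVIDIA has not yet confirmed for GP102):

```python
# Back-of-envelope SM count, assuming the GP104-style consumer Pascal layout
# of 128 CUDA cores per SM (not yet confirmed for GP102).
CORES_PER_SM = 128

titan_x_cores = 3584    # NVIDIA Titan X (Pascal)
gtx_1080_cores = 2560   # GTX 1080 (GP104)

titan_x_sms = titan_x_cores // CORES_PER_SM    # 28 SMs
gtx_1080_sms = gtx_1080_cores // CORES_PER_SM  # 20 SMs

print(titan_x_sms, gtx_1080_sms)                          # 28 20
print(f"{titan_x_sms / gtx_1080_sms - 1:.0%} more SMs")   # 40% more SMs
```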

It’s interesting to note here that 3584 CUDA cores happens to be the exact same number of CUDA cores also found in the Tesla P100 accelerator. These products are based on very different GPUs, but I bring this up because Tesla P100 did not use a fully enabled GP100 GPU; its GPU features 3840 CUDA cores in total. NVIDIA is not confirming the total number of CUDA cores in GP102 at this time, but if it’s meant to be a lightweight version of GP100, then this may not be a fully enabled card. This would also maintain the 3:2:1 ratio between GP102/104/106, as we saw with GM200/204/206.
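
As a quick sanity check on that ratio, here is the same arithmetic for both generations; note that the 3840-core figure for GP102 is an assumption carried over from GP100, not a confirmed spec:

```python
# Full-die CUDA core counts; the GP102 figure is assumed equal to GP100's
# 3840 cores, which NVIDIA has not confirmed.
pascal  = {"GP102": 3840, "GP104": 2560, "GP106": 1280}
maxwell = {"GM200": 3072, "GM204": 2048, "GM206": 1024}

for family in (pascal, maxwell):
    smallest = min(family.values())
    print({gpu: cores // smallest for gpu, cores in family.items()})
# {'GP102': 3, 'GP104': 2, 'GP106': 1}
# {'GM200': 3, 'GM204': 2, 'GM206': 1}
```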

On the clockspeed front, Titan X will be clocked at 1417MHz base and 1531MHz boost. This puts the total FP32 throughput at 11 TFLOPs (well, 10.97…), 24% higher than GTX 1080. In terms of expected performance, NVIDIA isn’t offering any comparisons to GTX 1080 at this time, but relative to the Maxwell 2 based GTX Titan X, they are talking about an up to 60% performance boost.
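
That throughput figure falls out of the standard peak-FLOPS formula, counting an FMA as two floating point operations per CUDA core per clock; a minimal sketch:

```python
def peak_fp32_tflops(cuda_cores, boost_clock_mhz):
    """Peak FP32 throughput: 2 FLOPs per core per clock (an FMA = multiply + add)."""
    return cuda_cores * 2 * boost_clock_mhz * 1e6 / 1e12

titan_x  = peak_fp32_tflops(3584, 1531)  # ~10.97 TFLOPs
gtx_1080 = peak_fp32_tflops(2560, 1733)  # ~8.87 TFLOPs

print(f"{titan_x:.2f} vs {gtx_1080:.2f} TFLOPs ({titan_x / gtx_1080 - 1:.0%} ahead)")
# 10.97 vs 8.87 TFLOPs (24% ahead)
```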

Feeding the beast that is GP102 is a 384-bit GDDR5X memory bus. NVIDIA will be running Titan X’s GDDR5X at the same 10Gbps as on GTX 1080, so we’re looking at a straight-up 50% increase in memory bus size and resulting memory bandwidth, bringing Titan X to 480GB/sec.
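
The bandwidth figure is simply bus width times per-pin data rate; a one-line check:

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/sec: bus width (bits) x per-pin rate (Gbps) / 8."""
    return bus_width_bits * data_rate_gbps / 8

titan_x  = peak_bandwidth_gbs(384, 10)  # 480 GB/sec
gtx_1080 = peak_bandwidth_gbs(256, 10)  # 320 GB/sec
print(titan_x, gtx_1080, f"+{titan_x / gtx_1080 - 1:.0%}")  # 480.0 320.0 +50%
```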

At this point in time there are a few unknowns about the rest of the card's specifications. The ROP count and texture unit count have not been disclosed (and this is something NVIDIA rarely posts on their site anyhow), but based on GP104 and GP106, I believe it's safe to assume that we're looking at 96 ROPs and 224 texture units. To put this into numbers then, theoretical performance versus a GTX 1080 would be 24% more shading/texturing/geometry/compute performance, 50% more memory bandwidth, and 33% more ROP throughput. Or relative to GTX Titan X (Maxwell 2), 66% more shading/texturing/geometry/compute performance, 43% more memory bandwidth, and 42% more ROP throughput. Of course, none of this takes into account any of Pascal's architectural advantages, such as its newer delta color compression system.
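
For anyone who wants to reproduce those percentages, here is a rough sketch; it treats ROP throughput as ROP count times boost clock, and it takes the unconfirmed 96 ROP / 224 texture unit figures at face value:

```python
# Rough theoretical comparison; the Titan X ROP count is an assumption (96),
# and ROP throughput is approximated as ROPs x boost clock.
cards = {
    # name: (CUDA cores, ROPs, boost clock MHz, memory bandwidth GB/sec)
    "Titan X (Pascal)":      (3584, 96, 1531, 480.0),
    "GTX 1080":              (2560, 64, 1733, 320.0),
    "GTX Titan X (Maxwell)": (3072, 96, 1075, 336.0),
}

def compare(a, b):
    ca, ra, fa, ba = cards[a]
    cb, rb, fb, bb = cards[b]
    shading = (ca * fa) / (cb * fb) - 1
    rops    = (ra * fa) / (rb * fb) - 1
    memory  = ba / bb - 1
    print(f"{a} vs {b}: shading +{shading:.0%}, memory +{memory:.0%}, ROPs +{rops:.0%}")

compare("Titan X (Pascal)", "GTX 1080")               # +24%, +50%, +33%
compare("Titan X (Pascal)", "GTX Titan X (Maxwell)")  # +66%, +43%, +42%
```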

Meanwhile, like the past Titans, the new Titan X is a 250W card, putting it 70W (39%) above GTX 1080. Pictures released by NVIDIA, along with their spec sheet, confirm that the card will be powered by the typical 8-pin + 6-pin power connector setup. And speaking of pictures, the handful released so far confirm that the card will follow NVIDIA's current reference design, in the new GTX 1000 series triangular style. This means we're looking at a blower-based card - now clad in black for Titan X - using a vapor chamber setup like the GTX 1080 and past Titan cards.

The TDP difference between Titan X and GTX 1080 may also explain some of the rationale behind the performance estimates above. In the Maxwell 2 generation, GTX Titan X (250W) was rated 85W above GTX 980 (165W); for the Pascal generation, NVIDIA only gets another 70W. As power is the ultimate factor limiting performance, it stands to reason that NVIDIA can't increase performance over GTX 1080 (in the form of CUDA cores and clockspeeds) by as much as they could over GTX 980. There is always the option to go above 250W - Tesla P100 in mezzanine form goes to 300W - but for a PCIe card, 250W seems to be the sweet spot for NVIDIA.

Moving on, display I/O is listed as DisplayPort 1.4, HDMI 2.0b, and DL-DVI; NVIDIA doesn’t list the number of ports (and they aren’t visible in product photos), but I’d expect that it’s 3x DP, 1x HDMI, and 1x DL-DVI, just as with the past Titan X and GTX 1080.

From a marketing standpoint, it goes without saying that NVIDIA is pitching the Titan X as their new flagship card. What is interesting, however, is that it's not being classified as a GeForce card; rather, it's the amorphous "NVIDIA Titan X", neither Quadro, Tesla, nor GeForce. Since the first card's introduction in 2013, the GTX Titan series has always walked a fine line as a prosumer card, balanced between a relatively cheap compute card for workstations and an uber gaming card for gaming PCs.

That NVIDIA has removed this card from the GeForce family would seem to further cement its place as a prosumer card. On the compute front the company is separately advertising the card's 44 TOPs INT8 compute performance - INT8 being frequently used for neural network inference - which is something they haven't done before for GeForce or Titan cards. Though make no mistake: the company’s GeForce division is marketing the card and it’s listed on GeForce.com, so it is still very much a gaming card as well.
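
The 44 TOPs figure is consistent with the 4:1 INT8 ratio in the spec table above: the idea is that each lane can retire a 4-element INT8 dot product with a 32-bit accumulate per clock (8 integer operations), versus the 2 operations of an FP32 FMA. A rough check, with the per-lane operation emulated in plain Python purely for illustration:

```python
# Peak INT8 check: a 4-element INT8 dot product with 32-bit accumulate counts
# as 8 integer ops per lane per clock, vs 2 ops for an FP32 FMA (the 4:1 ratio).
cores, boost_mhz = 3584, 1531
fp32_tflops = cores * 2 * boost_mhz * 1e6 / 1e12   # ~10.97 TFLOPs
int8_tops   = cores * 8 * boost_mhz * 1e6 / 1e12   # ~43.9 TOPs, i.e. the quoted 44

def dot4_accumulate(a, b, acc):
    """Emulate the per-lane operation: dot(a[:4], b[:4]) + acc (INT8 inputs, INT32 result)."""
    return acc + sum(x * y for x, y in zip(a, b))

print(round(fp32_tflops, 2), round(int8_tops, 1))          # 10.97 43.9
print(dot4_accumulate([1, -2, 3, 4], [5, 6, 7, -8], 100))  # 82
```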

As for pricing and availability, NVIDIA's flagships have always been expensive, and the NVIDIA Titan X even more so. The card will retail for $1200, $200 more than the previous GTX Titan X (Maxwell 2), and $500 more than the NVIDIA-built GTX 1080 Founders Edition. Given the overall higher prices for the GTX 1000 series, this isn't something that surprises me, but nonetheless it means buying NVIDIA's best card just got a bit more expensive. Meanwhile, in a departure from previous generations, the card is only being sold directly by NVIDIA through their website. The company's board partners will not be distributing it, though system builders will still be able to include it.

Overall, the announcement of this new Titan card, its specifications, and its timing raises a lot of questions. Does GP102 have fast FP64/FP16 hardware, or is it purely a larger GP104, finally formalizing the long-anticipated divide between HPC and consumer GPUs? Just how much smaller is GP102 versus GP100? How has NVIDIA been able to compress their launch schedule by so much for the Pascal generation, launching 3 GPUs in the span of 3 months? These are all good questions I hope we'll get answers to, and with an August 2nd launch it looks like we won't be waiting too long.

Update 07/25: NVIDIA has given us a few answers to the questions above. We have confirmation that the FP64 and FP16 rates are identical to GP104, which is to say very slow, and primarily there for compatibility/debug purposes. With the exception of INT8 support, this is a bigger GP104 throughout.

Meanwhile we have a die size for GP102: 471mm2, which is 139mm2 smaller than GP100. Given that both (presumably) have the same number of FP32 cores, the die space savings and their implications are significant. This is about the best example we're ever going to get of the die space cost of the HPC features limited to GP100: NVLink, fast FP64/FP16 support, larger register files, etc. By splitting HPC and graphics/inference into two GPUs, NVIDIA can produce GP102 at what should be a significantly lower price (and higher yield), something they couldn't do until the market for compute products based on GP100 was self-sustaining.
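
To illustrate why that 139mm2 matters for cost and yield, here is a deliberately crude dies-per-wafer sketch; the defect density is an arbitrary illustrative value, not anything NVIDIA or TSMC has published:

```python
import math

def gross_dies(die_area_mm2, wafer_diameter_mm=300):
    """Crude gross-dies-per-wafer estimate (ignores edge loss and scribe lines)."""
    return math.pi * (wafer_diameter_mm / 2) ** 2 / die_area_mm2

def die_yield(die_area_mm2, defects_per_cm2):
    """Simple Poisson yield model: probability of zero defects landing on the die."""
    return math.exp(-defects_per_cm2 * die_area_mm2 / 100)

D0 = 0.2  # defects per cm^2 -- purely illustrative
for name, area in (("GP102", 471), ("GP100", 610)):
    good = gross_dies(area) * die_yield(area, D0)
    print(f"{name}: ~{gross_dies(area):.0f} candidate dies, "
          f"{die_yield(area, D0):.0%} yield, ~{good:.0f} good dies per wafer")
```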

Finally, NVIDIA has clarified the branding a bit. Despite GeForce.com labeling it "the world’s ultimate graphics card," NVIDIA this morning has stated that the primary market is FP32 and INT8 compute, not gaming. Though gaming is certainly possible - and I fully expect they'll be happy to sell you $1200 gaming cards - the tables have essentially been flipped from the past Titan cards, where they were treated as gaming first and compute second. This of course opens the door to a proper GeForce branded GP102 card later on, possibly with neutered INT8 support to enforce the market segmentation.

Comments

  • Rock1m1 - Friday, July 22, 2016 - link

    Given the lack of fanfare, I'm guessing even Nvidia isn't hiding the fact that the target audience for this card is very small. Also, shouldn't they have called it Titan X 2016 or Titan X2?
  • JamesAnthony - Friday, July 22, 2016 - link

    While this card is hugely faster than the original Titan at lots of things, I'm guessing that it's going to have the same 1/32 rate FP64 compute performance.

    So I guess for the actual jobs that need it, I'd be better off keeping my original Titan with its 1/3 rate FP64, possibly seeing how to run it as a secondary compute card, and putting in a 1080 (or waiting for a 1080 Ti) for the gaming part.

    I wish Nvidia would just stop playing all these games and let us have at least one card that can actually do everything, even if it is around the $3k to $4k mark. Just stop screwing around with us and let the highest end Quadro cards do full gaming as well as full compute; it's not like their drivers couldn't have different profile modes. But no, they think you will buy an M6000, a K80, and a 1080... nope. I bet if the highest end Quadro cards did it all, they would actually get more money overall.
  • Yojimbo - Friday, July 22, 2016 - link

    Being that it only has 12 GB of RAM and that they are pricing it at $1200, I think it could have FP16x2 units. The Tesla M40 is available with 24 GB of RAM and the successor to the M40 should have at least that amount. That allows a decent amount of market differentiation, I think. The question is, does it have FP16x2? In terms of deep learning, they seem to be pushing this as an inference card judging by the 44 TOPS int8 spec they list. The M40 is marketed as a training card, and for that they would want FP16x2. Do they plan on using a GP100 chip in a card to replace the M40? If not, then GP102 should have FP16x2, unless they don't plan on giving that market segment as large of a speed boost as they are able to.

    I wish I knew how many transistors the ROPs and the register files use. Then it would be possible to tell whether GP102 likely has DP cores on it, but I'm guessing it doesn't. It has the same number of SP cores as the cut-down GP100 in the P100 but uses 3.3 billion fewer transistors (it may be a fully enabled die, however, which the GP100 in the P100 isn't). I'm assuming GP100 does not have ROPs but has double the register file capacity of GP102.
  • Jackie60 - Friday, July 22, 2016 - link

    Nvidia are greedy fuckers but AMD are useless arseholes therefore we get this pricing, it's annoying but it's what you get from a capitalist system with little competition. Looking forward to the 1080Ti though. I suspect Nvidia are trying to hoover up sales before AMD drop big Vega which may be sooner than we think.
  • D. Lister - Friday, July 22, 2016 - link

    Woah, the sodium content is unhealthily high in this comment section. It is like I wandered into a bloody salt mine. :D

    Where in the woodwork were all these frugal consumers hiding when the "Pro Duo" was announced for $1500 a little while back (April 26th '16, here at AT)? A dual-GPU card, with 4GB usable VRAM, a 350W TDP, and a performance of <85%* (i.e., when Crossfire is working, otherwise <50%) of this single-GPU, 12GB VRAM, 250W TDP product.

    Wait, they weren't ALL hiding. Some of them were justifying the price tag by calling it a "content creation" card... one of course, with only 4GB VRAM. But hey, it is HBM so that's kewl, right? <hyperbole alert> And in a glorious future, DX12 will make it a full 8GB, and Vulcan will make it go faster than a rocket on crack, and 200 years from now it will be beating God himself in the AoTS bench, 'cuz AMD stuff gets better with time (duh). 1000 years from now, all AMD GPUs will unite to form a whole new God that all will fear, and all images of Jesus will have him look like Raja Koduri wearing a "Gaming Evolved(tm)" t-shirt. While only a few years from now, this Nvidia GPU will be obsolescence-d into the very depths of hell, and will probably jump out of your system eventually and rape your dog, and so on and so forth. Yeah sure, keep at it guys - don't ever let facts slow you down :P. If only the actual employees of AMD were as dedicated to the brand as you are, the company would be unstoppable.

    * - estimated values based on specs, though I wouldn't be off by more than +/-5%.
  • roc1 - Friday, July 22, 2016 - link

    Hilarious. And true :-)
  • K_Space - Friday, July 22, 2016 - link

    Does the Pro Duo still sell? Even the red fans thought it was too expensive and outright dumb to buy for gaming given the ill timing (if we are reading the same comments section). I sport 2x 295s and no way would I recommend it to anyone given how 16nm was just around the corner. Daniel himself called it out for commanding a $500 premium over 2x Nano (although I haven't seen the same re: this vs a very sensible 2x 1070).
    I agree with the sentiment regarding the rabid fanboyism here. When I first visited the site in 2006 the comment section was much smaller but had content as useful as the articles, and I learnt a lot. I guess it's the price we all pay for a wider demographic. Thanks to all the sparks that keep pumping some sense and value into the comment section.
    Can anyone point me in the direction of a useful article detailing nVidia core naming? I find GP100, 104, 106 all confusing, and Ryan's roadmap article (http://www.anandtech.com/show/7900/nvidia-updates-... didn't really get into it at all?
  • K_Space - Friday, July 22, 2016 - link

    Please ignore, I'm reminded that Ryan mentioned in the latest 1080 review that he'll be detailing it in a separate article. Looking forward to it.
  • K_Space - Friday, July 22, 2016 - link

    In fact, totally ignore that: the section on FP16 throughput on GP104 has completely answered my questions. Thanks Ryan
  • MarkieGcolor - Monday, July 25, 2016 - link

    I've had Nano CrossFire for a year. It cost me $1000. It still beats a 1080 in Time Spy. Not a bad investment for me IMO. The Pro Duo was way too late
