NVIDIA Unveils “Titan RTX” Video Card: $2500 Turing Tensor Terror Out Later This Month


by Ryan Smith on December 3, 2018 8:00 AM EST


By this point we’ve seen most of NVIDIA’s 2018 Turing GPU product stack. After kicking things off with the Quadro RTX series, NVIDIA released a trio of consumer GeForce RTX cards, and following that the first Turing Tesla, the T4. However, as regular industry watchers are well aware, NVIDIA typically does one more high-end card in their product stack, and that’s the ever-popular Titan. Not quite a flagship card and not really a consumer card, the Titan nonetheless holds an interesting spot in NVIDIA’s lineup as the fastest card most mere mortals can get their hands on, and these days as NVIDIA’s prime workstation compute card.

Last year around this time we saw the launch of the Titan V at the Neural Information Processing Systems (NeurIPS) conference. It seems like that went well for the company, as they’ve once again picked that venue for the launch of their latest Titan card, the aptly named Titan RTX. Due to hit the streets a bit later this month, the card is set to be NVIDIA’s big bruiser for workstation compute and ray tracing users – and anyone else who wants to throw down $2500 for a video card.

NVIDIA Compute Accelerator Specification Comparison
	Titan RTX	Titan V	RTX 2080 Ti Founders Edition	Tesla V100 (PCIe)
CUDA Cores	4608	5120	4352	5120
Tensor Cores	576	640	544	640
Core Clock	1350MHz	1200MHz	1350MHz	?
Boost Clock	1770MHz	1455MHz	1635MHz	1370MHz
Memory Clock	14Gbps GDDR6	1.7Gbps HBM2	14Gbps GDDR6	1.75Gbps HBM2
Memory Bus Width	384-bit	3072-bit	352-bit	4096-bit
Memory Bandwidth	672GB/sec	653GB/sec	616GB/sec	900GB/sec
VRAM	24GB	12GB	11GB	16GB
L2 Cache	6MB	4.5MB	5.5MB	6MB
Single Precision	16.3 TFLOPS	13.8 TFLOPS	14.2 TFLOPS	14 TFLOPS
Double Precision	0.51 TFLOPS	6.9 TFLOPS	0.44 TFLOPS	7 TFLOPS
Tensor Performance (FP16 w/FP32 Acc)	130 TFLOPS	110 TFLOPS	57 TFLOPS	112 TFLOPS
GPU	TU102 (754mm²)	GV100 (815mm²)	TU102 (754mm²)	GV100 (815mm²)
Transistor Count	18.6B	21.1B	18.6B	21.1B
TDP	280W	250W	260W	250W
Form Factor	PCIe	PCIe	PCIe	PCIe
Cooling	Active	Active	Active	Passive
Manufacturing Process	TSMC 12nm FFN	TSMC 12nm FFN	TSMC 12nm FFN	TSMC 12nm FFN
Architecture	Turing	Volta	Turing	Volta
Launch Date	12/2018	12/07/2017	09/20/2018	Q3'17
Price	$2499	$2999	$1199	~$10000

By the numbers, the Titan RTX looks a lot like a more powerful GeForce RTX 2080 Ti. And while it’s not nearly as consumer-focused, this is certainly the most relatable way to look at it. The card is based on the same TU102 GPU as NVIDIA’s consumer flagship, but while the RTX 2080 Ti used a slightly cut-down version of the GPU, Titan RTX gets a fully enabled chip, similar to NVIDIA’s best Quadro cards. Indeed along with the GeForce comparisons, the card is also functionally very close to the Quadro RTX 6000. Which is to say that while the Titan RTX doesn’t really fall under the category of a flagship, it’s not a second-tier card: it’s as powerful and as fast as NVIDIA’s best TU102 cards, so it’s very much at the top of its game.

Looking at its place in the market, with the launch of the Titan V last year, NVIDIA shifted away from the idea of a “prosumer” Titan that was closer to a GeForce with more memory and slightly higher performance, and more towards the idea of a straight-up professional grade workstation card for non-graphics tasks. Using a cut-down version of the server-grade GV100 GPU, Titan V filled this spot nicely, though it did come with some of the baggage that a server-grade GPU entails. Now that NVIDIA is back to using something closer to a workstation-grade GPU in the TU102, NVIDIA has once again shifted the balance between their cards a bit. But the Titan RTX remains the company’s workstation compute card, and thanks to the Turing architecture’s ray-tracing capabilities, is also now being pitched as a ray-tracing card for content creators.

Drilling a bit deeper, there are really three legs to Titan RTX that set it apart from NVIDIA’s other cards, particularly the GeForce RTX 2080 Ti. Raw performance is certainly one of those; on paper we’re looking at about 15% better performance in shading, texturing, and compute, and around a 9% bump in memory bandwidth and pixel throughput.
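Those percentages can be reproduced from the spec table with a little arithmetic. The following is a minimal sketch rather than anything from NVIDIA: it assumes the usual convention of counting one fused multiply-add (2 FLOPs) per CUDA core per clock at the rated boost clock, and the function names are hypothetical.

```python
def peak_fp32_tflops(cuda_cores: int, boost_mhz: float) -> float:
    """Theoretical FP32 rate, assuming 1 FMA (2 FLOPs) per core per clock."""
    return cuda_cores * 2 * boost_mhz * 1e6 / 1e12


def memory_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/sec: per-pin data rate times bus width in bytes."""
    return data_rate_gbps * bus_width_bits / 8


# Core counts, boost clocks, and memory configurations taken from the spec table.
titan_fp32 = peak_fp32_tflops(4608, 1770)
ti_fp32 = peak_fp32_tflops(4352, 1635)
titan_bw = memory_bandwidth_gbs(14, 384)
ti_bw = memory_bandwidth_gbs(14, 352)

print(f"FP32: {titan_fp32:.1f} vs {ti_fp32:.1f} TFLOPS "
      f"({(titan_fp32 / ti_fp32 - 1) * 100:.0f}% higher)")
print(f"Bandwidth: {titan_bw:.0f} vs {ti_bw:.0f} GB/sec "
      f"({(titan_bw / ti_bw - 1) * 100:.0f}% higher)")
```

These are paper figures only; sustained clockspeeds in real workloads will move the actual gap around.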

However, arguably the linchpin to NVIDIA’s true desired market of data scientists and other compute users is the tensor cores. These are present on all of NVIDIA’s Turing cards and are the heart and soul of NVIDIA’s success in the AI/neural networking field, but NVIDIA gave the GeForce cards a singular limitation that is nonetheless very important to the professional market. In their highest-precision FP16 mode, Turing’s tensor cores are capable of accumulating at FP32 for greater precision; however, on the GeForce cards this operation is limited to half-speed throughput. This limitation has been removed for the Titan RTX, and as a result it’s capable of full-speed FP32 accumulation throughput on its tensor cores.

NVIDIA Turing Tensor Core Relative Performance
	Titan	GeForce	Quadro	Titan (Volta)
FP16 w/FP32 Accumulate	1x	0.5x	1x	1x
FP16 w/FP16 Accumulate	1x	1x	1x	1x
INT8	1x	1x	1x	N/A
INT4	1x	1x	1x	N/A
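As a rough cross-check, the tensor figures in the spec table follow from the tensor core count and boost clock, provided one assumes each tensor core performs 64 FP16 fused multiply-adds per clock (the rate NVIDIA quotes for its Volta and Turing tensor cores) and then applies the relative rate from the table above. The helper name and the `accumulate_rate` parameter are hypothetical, used here only for illustration.

```python
# Assumption: 64 FP16 FMAs per tensor core per clock, 2 FLOPs per FMA.
FMA_PER_TENSOR_CORE_PER_CLOCK = 64


def tensor_tflops(tensor_cores: int, boost_mhz: float,
                  accumulate_rate: float = 1.0) -> float:
    """Theoretical FP16 tensor throughput, scaled by the accumulate-mode rate."""
    flops_per_clock = tensor_cores * FMA_PER_TENSOR_CORE_PER_CLOCK * 2
    return flops_per_clock * boost_mhz * 1e6 * accumulate_rate / 1e12


# Titan RTX: full-rate FP32 accumulate.
print(f"Titan RTX:      {tensor_tflops(576, 1770):.0f} TFLOPS")
# RTX 2080 Ti Founders Edition: FP32 accumulate runs at half rate.
print(f"RTX 2080 Ti FE: {tensor_tflops(544, 1635, accumulate_rate=0.5):.0f} TFLOPS")
```

Under these assumptions the two results land on the 130 TFLOPS and 57 TFLOPS figures listed for FP16 with FP32 accumulate.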

Given that NVIDIA’s tensor cores have nearly a dozen modes, this may seem like an odd distinction to make between the GeForce and the Titan. However, for data scientists it’s quite important: FP32 accumulate is frequently necessary for neural network training, as FP16 accumulate doesn’t have enough precision, especially in the big-money fields that will shell out for cards like the Titan and the Tesla. So this small change is a big part of the value proposition to data scientists, as NVIDIA does not offer a cheaper card with the chart-topping 130 TFLOPS of tensor performance that Titan RTX can hit.
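To see why accumulator precision matters, here is a small illustrative sketch. It does not model how tensor cores are actually implemented; it simply emulates rounding a running sum to FP16 or FP32 after every addition, using the half- and single-precision formats in Python's `struct` module. The input values and helper names are arbitrary choices made for the demonstration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(values, round_fn):
    """Sum values, rounding the running total after every addition."""
    total = 0.0
    for v in values:
        total = round_fn(total + v)
    return total


# 10,000 small FP16 terms, standing in for the products of a long dot product.
terms = [to_fp16(0.01)] * 10_000

fp16_total = accumulate(terms, to_fp16)
fp32_total = accumulate(terms, to_fp32)

print(f"FP16 accumulator: {fp16_total}")
print(f"FP32 accumulator: {fp32_total}")
```

With these inputs the FP16 accumulator stops growing once each new term is smaller than half the spacing between adjacent FP16 values, while the FP32 accumulator stays close to the expected total of about 100.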

Similarly, the final leg for the Titan RTX is memory capacity. Whereas the GeForce RTX 2080 Ti is an 11GB card, Titan RTX is a 24GB card. For gamers even 11GB is generally overkill; for compute users, however, the extra 13GB of VRAM can make or break a large dataset. NVIDIA knows their market very well, and as we’ve seen time and time again, has market segmentation down to a fine art.

Market positioning aside, the launch of the Titan RTX also means that the rest of the tensor performance benefits are finally coming to a Titan-level card. Turing introduced support for lower precision modes, which help to further set apart the Titan RTX from last year’s Titan V. Overall, data scientists who would otherwise be looking at a Titan V are looking at a doubling in VRAM capacity, a 20% improvement in tensor performance – with far more at lower precisions – and all the other improvements of the Turing architecture. And if that’s not enough, NVIDIA is also enabling NVLink functionality this time around (it was disabled on Titan V), so workstation users can also scale out for more performance with a second Titan RTX by linking up the two cards.

Meanwhile NVIDIA is also chasing after content creators with this card a bit. Data scientists are still the bread and butter, but given that Turing also made significant investments into ray-tracing, NVIDIA would seem to also be experimenting a bit here to see what kind of a market there is for a high-end yet non-Quadro card for ray tracing. Strictly speaking the Quadro RTX 6000 should be superior here (if only due to drivers & support), however it’s also a good deal more expensive. So it will be interesting to see what kind of a market NVIDIA finds for a $2500 ray tracing card that’s not already served by tried & true Quadro or the much cheaper GeForce.

And while NVIDIA is the first to note that the card is not really for gaming, even the Titan V sold to some gamers out there since Titans use the GeForce driver stack, and I expect much the same here. While the potential 15% performance improvement by no means justifies the greater-than 2x jump in cost, for the crazy rich out there, I do expect the Titan RTX to be a little better suited to gaming than the Titan V was. Whereas the Titan V was an awkward card in terms of game support due to the fact that it was the only Volta architecture card to use the GeForce drivers, Turing is everywhere. So the Titan RTX should behave more like a slightly faster 2080 Ti, without so many of the performance inconsistencies we saw when trying to game on the Titan V.

In terms of design, like its predecessors, the Titan RTX also follows very closely in the stylings of the GeForce family. Notably NVIDIA is using an open-air double-fan cooler here, which NVIDIA switched to on this generation, and not a traditional blower like the Titan V or the current Quadro cards. As we’ve already seen on the GeForce cards this maximizes airflow and brings down temperatures, however it’s a bit more of a mixed bag for the Titan since NVIDIA allows pairing the cards up with NVLink. Open air cooled cards require a little more care here, whereas the blowers are pretty much set-it-and-forget-it in a workstation. However with a TDP of 280W – the highest of the Turing cards and 30W higher than the Titan V – one can see why NVIDIA would be interested in maximizing cooling performance above other priorities. This also means that in theory, the Titan RTX should average slightly higher clockspeeds than the Quadro cards, as it has a bit more cooling and TDP headroom to play with; so at least for now, it likely is the fastest of all the TU102 cards.

NVIDIA's Nickname for the Titan RTX is "T-Rex"

Past that, this is a pretty typical card in terms of NVIDIA design. It gets the same port arrangement as the other Quadro and GeForce cards, with 3x DisplayPort 1.4 outputs, an HDMI 2.0b port, and a USB-C port that supports DP alt mode as well as the VirtualLink standard for VR headsets. Unique to the Titan of course is its golden color scheme, lest it be confused with a GeForce. NVIDIA has nicknamed the card the T-Rex, and I’m fairly sure this is the first time anyone has offered a T-Rex in gold.

In any case, for the data scientists and whoever else wants to get their hands on what’s sure to be NVIDIA’s tensor terror for workstations, be prepared to set aside some cash. $2500 to be precise. Atypically for NVIDIA, this price is actually down a bit from the $3000 Titan V – TU102 is cheaper to make, especially without the HBM2 – but it’s still going to be one expensive card. Meanwhile NVIDIA tells us that we should expect to see the card become available on their website later this month.


83 Comments


iwod - Monday, December 3, 2018 - link
I misread the Ti being a full TU102 previously, and thought $999 would be a bargain in terms of full ~750mm Die Size.

I do really want to see a ~750mm2 7nm GPU. Pushing technology to its limit. Along with maybe HBM3? I doubt GDDR6 could keep up in the 7nm era.
rocky12345 - Monday, December 3, 2018 - link
Ok so another RTX card and the price of $2500US and it looks like Nvidia really gimped the 2080Ti Tensor performance as well when you look at the RTX Titans tensor performance on this chart. The 2080Ti costing like $1200US should have had near the same tensor performance as the RTX Titan.
rocky12345 - Monday, December 3, 2018 - link
Not that I wanted to reply to my own comment but no edited functions here. I wanted to add that then maybe the 2080Ti would not have had such a performance drain on it when using RTX features. I guess only time will tell after the reviews of this way over priced RTX Titan comes out and if it just sails through RTX enabled games or should I say BF 5..:)
Ryan Smith - Monday, December 3, 2018 - link
"I wanted to add that then maybe the 2080Ti would not have had such a performance drain on it when using RTX features"

So the feature they disabled is FP16 operations with FP32 accumulate. It's pretty much only necessary for neural network training. Neural network inference doesn't require that level of precision, which is also why Turing introduced even lower precision modes (INT8/INT4). The limited FP32 accumulate should not be holding back the performance of the GeForce cards when using RT/DLSS features.
CoryS - Monday, December 3, 2018 - link
If they allowed these to push a 10 bit display in video and photo editing apps I'd jump on it. It is so frustrating they artificially limit even their titan GPUs in this respect.
Freakie - Tuesday, December 4, 2018 - link
"Meanwhile NVIDIA is also chasing after content creators with this card a bit." A race that they will never win as long as 10-bit color is disabled on the Titan. Such a stupid limitation because digital media creators don't need any of the other features of Quadro, literally the only feature needed is 10-bit. So I would rather buy a $300 AMD card instead of a $1,000 Nvidia to get that feature. Really shooting themselves in the foot in that market segment.
Gastec - Monday, December 3, 2018 - link
I'm going out to buy two of these RTX Titans: the first one I'll shove into my a...PC and the second one I'm gonna hang it around my neck from a thick 24 karat gold chain 😎
Lord of the Bored - Tuesday, December 4, 2018 - link
At last, someone with a good use for one of these things!
Gmn17 - Monday, December 3, 2018 - link
Too bad FP64 is gimped otherwise I’d buy 1
Gmn17 - Tuesday, December 4, 2018 - link
Will Nvidia now drop the prices on Titan V and Titan Xp?
