Quick Notes: Navi Refresh and RDNA2 Both In 2020, According to AMD

by Ryan Smith on January 28, 2020 9:00 PM EST

Posted in
GPUs
AMD
Radeon
RDNA
RDNA2

89 Comments | Add A Comment

89 Comments

As part of today’s FY2019 earnings call, AMD CEO Dr. Lisa Su had a few words to say about AMD’s future GPU plans – an unexpected nugget of information since we weren’t expecting AMD to reveal anything further at this time.

In short, for this year AMD is planning on both Navi product refreshes as well as parts based on the forthcoming RDNA 2 GPU architecture. To quote Lisa Su:

In 2019, we launched our new architecture in GPUs, it's the RDNA architecture, and that was the Navi based products. You should expect that those will be refreshed in 2020 - and we'll have a next generation RDNA architecture that will be part of our 2020 lineup. So we're pretty excited about that, and we'll talk more about that at our financial analyst day. On the data centre GPU side, you should also expect that we'll have some new products in the second half of this year.

All told, it looks like AMD is setting themselves up for a Vega-like release process, launching new silicon to replace their oldest existing silicon, and minting new products based on existing and/or modestly revised silicon for other parts of their product stack. This would be very similar to what AMD did in 2017, where the company launched Vega at the high-end, and refreshed the rest of their lineup with the Polaris based Radeon RX 500 series.

AMD's GPU Roadmap As Of July 2019

But as always, the devil is in the details. And for that, we’ll have to stay tuned for AMD’s financial analyst day in March.

Source: AMD FY2019 Earnings Call

PRINT THIS ARTICLE

Post Your Comment
Please log in or sign up to comment.

Comments Locked

89 Comments

View All Comments

Yojimbo - Tuesday, January 28, 2020 - link
But the coin has two sides. If RDNA2 offered improvements other than ray tracing and VRS it would make sense to bring those improvements to their entire lineup rather than making a refresh of a lesser architecture. AMD now should have the money to do it and increase their brand and their competitive position. They could strip the ray tracing out of the die and bring it out, just like NVIDIA did with Turing. Perhaps they intend to do that in 2021 and bring out the Navi refresh in 2020.
SaberKOG91 - Tuesday, January 28, 2020 - link
The 16XX series vs the 20XX series is a closer analog to this. The 16XX series was a completely different die design than the 20XX series. It not only doesn't have RT cores, but the Tensor cores were replaced with just FP16 support.
Yojimbo - Wednesday, January 29, 2020 - link
We don't know how tensor cores work so we can't really say that.
SaberKOG91 - Wednesday, January 29, 2020 - link
We can though. If you do a scatter plot of die size vs shader count, the linear fit for the 16XX dies is y=0.164x + 32 (R^2=1.00) and the linear fit of the 20XX dies is y=0.134x + 134 (R^2=0.998). While I find it curious that the 16XX dies actually have a larger shaders than the 20XX dies (optimization for higher clock speeds?), it's clear that there is a distinct difference in their makeup.
Yojimbo - Wednesday, January 29, 2020 - link
They took out the RT cores. You have no idea how large the RT cores are. Memory controllers are also changing over the various GPUs as is the number of NVENC units, I believe. What we do know, however, is that the tensor cores have always increased lock-step with the shaders. If tensor cores are different cores that were taken out of the 16 series cards then, as you pointed out, the die size per shader for the 16 series cards should be larger, not smaller. Plus RT cores seem to scale with shaders as well, so the proportionate constant should be even smaller in the 16 series. So it seems to me that you showed either that the tensor cores exist in the 16 series but not the 20 series or that either your method or data is insufficient.
TheinsanegamerN - Wednesday, January 29, 2020 - link
That a lot of words to say "no, Saber, I dont like your math, you cant know more then me, You're WRONG!!!!11!!!!

If the 1660 line had tensor cores, then deep learning, DLSS, and raytracing would be available on them. If they were firmwar elocked, someone would have opened them up by now.
Yojimbo - Thursday, January 30, 2020 - link
No it's not. I gave a well-reasoned argument.

"If the 1660 line had tensor cores, then deep learning, DLSS, and raytracing would be available on them."

No. Volta has tensor cores but not raytracing. And NVIDIA has good reason to limit tensor core operations from the 1600 line even if it's artificial or mostly artificial. When has anyone ever opened up NVIDIA's or AMD's locked FP64 in the past? Besides, it doesn't have to be firmware locked. They could make minor changes to physically disrupt the alternate data paths used by the tensor cores, if that were indeed how the tensor cores are done. It seems to me, looking at the differences between Pascal and Volta and looking at the die sizes and performance of the AI accelerators compared to Volta, that there is an awful lot shared between the tensor cores and the shaders. What exactly is shared I don't know, but at a certain point it makes a lot more sense to leave them in there instead of doing the massive amount of work to create new SMs that don't include the tensor core circuitry.
SaberKOG91 - Wednesday, January 29, 2020 - link
Actually, I can account for all of the changes to memory controllers, NVlink, and RT cores in just the change in area from the Y-intercept. The baseline area usage of the 20XX is 4x larger than the 16XX which is more than enough to double the width of the GDDR memory controller, add in the HBM controller, add in NVLink, add in more NVENC, and still have room left over for the RT cores. I was convinced by the die shots on day 1 that the RT cores are not actually integrated into the shaders. There's a huge new block of silicon that isn't accounted for by any of the other features that the 20XX supports and isn't present in the 16XX die shots or Volta before that. And the number of RT Cores available can easily be explained by some being disabled or removed for each die size. Having them scale with the number of CUDA cores is not all that surprising as an engineer.

I don't have to demonstrate anything regarding Tensor cores. Nvidia themselves say they replaced them with FP16.

No, the 16XX series doesn't have to have smaller shaders just because features are removed. The clock speeds are a fair bit higher on the 16XX which can easily be accomplished by bigger transistors. We also don't know if the FP16 implementation they did was a separate unit or a reorganization of the shaders to support native FP16 instead of emulated through FP32, which would also add area.
Yojimbo - Thursday, January 30, 2020 - link
I'm not talking about the RT cores. I'm talking about the tensor cores. and yes it is necessaey to demonstrate something regarding them. NVIDIA is talking about their functionality and also NVIDIA is talking from a marketing standpoint. It would be foolish to believe it is giving an engineering explanation of how its secret sauce us accomplished. If the 20xx shaders+ tensor cores take up less die area than the equivalent number of 16xx shaders then why did they bother building the 16xx shaders? Not for power efficiency or performance, we can see that. The clock speeds seem to be in line with pascal clock speeds across the stock. Note that the 2080 super has the fastest clock speeds among all Turing cards, just like the 1080 does for the Pascal cards.

I'm not saying you are definitely wrong. I'm saying you have not said anything that moves me from my original statement. We don't know. My feeling is that the tensor cores are most likely sharing quite a bit with the shaders. From what I remember, in NVIDIA's graphics, they show RT and shader operations happening simultaneously. They show integer and fp32 operations happening simultaneously. But when tensor operations are in progress that is all that is going on. I think that gives us a big clue, along with the idea I got by looking at die sizes of Pascal, Volta, and a very rough consideration of performance and die sizes of AI accelerators compared to Volta.
SaberKOG91 - Thursday, January 30, 2020 - link
I've spent a lot of time digging around for confirmation from developers and other folks, and I will revise a few of my previous statements. First: Tensor Cores and RT Cores don't exist. Tensor Cores are just another way to move data around inside of an SM to perform matrix operations efficiently. Each SM can be used as 8 tensor cores, but does not have entirely separate and dedicated hardware for it. RT cores do exactly the same thing, but rely on a new bit of logic to perform BVH efficiently. Second, the area that I thought was dedicated to RT Cores is in fact the new "high-speed-hub".

I do still believe my analysis that says that the 16XX dies have larger shaders, but I don't see them as being a simple migration from Volta or Pascal, namely because they do have the scheduler changes and SM organization changes made for Turing that aren't present in those two.

So how do I justify it? Well, let's start with RTX. If you use the card without RT, it's a little bit faster than the 10XX series. If you use DLSS instead of other AA, RTX can be even faster than 10XX because it the operations take a lot less time to compute. That's good enough for most people to buy it. If you use RT though, you'll see a 20+% decrease in performance. For the 2080 Ti, 20% is enough to bring it down to between a 1080 and 1080 Ti. For the 2060, it's closer to 30% which brings it down to 1660 or between a 1060 and a 1070. If you had RT on a 1660, it would drop to 1650 performance which is between a 1050 Ti and a 1060. At the high-end of Turing, the cost of RT being enabled can be kind of ignored, but at the low end it would make doing anything other than RT a lot more constrained. That's not good for gaming or for marketing. So I can see why RT would be disabled entirely for the 16XX series.

What about tensor cores? Well I think that's mostly a money grab. Having tensor cores on the 16XX series would mean that their Quadro counterparts would sell a lot better, taking sales away from the much higher margin 20XX equivalent Quadro cards.

Quick Notes: Navi Refresh and RDNA2 Both In 2020, According to AMD

Post Your Comment

89 Comments

View All Comments

Yojimbo - Tuesday, January 28, 2020 - link

SaberKOG91 - Tuesday, January 28, 2020 - link

Yojimbo - Wednesday, January 29, 2020 - link

SaberKOG91 - Wednesday, January 29, 2020 - link

Yojimbo - Wednesday, January 29, 2020 - link

TheinsanegamerN - Wednesday, January 29, 2020 - link

Yojimbo - Thursday, January 30, 2020 - link

SaberKOG91 - Wednesday, January 29, 2020 - link

Yojimbo - Thursday, January 30, 2020 - link

SaberKOG91 - Thursday, January 30, 2020 - link

Log in

Don't have an account? Sign up now