Considering their past performance, I'll wait for the independent benchmarks. Regardless, the SoC wars are looking interesting. The PowerVR Series 6 (Rogue) should be shipping this quarter and is rumored to put out around 210 GFLOPS. We'll have to wait and see on that one as well.
Xbox games often run in 720p, which just goes to show how much faster these SoCs will need to get to drive over four times that resolution with modern graphics engines.
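To put a rough number on the resolution gap, here is a minimal sketch that compares pixel counts against a 720p render target. The two tablet resolutions are illustrative assumptions (typical high-density panels), not figures taken from the thread.

```python
# Pixel-count comparison against a 720p render target.
# The resolutions passed in below are illustrative assumptions.
BASE = (1280, 720)

def pixel_ratio(width, height, base=BASE):
    """Return how many times more pixels (width x height) has than the base target."""
    return (width * height) / (base[0] * base[1])

if __name__ == "__main__":
    for name, (w, h) in {"2048x1536": (2048, 1536), "2560x1600": (2560, 1600)}.items():
        print(f"{name}: {pixel_ratio(w, h):.2f}x the pixels of 720p")
```

Fill rate and bandwidth needs scale roughly with pixel count, so a ratio above 4 means the GPU has to do over four times the per-frame pixel work before any other differences are considered.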
GFLOPS is not the greatest way of measuring performance, but something interesting to keep in mind is that Rogue is still using the same process technology as PowerVR's current stuff but adding more features (DX10 class stuff, essentially). I'm not sure if we noted this elsewhere, but it's possible that Rogue won't be any faster than the current stuff -- it will just support some newer GPU features.
As for Xbox 360, considering it's now over seven years old, I suspect much of the next-gen SoC silicon will at least match it in terms of gaming potential. As usual, it's more about the games than the hardware. Just like on Windows, on Android games need to support a wider array of hardware, so some optimization opportunities get lost.
I wonder what it does to power efficiency, though, considering that Apple's custom SoCs lag on the CPU front. Bear in mind that it'll also take more space on the die itself, not to mention they're pretty infamous for overheating. Maybe that's just the cheap aluminium!
Basically, what I'm trying to say is that on the same process node you mentioned above, PowerVR has little to no room to make this a more efficient and better-performing GPU than the 5x series (relatively speaking). Most of the performance gains would come from adding more silicon rather than optimizing the GPU; that'll perhaps be done later on, just not right now?
Indeed. ROP count, TMU count and memory bandwidth all contribute to performance. Considering the high-resolution displays becoming popular in the mobile space, it'd make sense to scale the number of ROPs and TMUs alongside shader performance. The real telling difference between this and the Xbox 360 in terms of raw hardware is that the Xbox 360 has 10 MB of eDRAM with an incredible amount of bandwidth to it. Hence the Xbox 360 was often marketed to developers with '4x MSAA for free' (though modern deferred rendering engines can't take advantage of it).
ARM SoCs, on the other hand, have had relatively narrow memory buses running at low clock speeds, let alone a sizable eDRAM pool. Only chips like the A5X and A6X have bandwidth figures approaching the Xbox 360's main memory bus (17 GB/s vs. 22.6 GB/s). PowerVR does have a nice trick up its sleeve in tile-based rendering. That conserves a good chunk of memory bandwidth for better efficiency, but I doubt it'd be enough to offset the benefits eDRAM brings to the Xbox 360.
You wouldn't need to run at native res for games on these ultra-high-density displays. Calculating once for every 4-pixel group would suffice. Only use full resolution for GUI elements.
GFLOPS are not everything, though. The PS Vita is capable of coming reasonably close to PS3 graphics, and that's with ARM A9 and SGX543MP4+. The Tegra 4 should pretty much be capable of 360/PS3-level graphics, to the point where most people wouldn't notice a difference anyway. It will also have more RAM to work with, but I'm guessing with less bandwidth.
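For anyone wondering where bandwidth figures like these come from, they are theoretical peaks: bus width times transfer rate. A minimal sketch follows. The bus widths and transfer rates in the example calls are my assumptions (a 128-bit bus in both cases), not numbers from the comment, and the results only roughly line up with the 17 and 22.6 GB/s quoted above.

```python
def peak_bandwidth_gbs(bus_width_bits, mega_transfers_per_sec):
    """Theoretical peak bandwidth in GB/s: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_sec * 1e6 / 1e9

if __name__ == "__main__":
    # Assumed configurations, for illustration only.
    print(f"128-bit @ 1066 MT/s: {peak_bandwidth_gbs(128, 1066):.1f} GB/s")
    print(f"128-bit @ 1400 MT/s: {peak_bandwidth_gbs(128, 1400):.1f} GB/s")
```

A tile-based renderer or an eDRAM pool does not change this peak; it changes how much of the traffic has to cross the main memory bus in the first place.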
Also worth noting that during the event they made very little noise to say it was faster than the iPad. They didn't say it wasn't, but barely made any mention of it. Considering Apple's phone and tablet are already out and Tegra 4 will be on a few dozen devices this year, you'd think that would be a major highlight to point out.
As always with Tegra I assume that when they say it's more powerful, they really mean that if developers target their apps for Tegra then they can have better performance (potentially), not for general programming. Of course, considering my knowledge of programming, I don't know exactly what that means...
Rogue can go pretty high, at a cost in power consumption. So just because it can reach a certain level doesn't mean you'll actually see that in the first GPU models coming out this year. We'll see what Apple uses this spring, but I doubt they'll use anything more than twice as powerful as the A6X after only ~6 months. It might even be only 50% faster.
It would need to be well over a doubling in performance to get up to iPad 4 levels of performance.
We'll see how things are with independent benchmarks. It is bizarre that a company known for its GPUs has been playing catch-up with PowerVR and Exynos for so long now.
The iPad 4 with A6X will have been out for quite some time by the time Tegra 4 devices are available. It'd be nice if Nvidia could come up with something that would more than just barely beat the A6X given a perfectly-balanced workload.
I'm firmly in the Android camp but the (comparatively) lacklustre GPUs we get landed with are a big source of annoyance.
Well there's also the IO sub-system, which tends to be very slow in many devices.
I don't know what Apple devices are like, but IO performance is the biggest problem with my Transformer Prime. A fast CPU doesn't help if it's twiddling its thumbs waiting for data.
Well obviously I want it all; their job is to make them faster at the same or lower power.
To be fair if Tegra 4 goes into phones it should be far faster than the A6 (PowerVR SGX543MP3) in the iPhone 5, I'd just like something available for tablets that'd beat the iPad 4, especially since Android high-res tablets are likely to have more pixels to push (e.g. Nexus 10).
Now that Android has all the hardware acceleration going on, a faster, more powerful GPU will not only be good for games, but will really smooth out the whole OS.
The GPU in the A6X represents the maximum configuration for the SGX554, the same is true for the 543 in the A5X. There's a reason you don't see MP4 554's or 543's in smartphone form factors. These are high die area, high power consumption GPU configurations. There are tradeoffs.
I think the fact that Nvidia was able to match the theoretical FP performance of PowerVR's highest-end options at a lower transistor count and substantially smaller die size is in itself impressive. If we see Tegra 4 in smartphones, which I'm assuming we will, it'll be quite an impressive accomplishment to say the least.
I'm not sure why this is never a part of the conversation when discussing the value of a mobile SoC, but all the performance the SGX 543 and 554 offer also comes at the price of comparatively high transistor counts and high power consumption. The A5X and A6X at load consume a LOT of power, due in large part to their GPUs. Tegra 3 was clearly never meant to compete with an A5X or A6X, which is why I've always been a bit confused when the anti-Tegra 3 bandwagon would draw comparisons.
There is obviously an advantage to a big die that offsets the downside of increased cost, or no one would design a big chip for mobile devices. I think the explanation is that a high transistor count actually allows for low clock speeds (lower voltage) and efficiency. A5X and A6X run at about 250 MHz while Tegra 3 clocks over 500 MHz. If it was really more efficient to run small chips like 554 MP2 at high clocks, then why would we have MP4?
They're completely different architectures, and the optimal clock ranges for a ULP GeForce and SGX554 probably vary. I could turn your question around and ask why Apple wouldn't simply use lower-clocked versions of an A5X or A6X in their iPhones? There's probably some threshold for a given form factor where it becomes more practical to either add more hardware units, or raise clocks on fewer units to achieve a certain performance target. Take the SGX543 MP3 in the A6 for example.
There's probably some benefit for Apple to take the larger die size approach for their tablet form factor, I'm just not sure it's mainly for power efficiency. Perhaps it's simply because they have that additional thermal headroom to work with in an SoC designed specifically for tablets?
I agree that the optimal clock ranges are probably different. If you have to ask, the main downsides of using a big die that I can think of would be cost, production time, and motherboard real estate. SGX544 (basically SGX543) is run at 384 MHz in shipping hardware, so there is no question that it can go higher. The main benefit of a big die is power efficiency, but I can’t find that mentioned in any of Anand’s Apple reviews.
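The wide-and-slow argument can be sketched with the usual first-order rule that dynamic power scales with V² × f. In the snippet below, the unit counts, clocks and especially the voltages are hypothetical values chosen for illustration, not measurements of any real chip, and leakage is ignored entirely.

```python
def relative_dynamic_power(units, voltage, clock_mhz):
    """Relative dynamic power using P ~ units * V^2 * f (same capacitance per unit)."""
    return units * voltage ** 2 * clock_mhz

# Two hypothetical designs with the same throughput (units * clock = 1000):
wide_slow = relative_dynamic_power(units=4, voltage=0.9, clock_mhz=250)
narrow_fast = relative_dynamic_power(units=2, voltage=1.1, clock_mhz=500)

if __name__ == "__main__":
    print(f"wide/slow uses {wide_slow / narrow_fast:.0%} of the narrow/fast power")
```

Under these assumed voltages the wider design does the same work for roughly two thirds of the dynamic power, which is the trade being described: more die area bought in exchange for a lower operating voltage.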
But Tegra does suck compared to Apple hardware from the same generation. :p
1. From what I understand, SGXMP4 is not inherently high power consumption. I think that die area alone is the reason Apple (the only one using SOCs that big in tablets) does not put these chips in phones and that they would be just as power efficient as any other current GPU when run at 1.5W. There are tradeoffs yes, likely cost and motherboard real estate.
2. I disagree that Nvidia using a smaller die at higher clocks to match current PowerVR performance is impressive with no power consumption numbers to back it up. Tegra 4 needs to be efficient at low power to do well in phones and we still don’t know how it will perform in that regard.
3. The only tablets with these hot chips are the iPads which provide a fat battery and better gaming battery life than competing devices. Tegra 3 is also an equally hot chip, if you crunch the gaming power consumption numbers in the AT TF700T review, T33 is drawing about 30% more power than A5X and going much slower.
4. You are right that Tegra 3 is a cheaper small-die part that does not compete with the big, high-cost MP4 chips.
TL;DR: there is good reason for the anti-Tegra bandwagon when cost is not being considered.
Apple doesn't have to care about die size so much, because they don't need to sell to others. Plus they have dedicated fabs for their parts. Nvidia is making one part that is being used for both tablet and phone applications, so they don't have the option like Apple of making two size parts. They went with an optimal compromise on size and what could fit in there. A6X is around 30%+ larger than T4, even accounting for fab size.
Indeed and this is the part that people seem to ignore. Apple SOC's cost much more to produce but all Apple needs to worry about is the total BOM. While NVidia actually needs to sell these SOC's and try to make a profit on the chip itself (which it's failing to do ATM). Ex.: A6 costs $20 to produce but HTC needs to pay nVidia $25 for T3. Apple can make profit much easier since they're making the full device (and can charge up to $100 for 16GB of NAND!). R&D costs of developing such SOC's become a small fraction of the total cost when you produce close to 200 million units a year. The only other company in the same position as Apple is Samsung (actually in a better position since they own the fabs (chip, screen, ...) too) and they're moving to bigger SOC's too.
What happened with AMD then? They had the same advantages as Apple, and all we ever heard was they were a huge financial burden that was not AMD's fault whatsoever...
AMD took a different strategy compared to Apple - smaller/cheaper/crashier....
Thus it has nearly destroyed them.
So one can talk advantages all the time, but those with a very similar set often evolve quite differently, one to wild success and a truly dedicated deep pocketed following (appleheads) willing to provide profits and guard their precious iOS babies for years on end, and another teetering on bankruptcy and constant humiliation and penny pinching with a tightwad fanatic user fan base always trying to squeeze the very last drop of red AMD blood from the rotten turnip while attacking and blaming everyone else in a now failed PR war that has been **** on the industry for years.
nVidia on the other hand was pushed out of the chipset business and instead of publicly making a big sick stinking hate bomb over it and training their fanboys to take up the cause like AMD would have, they continued to excel in their other base nVidia #1 business while they branched out and aimed for the future - pulling in a fine profit and remaining a top dog.
Project Shield has impressed those who used it hands on, so we already know the tegra4 is a coming winner. Just click the tegra2 tegra3 and tegra4 buttons in the build graph there and "astounding!" sounds correct for the architectural size differences.
If one is concerned about power as a few mentioned, that fifth core is the "idler" that is going to make this chip tremendous in power saving features.
Besides shader ALUs are there any details on changes to TMU and ROP count? Are those also scaled 6x or only 2x corresponding to the doubling of Vec4 pixel units rather than the ALUs themselves?
As well, for your chart, could you add a row stating the "shipping clocks" you are using to calculate your GFLOPS results, since that makes things clearer?
Tegra 4 sounds really awesome!! I wish a unified shader architecture were used, but still, if the shipping frequencies are 600MHz or above, the Tegra 4 should be significantly ahead of the A6X, despite the non-unified shader architecture. I think this will be because, even with Tegra 3's non-unified shader architecture, which could only do 12 GFLOPS in total, it wasn't much behind the PowerVR SGX543MP2 in the iPad 2 (with 16 GFLOPS total).
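The GFLOPS figures being thrown around here are just ALU count × FLOPs per ALU per clock × clock speed. A minimal sketch, with loud assumptions: the ALU counts (12 for Tegra 3, 6x that for Tegra 4) and the 2 FLOPs per clock (one multiply-add) are assumed for illustration, and the 600 MHz clock is the hypothetical shipping frequency from the comment above, not a confirmed spec.

```python
def peak_gflops(alus, clock_ghz, flops_per_alu_per_clock=2):
    """Theoretical peak GFLOPS: ALUs * FLOPs per ALU per clock * clock in GHz."""
    return alus * flops_per_alu_per_clock * clock_ghz

if __name__ == "__main__":
    # Assumed configurations, for illustration only.
    print(f"12 ALUs @ 0.5 GHz: {peak_gflops(12, 0.5):.1f} GFLOPS")
    print(f"72 ALUs @ 0.6 GHz: {peak_gflops(72, 0.6):.1f} GFLOPS")
```

This is also why the shipping clock matters so much: the result moves linearly with frequency, and a peak number says nothing about how often the ALUs are actually kept busy.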
I think that the Tegra 4 will be pretty revolutionary for mobile GPU performance, especially in the Android world, because neither the Mali-T604 nor the PowerVR SGX544MP3 in the Exynos 5 Octa stands a chance against the Tegra 4, even with the older architecture. I only wish NVIDIA used a Mobile Development Platform, like Qualcomm does, so we wouldn't have to wait for an actual device to be released to the market to see the SoC's performance. (If only NVIDIA let people try the SHIELD sample device at CES...)
Tegra 4 will increase performance over last gen parts, but it'll still lack full OpenGL ES 3.0 support. Current gen Mali and Adreno GPUs already support OpenGL ES 3.0, and PowerVR GPUs will begin supporting it with Rogue. That's going to suck for game developers, having one major vendor not support the standard, which means it'll be even longer before a high enough percentage of the market can handle OGLES3.0 that developers are comfortable requiring it, instead of spending extra time, effort, and money to write extra code paths.
If nVidia can get the Tegra 4 out before the next iPad, it'll likely be top tier for a bit... but if they're too slow, newer GPUs will enter the market from all the other vendors (including the first PowerVR Series 6 GPUs) which'll make Tegra 4 look like nothing special, and maybe even look anemic in comparison. So far, nVidia's mobile hype hasn't lived up to their marketing... hopefully the Tegra 4's GPU will be nothing like their older GPUs.
So as per Apple's terse standard method pointed out here, you're saying A7X in retail apple devices will hit the markets less than 10 days after the 1st tegra4 devices... LOL
Whatever... more bloviation from the superspeculator with zero facts and zero history on their side.
A snide smear is not the future applefan or nVidia hater.
Since Nvidia's dull CES event, horrible Tegra 4 intro, perplexing decision to create SHIELD, and the leaked benchmarks, I've been really down on Tegra 4. If this news is true, then I'll probably change my mind. Nvidia cannot afford to come out with a chip slower than what the competition has had out for 3, 6, even 9 months already.
Being on 28nm and using TSMC's lowest-leakage process, perf/watt should be considerably better than Samsung's Exynos 5. I've also come to the conclusion that SHIELD is using Tegra 4 chips that will end up not making the cut for tablets and phones with respect to clock speeds / voltages / TDP, essentially to maximize the amount of money they can squeeze out of each Tegra 4 wafer.
Well I do. Android is all about choice; if you want killer 3D you should be able to have it. If you don't need it then you can buy a device more to your taste.
Worse yet, the Apple fanboys don't play 3D, they might play a 1932xx oh sorry 1999 Syndicate clone that ran on a clamshell...
Or they bootcamp and have suckage hardware to deal with for their dollars spent, usually something the rabid AMD fanboy base would scream about - where are they ? Oh that's right, if it's nVidia getting bashed that's them doing it - so no using their 1,000 % always overstated talking point price/perf for now....
What the appleheads do is bask in the glory, true or not, that "they could" play some 3D if they "wanted to". LOL
This site did a tegra3 vs the then current Apple gaming comparison and apple did not win - it lost in fact. Apple looked worse, didn't perform any better, had fewer features active -
Apple lost to tegra3 in gaming, right here at this site.. I know the article is still up as usual.
There are a variety of claims around the Internet that Tegra 4 supports Direct3D 11 and OpenGL 4. Those have 6 and 5 programmable pipeline stages, respectively. (The sixth, compute shaders, was added by OpenGL in 4.3.) Direct3D has the whole feature_level business that lets companies claim Direct3D 11 compliance while not supporting anything not already in Direct3D 9.0c, but OpenGL doesn't do that.
If it doesn't have unified shaders, then where do the other pipeline stages get executed? Does everything except fragment shaders run on vertex shaders, since any position computations will desperately need the 32-bit precision to avoid massive graphical artifacting? Or are the more recent pipeline stages simply not supported by Tegra 4 at all, and the claims of OpenGL 4 compliance simply wrong?
And now you're claiming that Tegra 4 doesn't even fully support OpenGL ES 3.0, let alone the full OpenGL? OpenGL ES 3.0 still only has two programmable stages, too.
The Direct3D practice you describe is actually gone since DirectX 10.
On DX9 you could indeed claim your card was DX9-capable even though you only supported a single one of the DX9 features.
Since DX10 MS -thankfully- changed this and you are only allowed to put the DX10/11 stamp onto your hardware if you support the full DX10/11 feature set.
Are you sure about that? I've seen mobile GPU's say they support DirectX11.1 but only Direct3D 9.3 feature set. So they might support other stuff in DirectX up to 11.1, but as far as graphics and gaming go, they'll only support the feature set up to 9.3.
I mean, do you actually expect these chips to support tessellation? Even if they did, it would be useless for mobile devices, as they would use too much power. That's why the OpenGL ES standard exists for mobile, separated from the full OpenGL. You can't use the full features yet without consuming a lot of power, and the same is true for DirectX.
Tessellation done right is actually a huge performance optimization.
One classical problem in 3D graphics is how many vertices to use in your models for objects that are supposed to appear curved. Use few vertices and they appear horribly blocky up close. Use a ton of vertices and it's a huge performance hit when you have to process a ton of vertices for objects that are far enough away that a large number of vertices is complete overkill.
Ideally, you'd like to use few vertices for objects that are far away (and most of the time, most objects are far away), many vertices for the few objects that are up close, and interpolate smoothly between them so that you use not much more than the minimum number of vertices needed to make everything look smooth, regardless of how far away it is. That's exactly what tessellation does.
There is a minimum performance level needed for using tessellation to make sense, though. If you're forced to turn tessellation down far enough that there are obvious jumps in the model whenever the tessellation level changes, it looks terrible. The quad core version of AMD Temash will have plenty of GPU performance for heavy use of tessellation to make sense, and the dual core version might, too. Nvidia Tegra 4 might well have had that level of GPU performance, too, if it supported tessellation.
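The distance-based idea above can be sketched in a few lines. This is only a CPU-side illustration of the kind of calculation a tessellation stage would make per patch, not how any particular GPU or API implements it; the function name, the inverse-distance falloff and the cap of 64 are all assumptions chosen for the example.

```python
def tess_factor(distance, near=1.0, max_factor=64.0):
    """Pick a subdivision factor that falls off with distance, clamped to [1, max_factor]."""
    if distance <= 0:
        return max_factor
    factor = max_factor * near / distance
    return max(1.0, min(max_factor, factor))

if __name__ == "__main__":
    for d in (0.5, 1.0, 2.0, 10.0, 1000.0):
        print(f"distance {d:>7}: factor {tess_factor(d):.1f}")
```

Because the factor changes continuously with distance, the model can gain or lose detail gradually instead of jumping between a handful of fixed meshes, which is what avoids the visible popping mentioned above.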
-----
But you can live without tessellation, and just accept that 3D models will be blocky up close. I'm actually more concerned about missing geometry shaders.
Geometry shaders let you see an entire primitive at once in a programmable shader stage, rather than only one vertex or pixel at a time. They also let you create new primitives or discard old ones on the GPU rather than having to process them all on the CPU and then send them to the GPU. Both of those give you a lot more versatility, and allow you to do a lot of work on the GPU that would otherwise have to be done on the CPU.
And that, I think, should be a huge deal for tablets--more so than for desktops, even. Taking GPU-friendly work and insisting that it has to be done on the CPU instead is not a recipe for good energy efficiency. In a desktop, you may have enough brute force CPU power to make things work even without geometry shaders, albeit inefficiently, but tablets and phones tend not to have a ton of brute force CPU power available.
nVidia has a history of seriously overhyping the performance of their products. They did it with Tegra 2 and they did it with Tegra 3, so let's wait for some independent testing before believing them this time.
But they've gone public with a statement that Tegra 4 will beat A6X in GLBenchmark, they will look very douchey if they fail.
Of course there is the remote possibility that they will benchmark a 1080p device against the iPad 4 in Egypt HD and claim victory, even though they lose at Egypt 1080p offscreen.
LOL - anti-nVidia fanboys are the ones who constantly entertain their own distorted world view hype...
A5X vs. Tegra 3 in the Real World
" In situations where a game is available in both the iOS app store as well as NVIDIA's Tegra Zone, NVIDIA generally delivers a comparable gaming experience to what you get on the iPad. In some cases you even get improved visual quality as well. The iPad's GPU performance advantage just isn't evident in those cases—likely because the bulk of iOS devices out there still use far weaker GPUs. That's effectively a software answer to a hardware challenge, but it's true.
NVIDIA isn't completely vindicated however. " by ANANDTECH
You umm... nVidia haterz were saying what ? Oh that's right, you got it 100% incorrect but why not continue repeating lies in the hopes that many peer group fools will believe it too....
On the CPU side it's 4 A15s, all of which are more powerful than Swift, only there's 2x as many of them, and they're clocked faster...that's a no brainer there.
On the GPU side, if Nvidia is saying the segmented architecture makes sense, I believe them. They've got well over half a decade's experience making high end unified designs...they could do so easier than anyone else could, so that they're not means they're almost certainly right about that.
And to the "I'll wait for benchmarks" people, well sure, but just look at the physical size of this versus Tegra 3.
Like...well, every version of Tegra prior to launch, I'm fairly excited about this (save for the issue that I don't know whether I'll ever own anything that runs on this!)
Show me where I can buy a Tegra 4-based device today. Oh, right... they don't exist. The current iPad, meanwhile, has been on the market since October. By the time there's a Tegra 4 tablet, we may be hearing murmurs of an iPad with an A7X inside, and NVIDIA will at best have had a few months' lead before it's eclipsed again. If, of course, it leads at all.
There's no doubt that the Tegra 4 is a big step forward for NVIDIA, but part of why Apple dominates tablets is because it knows how to walk the walk: the interval between announcement and shipping is usually 10 days, and it knows this because it's building the finished hardware. It doesn't have to throw a processor into the wilderness and pray someone decides to use it.
I wonder if GLBenchmark 3.0 with support for OpenGL ES 3.0 will arrive soon. But either way you should at least try to get the 3Dmark11 benchmark, which should come out soon (no OpenGL ES 3.0 support, unfortunately, though), Anand, and test the new GPU's on these new benchmarks.
Oh, that's right, I can't yet. I love how someone posts specs on a component and compares it to a shipping product that has been out for months. Too little too late AGAIN NVidia. By the time this is in a shipping product, Apple will have its next-gen product out already.
59 Comments
Scannall - Monday, January 14, 2013 - link
Considering their past performance, I'll wait for the independent benchmarks. Regardless, the SoC wars are looking interesting. The PowerVR Series 6 (Rogue) should be shipping this quarter and is rumored to put out around 210 GFLOPS. We'll have to wait and see on that one as well.
Kevin G - Monday, January 14, 2013 - link
210 GFLOPS for Rogue? That's Xbox 360-class performance; the Xbox 360 is at 240 GFLOPS for comparison.
Zink - Monday, January 14, 2013 - link
Xbox games often run in 720p, which just goes to show how much faster these SoCs will need to get to drive over four times that resolution with modern graphics engines.
JarredWalton - Monday, January 14, 2013 - link
GFLOPS is not the greatest way of measuring performance, but something interesting to keep in mind is that Rogue is still using the same process technology as PowerVR's current stuff but adding more features (DX10 class stuff, essentially). I'm not sure if we noted this elsewhere, but it's possible that Rogue won't be any faster than the current stuff -- it will just support some newer GPU features.
As for Xbox 360, considering it's now over seven years old, I suspect much of the next-gen SoC silicon will at least match it in terms of gaming potential. As usual, it's more about the games than the hardware. Just like on Windows, on Android games need to support a wider array of hardware, so some optimization opportunities get lost.
blanarahul - Tuesday, January 15, 2013 - link
Agreed. BTW, ARM's new GPUs (T604, T624, T628, T658, T678) use a unified shader design, right?
alex3run - Saturday, January 26, 2013 - link
Yes, Mali-T6xx uses a unified shader design.
R0H1T - Tuesday, January 15, 2013 - link
I wonder what it does to power efficiency, though, considering that Apple's custom SoCs lag on the CPU front. Bear in mind that it'll also take more space on the die itself, not to mention they're pretty infamous for overheating. Maybe that's just the cheap aluminium!
Basically, what I'm trying to say is that on the same process node you mentioned above, PowerVR has little to no room to make this a more efficient and better-performing GPU than the 5x series (relatively speaking). Most of the performance gains would come from adding more silicon rather than optimizing the GPU; that'll perhaps be done later on, just not right now?
Kevin G - Tuesday, January 15, 2013 - link
Indeed. ROP count, TMU count and memory bandwidth all contribute to performance. Considering the high-resolution displays becoming popular in the mobile space, it'd make sense to scale the number of ROPs and TMUs alongside shader performance. The real telling difference between this and the Xbox 360 in terms of raw hardware is that the Xbox 360 has 10 MB of eDRAM with an incredible amount of bandwidth to it. Hence the Xbox 360 was often marketed to developers with '4x MSAA for free' (though modern deferred rendering engines can't take advantage of it).
ARM SoCs, on the other hand, have had relatively narrow memory buses running at low clock speeds, let alone a sizable eDRAM pool. Only chips like the A5X and A6X have bandwidth figures approaching the Xbox 360's main memory bus (17 GB/s vs. 22.6 GB/s). PowerVR does have a nice trick up its sleeve in tile-based rendering. That conserves a good chunk of memory bandwidth for better efficiency, but I doubt it'd be enough to offset the benefits eDRAM brings to the Xbox 360.
MrSpadge - Tuesday, January 15, 2013 - link
You wouldn't need to run at native res for games on these ultra-high-density displays. Calculating once for every 4-pixel group would suffice. Only use full resolution for GUI elements.
piroroadkill - Tuesday, January 15, 2013 - link
Yeah, exactly.
It'd be nice to have a 720p/1080p toggle to see if people can actually see the difference at normal distances on those screens.
B3an - Monday, January 14, 2013 - link
GFLOPS are not everything, though. The PS Vita is capable of coming reasonably close to PS3 graphics, and that's with ARM A9 and SGX543MP4+. The Tegra 4 should pretty much be capable of 360/PS3-level graphics, to the point where most people wouldn't notice a difference anyway. It will also have more RAM to work with, but I'm guessing with less bandwidth.
Jamezrp - Monday, January 14, 2013 - link
Also worth noting that during the event they made very little noise to say it was faster than the iPad. They didn't say it wasn't, but barely made any mention of it. Considering Apple's phone and tablet are already out and Tegra 4 will be on a few dozen devices this year, you'd think that would be a major highlight to point out.
As always with Tegra I assume that when they say it's more powerful, they really mean that if developers target their apps for Tegra then they can have better performance (potentially), not for general programming. Of course, considering my knowledge of programming, I don't know exactly what that means...
Krysto - Monday, January 14, 2013 - link
Rogue can go pretty high, at a cost in power consumption. So just because it can reach a certain level doesn't mean you'll actually see that in the first GPU models coming out this year. We'll see what Apple uses this spring, but I doubt they'll use anything more than twice as powerful as the A6X after only ~6 months. It might even be only 50% faster.
powerarmour - Monday, January 14, 2013 - link
Well, I'd be more surprised if it wasn't tbh!
hmaarrfk - Monday, January 14, 2013 - link
Yea seriously. If Apple wasn't as secretive as they are, they would have released their demo of A6X last year around this time.....
Mumrik - Tuesday, January 15, 2013 - link
It would be really really weird if nV wasn't able to claim better performance than the iPad 4 at this point.
KoolAidMan1 - Tuesday, January 15, 2013 - link
It would need to be well over a doubling in performance to get up to iPad 4 levels of performance.
We'll see how things are with independent benchmarks. It is bizarre that a company known for its GPUs has been playing catch-up with PowerVR and Exynos for so long now.
BugblatterIII - Monday, January 14, 2013 - link
The iPad 4 with A6X will have been out for quite some time by the time Tegra 4 devices are available. It'd be nice if Nvidia could come up with something that would more than just barely beat the A6X given a perfectly-balanced workload.
I'm firmly in the Android camp but the (comparatively) lacklustre GPUs we get landed with are a big source of annoyance.
menting - Monday, January 14, 2013 - link
I'm all for faster GPU speeds, but not if the tradeoff is battery life.
People that play stressful 3D games are still in the minority.
quiksilvr - Monday, January 14, 2013 - link
Android chips (generally) have faster CPUs but slower GPUs when compared to the Apple chips.
So in other words, despite iOS looking smoother (and running games more smoothly), load times and execution are generally faster on Android.
BugblatterIII - Monday, January 14, 2013 - link
Well there's also the IO sub-system, which tends to be very slow in many devices.
I don't know what Apple devices are like, but IO performance is the biggest problem with my Transformer Prime. A fast CPU doesn't help if it's twiddling its thumbs waiting for data.
BugblatterIII - Monday, January 14, 2013 - link
Well obviously I want it all; their job is to make them faster at the same or lower power.
To be fair, if Tegra 4 goes into phones it should be far faster than the A6 (PowerVR SGX543MP3) in the iPhone 5. I'd just like something available for tablets that'd beat the iPad 4, especially since Android high-res tablets are likely to have more pixels to push (e.g. Nexus 10).
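For a rough sense of what "more pixels to push" means here, a minimal sketch comparing native panel pixel counts. The resolutions are the published panel sizes for these devices, not figures quoted in this thread, and raw pixel count is only a first-order proxy for GPU load (the names `PANELS`, `pixel_count` and `pixel_ratio` are made up for illustration).

```python
# Native panel resolutions as (width, height) in pixels.
# These are the published panel sizes, not numbers taken from the thread.
PANELS = {
    "iPad 4": (2048, 1536),
    "Nexus 10": (2560, 1600),
    "720p": (1280, 720),
}


def pixel_count(name):
    """Total pixels the GPU has to fill per frame at native resolution."""
    width, height = PANELS[name]
    return width * height


def pixel_ratio(a, b):
    """How many times more pixels panel `a` has than panel `b`."""
    return pixel_count(a) / pixel_count(b)


if __name__ == "__main__":
    for name in PANELS:
        print(f"{name}: {pixel_count(name):,} pixels")
    print(f"Nexus 10 vs iPad 4: {pixel_ratio('Nexus 10', 'iPad 4'):.2f}x")
```

On these numbers the Nexus 10 has about 30% more pixels than the iPad 4, so a GPU that merely ties the A6X would likely still fall behind at native resolution on such a tablet.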
bpear96 - Saturday, January 26, 2013 - link
Now that Android has all the hardware acceleration going on, a faster, more powerful GPU will not only be good for games, but will really smooth out the whole OS.
dragonsqrrl - Monday, January 14, 2013 - link
The GPU in the A6X represents the maximum configuration for the SGX554; the same is true for the 543 in the A5X. There's a reason you don't see MP4 554s or 543s in smartphone form factors. These are high die area, high power consumption GPU configurations. There are tradeoffs.
I think the fact that Nvidia was able to match the theoretical FP performance of PowerVR's highest-end options at a lower transistor count and substantially lower die size is in itself impressive. If we see Tegra 4 in smartphones, which I'm assuming we will, it'll be quite an impressive accomplishment to say the least.
I'm not sure why this is never a part of the conversation when discussing the value of a mobile SOC, but all the performance the SGX 543 and 554 offer also comes at the price of comparatively high transistor counts and high power consumption. The A5X and A6X at load consume a LOT of power, due in large part to their GPUs. Tegra 3 was clearly never meant to compete with an A5X or A6X, which is why I've always been a bit confused when the anti-Tegra 3 bandwagon would draw comparisons.
Zink - Monday, January 14, 2013 - link
There is obviously an advantage to a big die that offsets the downside of increased cost, or no one would design a big chip for mobile devices. I think the explanation is that a high transistor count actually allows for low clock speeds (lower voltage) and better efficiency. A5X and A6X run at about 250 MHz while Tegra 3 clocks over 500 MHz. If it was really more efficient to run small chips like 554 MP2 at high clocks then why would we have MP4?
dragonsqrrl - Monday, January 14, 2013 - link
They're completely different architectures, and the optimal clock ranges for a ULP GeForce and SGX554 probably vary. I could turn your question around and ask why Apple wouldn't simply use lower clocked versions of an A5X or A6X in their iPhones? There's probably some threshold for a given form factor where it becomes more practical to either add more hardware units, or raise clocks on fewer units to achieve a certain performance target. Take the SGX543 MP3 in the A6 for example.
There's probably some benefit for Apple to take the larger die size approach for their tablet form factor, I'm just not sure it's mainly for power efficiency. Perhaps it's simply because they have that additional thermal headroom to work with in an SOC designed specifically for tablets?
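The wide-and-slow versus narrow-and-fast argument in this exchange can be sketched with the textbook dynamic power relation, P proportional to C * V^2 * f. Everything numeric below is an assumption for illustration: the unit counts, the voltages and the capacitance constant are hypothetical, not measured values for any A5X, A6X or Tegra part, and the model ignores leakage, which grows with die area.

```python
def dynamic_power(units, volts, mhz, cap=1.0):
    """Relative dynamic power in arbitrary units: units * C * V^2 * f."""
    return units * cap * volts ** 2 * mhz


def throughput(units, mhz):
    """Relative peak throughput: execution units times clock."""
    return units * mhz


# Hypothetical configurations with the same peak throughput.
# Voltages are made-up values; lower clocks usually allow lower voltage.
wide_slow = {"units": 4, "volts": 0.9, "mhz": 250}
narrow_fast = {"units": 2, "volts": 1.1, "mhz": 500}

if __name__ == "__main__":
    for label, cfg in (("wide/slow", wide_slow), ("narrow/fast", narrow_fast)):
        perf = throughput(cfg["units"], cfg["mhz"])
        power = dynamic_power(**cfg)
        print(f"{label}: throughput={perf}, power={power:.0f}, "
              f"perf/power={perf / power:.3f}")
```

Under these assumed voltages the wide configuration draws roughly two thirds of the dynamic power for the same peak throughput. Whether that holds for real parts depends on the actual voltage/frequency curve and on leakage, which is why neither side of the argument can be settled from clock speeds alone.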
Zink - Monday, January 14, 2013 - link
I agree that the optimal clock ranges are probably different. If you have to ask, the main downsides of using a big die that I can think of would be cost, production time, and motherboard real estate. SGX544 (basically SGX543) is run at 384 MHz in shipping hardware so there is no question that it can go higher. The main benefit of a big die is power efficiency but I can’t find that mentioned in any of Anand’s Apple reviews.
But Tegra does suck compared to Apple hardware from the same generation. :p
Zink - Monday, January 14, 2013 - link
1. From what I understand, SGXMP4 is not inherently high power consumption. I think that die area alone is the reason Apple (the only one using SOCs that big in tablets) does not put these chips in phones and that they would be just as power efficient as any other current GPU when run at 1.5W. There are tradeoffs yes, likely cost and motherboard real estate.
2. I disagree that Nvidia using a smaller die at higher clocks to match current PowerVR performance is impressive with no power consumption numbers to back it up. Tegra 4 needs to be efficient at low power to do well in phones and we still don’t know how it will perform in that regard.
3. The only tablets with these hot chips are the iPads which provide a fat battery and better gaming battery life than competing devices. Tegra 3 is also an equally hot chip, if you crunch the gaming power consumption numbers in the AT TF700T review, T33 is drawing about 30% more power than A5X and going much slower.
4. You are right that Tegra 3 is a cheaper small die part that does not compete with the big high cost MP4 chips.
TL;DR: there is good reason for the anti-Tegra bandwagon when cost is not being considered.
fm123 - Tuesday, January 15, 2013 - link
Apple doesn't have to care about die size so much, because they don't need to sell to others. Plus they have dedicated fabs for their parts. Nvidia is making one part that is being used for both tablet and phone applications, so they don't have the option like Apple of making two different-sized parts. They went with an optimal compromise on size and what could fit in there. A6X is around 30%+ larger than T4, even accounting for fab size.
milli - Tuesday, January 15, 2013 - link
Indeed, and this is the part that people seem to ignore. Apple SOCs cost much more to produce but all Apple needs to worry about is the total BOM, while NVidia actually needs to sell these SOCs and try to make a profit on the chip itself (which it's failing to do ATM). Ex.: A6 costs $20 to produce but HTC needs to pay nVidia $25 for T3.
Apple can make profit much easier since they're making the full device (and can charge up to $100 for 16GB of NAND!). R&D costs of developing such SOCs become a small fraction of the total cost when you produce close to 200 million units a year.
The only other company in the same position as Apple is Samsung (actually in a better position since they own the fabs (chip, screen, ...) too) and they're moving to bigger SOC's too.
CeriseCogburn - Sunday, January 20, 2013 - link
What happened with AMD then? They had the same advantages as Apple, and all we ever heard was they were a huge financial burden that was not AMD's fault whatsoever...
AMD took a different strategy compared to Apple - smaller/cheaper/crashier....
Thus it has nearly destroyed them.
So one can talk advantages all the time, but those with a very similar set often evolve quite differently, one to wild success and a truly dedicated deep pocketed following (appleheads) willing to provide profits and guard their precious iOS babies for years on end, and another teetering on bankruptcy and constant humiliation and penny pinching with a tightwad fanatic user fan base always trying to squeeze the very last drop of red AMD blood from the rotten turnip while attacking and blaming everyone else in a now failed PR war that has been **** on the industry for years.
nVidia on the other hand was pushed out of the chipset business and instead of publicly making a big sick stinking hate bomb over it and training their fanboys to take up the cause like AMD would have, they continued to excel in their other base nVidia #1 business while they branched out and aimed for the future - pulling in a fine profit and remaining a top dog.
Project Shield has impressed those who used it hands on, so we already know the tegra4 is a coming winner.
Just click the tegra2 tegra3 and tegra4 buttons in the build graph there and "astounding!" sounds correct for the architectural size differences.
If one is concerned about power as a few mentioned, that fifth core is the "idler" that is going to make this chip tremendous in power saving features.
I want one yesterday.
EnzoFX - Monday, January 14, 2013 - link
Exactly, let alone the advantages Apple has with vertical integration, i.e. optimized performance.
ltcommanderdata - Monday, January 14, 2013 - link
Besides shader ALUs, are there any details on changes to TMU and ROP count? Are those also scaled 6x, or only 2x, corresponding to the doubling of Vec4 pixel units rather than the ALUs themselves?
As well, for your chart, could you add a row stating the "shipping clocks" you are using to calculate your GFLOPS results, since that makes things more clear?
Brunelleschi - Monday, January 14, 2013 - link
Tegra 4 sounds really awesome!! I wish a unified shader architecture were used, but still, if the shipping frequencies are 600MHz or above, the Tegra 4 should be significantly ahead of the A6X, despite the non-unified shader architecture. I think this will be because, even with Tegra 3's non-unified shader architecture, which could only do 12 GFLOPS in total, it wasn't much behind the PowerVR SGX543MP2 in the iPad 2 (with 16 GFLOPS total).
I think that the Tegra 4 will be pretty revolutionary for mobile GPU performance, especially in the Android world, because neither the Mali-T604 nor the PowerVR SGX544MP3 in the Exynos 5 Octa stands a chance against the Tegra 4, even with the older architecture. I only wish NVIDIA used a Mobile Development Platform, like Qualcomm does, so we wouldn't have to wait for an actual device to be released to the market to see the SoC's performance. (If only NVIDIA let people try the SHIELD sample device at CES...)
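Since the GFLOPS figures above depend entirely on the assumed shipping clock, here is a minimal sketch of how such peak numbers are usually derived. The assumptions are stated loudly: each ALU is counted as one multiply-add (2 FLOPs) per clock, Tegra 3 is taken as 12 ALUs at 500 MHz to reproduce the 12 GFLOPS quoted in this thread, and the 72-ALU figure for Tegra 4 is simply the "6x" scaling mentioned in the comments applied to that, not a confirmed spec.

```python
def peak_gflops(alus, clock_mhz, flops_per_alu_per_clock=2):
    """Theoretical peak GFLOPS: ALUs * FLOPs per ALU per clock * clock.

    Counting a multiply-add as 2 FLOPs is a convention, not a measurement.
    """
    return alus * flops_per_alu_per_clock * clock_mhz / 1000.0


if __name__ == "__main__":
    # Assumed configuration chosen to match the 12 GFLOPS quoted in the thread.
    print("Tegra 3 (12 ALUs @ 500 MHz):", peak_gflops(12, 500))  # 12.0
    # Hypothetical: 6x the ALUs at the 600 MHz floor the comment mentions.
    print("Tegra 4 (72 ALUs @ 600 MHz):", peak_gflops(72, 600))  # 86.4
```

Peak GFLOPS says nothing about TMU/ROP throughput or memory bandwidth, so two GPUs with the same number can still perform very differently in games.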
KitsuneKnight - Monday, January 14, 2013 - link
Tegra 4 will increase performance over last gen parts, but it'll still lack full OpenGL ES 3.0 support. Current gen Mali and Adreno GPUs already support OpenGL ES 3.0, and PowerVR GPUs will begin supporting it with Rogue. That's going to suck for game developers, having one major vendor not support the standard, which means it'll be even longer before a high enough percentage of the market can handle OGLES3.0 that developers are comfortable requiring it, instead of spending extra time, effort, and money to write extra code paths.
If nVidia can get the Tegra 4 out before the next iPad, it'll likely be top tier for a bit... but if they're too slow, newer GPUs will enter the market from all the other vendors (including the first PowerVR Series 6 GPUs) which'll make Tegra 4 look like nothing special, and maybe even look anemic in comparison. So far, nVidia's mobile hype hasn't lived up to their marketing... hopefully the Tegra 4's GPU will be nothing like their older GPUs.
DERSS - Tuesday, January 15, 2013 - link
A7X will probably debut before actual Tegra 4 devices ship, though.
CeriseCogburn - Sunday, January 20, 2013 - link
So as per Apple's terse standard method pointed out here, you're saying A7X in retail apple devices will hit the markets less than 10 days after the 1st tegra4 devices...LOL
Whatever... more bloviation from the superspeculator with zero facts and zero history on their side.
A snide smear is not the future applefan or nVidia hater.
tviceman - Monday, January 14, 2013 - link
Since Nvidia's dull CES event, horrible Tegra 4 intro, perplexing decision to create SHIELD, and the leaked benchmarks, I've been really down on Tegra 4. If this news is true, then I'll probably change my mind. Nvidia cannot afford to come out with a chip slower than what the competition has had out for 3, 6, even 9 months already.
Being on 28nm and using TSMC's lowest leakage process, perf/watt should be considerably better than Samsung's Exynos 5. I've also come to the conclusion that SHIELD is using Tegra 4 chips that will end up not making the cut for tablets and phones with respect to clockspeeds / voltages / TDP, essentially to maximize the amount of money they can squeeze out of each Tegra 4 wafer.
joos2000 - Monday, January 14, 2013 - link
I would like to see an added row showing GFLOPS/Watt.
Ryan Smith - Monday, January 14, 2013 - link
GFLOPS/watt varies non-linearly with clockspeed, so it would be extremely product specific at best.
alexvoda - Monday, January 14, 2013 - link
Adding the Apple A6 and the Samsung Exynos 5 Dual would be nice for comparison.
StormyParis - Monday, January 14, 2013 - link
Do we know what percentage of tablet users actually play demanding 3D games? Out of 5-6 people off the top of my head, no one does.
BugblatterIII - Monday, January 14, 2013 - link
Well I do. Android is all about choice; if you want killer 3D you should be able to have it. If you don't need it then you can buy a device more to your taste.
CeriseCogburn - Sunday, January 20, 2013 - link
Worse yet, the Apple fanboys don't play 3D, they might play a 1932xx oh sorry 1999 Syndicate clone that ran on a clamshell...
Or they bootcamp and have suckage hardware to deal with for their dollars spent, usually something the rabid AMD fanboy base would scream about - where are they ? Oh that's right, if it's nVidia getting bashed that's them doing it - so no using their 1,000 % always overstated talking point price/perf for now....
What the appleheads do is bask in the glory, true or not, that "they could" play some 3D if they "wanted to". LOL
This site did a tegra3 vs the then current Apple gaming comparison and apple did not win - it lost in fact.
Apple looked worse, didn't perform any better, had fewer features active -
Apple lost to tegra3 in gaming, right here at this site.. I know the article is still up as usual.
gbanfalvi - Wednesday, January 23, 2013 - link
You're not talking about this one then, are you :D
http://www.anandtech.com/show/5163/asus-eee-pad-tr...
Quizzical - Monday, January 14, 2013 - link
There are a variety of claims around the Internet that Tegra 4 supports Direct3D 11 and OpenGL 4. Those have 6 and 5 programmable pipeline stages, respectively. (The sixth, compute shaders, was added by OpenGL in 4.3.) Direct3D has the whole feature_level business that lets companies claim Direct3D 11 compliance while not supporting anything not already in Direct3D 9.0c, but OpenGL doesn't do that.
If it doesn't have unified shaders, then where do the other pipeline stages get executed? Does everything except fragment shaders run on vertex shaders, since any position computations will desperately need the 32-bit precision to avoid massive graphical artifacting? Or are the more recent pipeline stages simply not supported by Tegra 4 at all, and the claims of OpenGL 4 compliance simply wrong?
And now you're claiming that Tegra 4 doesn't even fully support OpenGL ES 3.0, let alone the full OpenGL? OpenGL ES 3.0 still only has two programmable stages, too.
dgschrei - Tuesday, January 15, 2013 - link
The Direct3D practice you describe is actually gone since DirectX 10.
On DX9 you could indeed claim your card was DX9-capable even though you only supported a single one of the DX9 features.
Since DX10 MS -thankfully- changed this and you are only allowed to put the DX10/11 stamp onto your hardware if you support the full DX10/11 feature set.
Krysto - Wednesday, January 16, 2013 - link
Are you sure about that? I've seen mobile GPUs say they support DirectX 11.1 but only the Direct3D 9.3 feature set. So they might support other stuff in DirectX up to 11.1, but as far as graphics and gaming go, they'll only support the feature set up to 9.3.
I mean, do you actually expect these chips to support tessellation? Even if they did, it would be useless for mobile devices, as they would use too much power. That's why the OpenGL ES standard exists for mobile, separated from the full OpenGL. You can't use the full features yet without consuming a lot of power, and the same is true for DirectX.
Quizzical - Wednesday, January 16, 2013 - link
Tessellation done right is actually a huge performance optimization.
One classical problem in 3D graphics is how many vertices to use in your models for objects that are supposed to appear curved. Use few vertices and they appear horribly blocky up close. Use a ton of vertices and it's a huge performance hit when you have to process a ton of vertices for objects that are far enough away that a large number of vertices is complete overkill.
Ideally, you'd like to use few vertices for objects that are far away (and most of the time, most objects are far away), many vertices for the few objects that are up close, and interpolate smoothly between them to use not much more than the minimum number of vertices to make everything look smooth, regardless of how far away they are. That's exactly what tessellation does.
There is a minimum performance level needed for using tessellation to make sense, though. If you're forced to turn tessellation down far enough that there are obvious jumps in the model whenever the tessellation level changes, it looks terrible. The quad core version of AMD Temash will have plenty of GPU performance for heavy use of tessellation to make sense, and the dual core version might, too. Nvidia Tegra 4 might well have had that level of GPU performance, too, if it supported tessellation.
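The distance-based scaling described above can be sketched in a few lines. This is only an illustration of the idea, run on the CPU in Python rather than in a real tessellation control/hull shader; the function name `tess_level` and all distances and level limits are hypothetical values picked for the example.

```python
def tess_level(distance, near=5.0, far=100.0, min_level=1.0, max_level=16.0):
    """Continuous tessellation factor for an object at `distance`.

    Returns max_level at or inside `near`, min_level at or beyond `far`,
    and a linear blend in between, so detail changes smoothly instead of
    jumping between fixed levels.
    """
    t = (distance - near) / (far - near)
    t = min(1.0, max(0.0, t))  # clamp to [0, 1]
    return max_level + t * (min_level - max_level)


if __name__ == "__main__":
    for d in (0.0, 5.0, 52.5, 100.0, 500.0):
        print(f"distance={d:6.1f} -> tessellation level {tess_level(d):.2f}")
```

The "obvious jumps" mentioned above correspond to being forced to cap `max_level` very low or to quantize the result coarsely, which is what happens when the GPU cannot afford the extra vertices.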
-----
But you can live without tessellation, and just accept that 3D models will be blocky up close. I'm actually more concerned about missing geometry shaders.
Geometry shaders let you see an entire primitive at once in a programmable shader stage, rather than only one vertex or pixel at a time. They also let you create new primitives or discard old ones on the GPU rather than having to process them all on the CPU and then send them to the GPU. Both of those give you a lot more versatility, and allow you to do a lot of work on the GPU that would otherwise have to be done on the CPU.
And that, I think, should be a huge deal for tablets--more so than for desktops, even. Taking GPU-friendly work and insisting that it has to be done on the CPU instead is not a recipe for good energy efficiency. In a desktop, you may have enough brute force CPU power to make things work even without geometry shaders, albeit inefficiently, but tablets and phones tend not to have a ton of brute force CPU power available.
Ryan Smith - Tuesday, January 15, 2013 - link
Any claims that T4 supports D3D11 or OpenGL 4 would be incorrect. As you correctly note, it has no way to support those missing pipeline stages.
Formul - Monday, January 14, 2013 - link
nVidia has a history of seriously overhyping performance of their products. They did it with Tegra 2 and they did it with Tegra 3 so let's wait for some independent testing before believing them this time.
JomaKern - Tuesday, January 15, 2013 - link
But they've gone public with a statement that Tegra 4 will beat A6X in GLBenchmark, they will look very douchey if they fail.
Of course there is the remote possibility that they will benchmark a 1080p device against the iPad 4 in Egypt HD and claim victory, even though they lose at Egypt 1080p offscreen.
CeriseCogburn - Sunday, January 20, 2013 - link
LOL - anti-nVidia fanboys are the ones who constantly entertain their own distorted world view hype...
A5X vs. Tegra 3 in the Real World
" In situations where a game is available in both the iOS app store as well as NVIDIA's Tegra Zone, NVIDIA generally delivers a comparable gaming experience to what you get on the iPad. In some cases you even get improved visual quality as well. The iPad's GPU performance advantage just isn't evident in those cases—likely because the bulk of iOS devices out there still use far weaker GPUs. That's effectively a software answer to a hardware challenge, but it's true.
NVIDIA isn't completely vindicated however. " by ANANDTECH
http://www.anandtech.com/show/5688/apple-ipad-2012...
You umm... nVidia haterz were saying what ? Oh that's right, you got it 100% incorrect but why not continue repeating lies in the hopes that many peer group fools will believe it too....
Wolfpup - Tuesday, January 15, 2013 - link
On the CPU side it's 4 A15s, all of which are more powerful than Swift, only there's 2x as many of them, and they're clocked faster...that's a no brainer there.
On the GPU side, if Nvidia is saying the segmented architecture makes sense, I believe them. They've got well over half a decade's experience making high end unified designs...they could do so easier than anyone else could, so the fact that they're not means they're almost certainly right about that.
And to the "I'll wait for benchmarks" people, well sure, but just look at the physical size of this versus Tegra 3.
Like...well, every version of Tegra prior to launch, I'm fairly excited about this (save for the issue that I don't know whether I'll ever own anything that runs on this!)
prashanth041 - Tuesday, January 15, 2013 - link
something nice to see
shodanshok - Tuesday, January 15, 2013 - link
It seems quite similar to a 12 pipeline NV40 ;)
Anyway it will be interesting to see the first benchmarks...
Commodus - Wednesday, January 16, 2013 - link
You need a shipping product.
Show me where I can buy a Tegra 4-based device today. Oh, right... they don't exist. The current iPad, meanwhile, has been on the market since October. By the time there's a Tegra 4 tablet, we may be hearing murmurs of an iPad with an A7X inside, and NVIDIA will at best have had a few months' lead before it's eclipsed again. If, of course, it leads at all.
There's no doubt that the Tegra 4 is a big step forward for NVIDIA, but part of why Apple dominates tablets is because it knows how to walk the walk: the interval between announcement and shipping is usually 10 days, and it knows this because it's building the finished hardware. It doesn't have to throw a processor into the wilderness and pray someone decides to use it.
Krysto - Wednesday, January 16, 2013 - link
I wonder if GLBenchmark 3.0 with support for OpenGL ES 3.0 will arrive soon. But either way you should at least try to get the 3Dmark11 benchmark, which should come out soon (no OpenGL ES 3.0 support, unfortunately, though), Anand, and test the new GPUs on these new benchmarks.
MikeHonet - Sunday, January 20, 2013 - link
Oh, that's right I can't yet. I love how someone posts specs on a component and compares it to a shipping product that has been out for months. Too little too late AGAIN NVidia. By the time this is in shipping product Apple will have its next gen product out already.