The AMD Radeon VII Review: An Unexpected Shot At The High-End
by Nate Oh on February 7, 2019 9:00 AM EST
FP64 Performance and Separating Radeon VII from Radeon Instinct MI50
One of the interesting and amusing consequences of the Radeon VII launch is that for the first time in quite a while, AMD has needed to seriously think about how they’re going to differentiate their consumer products from their workstation/server products. While AMD has continued to offer workstation and server hardware via the Radeon Pro and Radeon Instinct series, the Vega 20 GPU is AMD’s first real server-grade GPU in far too long. So, while those products were largely differentiated by the software features added to their underlying consumer-grade GPUs, Radeon VII brings some new features that aren’t strictly necessary for consumers.
It may sound like a trivial matter – clearly AMD should just leave everything enabled – but as the company is trying to push into the higher margin server business, prosumer products like the Radeon VII are in fact a tricky proposition. AMD needs to lock away enough of the server functionality of the Vega 20 GPU that they aren’t selling the equivalent of a Radeon Instinct MI50 for a fraction of the price. On the other hand, it’s in their interest to expose some of these features in order to make the Radeon VII a valuable card in its own right (one that can justify a $699 price tag), and to give developers a taste of what AMD’s server hardware can do.
Case in point is the matter of FP64 performance. As we noted in our look at the Vega 20 GPU, Vega 20’s FP64 performance is very fast: it’s one-half the FP32 rate, or 6.9 TFLOPS. This is one of the premium features of Vega 20, and since Radeon VII was first announced back at CES, the company has been struggling a bit to decide how much of that performance to actually make available to the Radeon VII. At the time of its announcement, we were told that the Radeon VII would have unrestricted (1/2) FP64 performance, only to later be told that it would be 1/8. Now, with the actual launch of the card upon us, AMD has made their decision: they’ve split it down the middle and are doing a 1/4 rate.
Looking to clear things up, AMD put out a statement:
The Radeon VII graphics card was created for gamers and creators, enthusiasts and early adopters. Given the broader market Radeon VII is targeting, we were considering different levels of FP64 performance. We previously communicated that Radeon VII provides 0.88 TFLOPS (DP=1/16 SP). However based on customer interest and feedback we wanted to let you know that we have decided to increase double precision compute performance to 3.46 TFLOPS (DP=1/4 SP).
If you looked at FP64 performance in your testing, you may have seen this performance increase as the VBIOS and press drivers we shared with reviewers were pre-release test drivers that had these values already set. In addition, we have updated other numbers to reflect the achievable peak frequency in calculating Radeon VII performance as noted in the [charts].
The end result is that while the Radeon VII won’t be as fast as the MI60/MI50 when it comes to FP64 compute, AMD is going to offer the next best thing, just one step down from those cards.
At 3.5 TFLOPS of theoretical FP64 performance, the Radeon VII is in a league of its own for the price. There simply aren’t any other current-generation cards priced below $2000 that even attempt to address the matter. All of NVIDIA’s GeForce cards and all of AMD’s other Radeon cards straight-up lack the necessary hardware for fast FP64. The next closest competitor to the Radeon VII in this regard is NVIDIA’s Titan V, at more than 4x the price.
It’s admittedly a bit of a niche market, especially when so much of the broader industry focus is on AI and neural network performance. But there are nonetheless going to be some very happy data scientists out there, especially among academics.
AMD Server Accelerator Specification Comparison
| Radeon VII | Radeon Instinct MI50 | Radeon Instinct MI25 | FirePro S9170 |
Stream Processors | 3840 (60 CUs) | 3840 (60 CUs) | 4096 (64 CUs) | 2816 (44 CUs) |
ROPs | 64 | 64 | 64 | 64 |
Base Clock | 1450MHz | 1450MHz | 1400MHz | - |
Boost Clock | 1750MHz | 1746MHz | 1500MHz | 930MHz |
Memory Clock | 2.0Gbps HBM2 | 2.0Gbps HBM2 | 1.89Gbps HBM2 | 5Gbps GDDR5 |
Memory Bus Width | 4096-bit | 4096-bit | 2048-bit | 512-bit |
Half Precision | 27.6 TFLOPS | 26.8 TFLOPS | 24.6 TFLOPS | 5.2 TFLOPS |
Single Precision | 13.8 TFLOPS | 13.4 TFLOPS | 12.3 TFLOPS | 5.2 TFLOPS |
Double Precision | 3.5 TFLOPS (1/4 rate) | 6.7 TFLOPS (1/2 rate) | 768 GFLOPS (1/16 rate) | 2.6 TFLOPS (1/2 rate) |
DL Performance | ? | 53.6 TFLOPS | 12.3 TFLOPS | 5.2 TFLOPS |
VRAM | 16GB | 16GB | 16GB | 32GB |
ECC | No | Yes (full-chip) | Yes (DRAM) | Yes (DRAM) |
Bus Interface | PCIe Gen 3 | PCIe Gen 4 | PCIe Gen 3 | PCIe Gen 3 |
TDP | 300W | 300W | 300W | 275W |
GPU | Vega 20 | Vega 20 | Vega 10 | Hawaii |
Architecture | Vega (GCN 5) | Vega (GCN 5) | Vega (GCN 5) | GCN 2 |
Manufacturing Process | TSMC 7nm | TSMC 7nm | GloFo 14nm | TSMC 28nm |
Launch Date | 02/07/2019 | 09/2018 | 06/2017 | 07/2015 |
Launch Price (MSRP) | $699 | - | - | $3999 |
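The throughput rows in the table can be roughly reproduced from the stream processor counts and clocks. The following is a minimal sketch, assuming the usual convention of 2 FLOPs (one fused multiply-add) per stream processor per clock. The function name `peak_tflops` is hypothetical, and the 1800MHz peak clock used for the Radeon VII is an assumption rather than a value from the table: the table's 13.8 TFLOPS figure does not follow from the listed 1750MHz boost clock, and AMD's statement mentions that its numbers were updated to reflect the achievable peak frequency.

```python
def peak_tflops(stream_processors, clock_mhz, rate=1.0):
    """Theoretical peak throughput in TFLOPS.

    Assumes 2 FLOPs (one fused multiply-add) per stream processor per
    clock; `rate` is the precision's rate relative to FP32 (e.g. 1/4).
    """
    return stream_processors * 2 * clock_mhz * rate / 1e6


# (stream processors, clock in MHz, FP64 rate relative to FP32)
cards = {
    # 1800MHz is an assumed peak clock, not the table's 1750MHz boost clock
    "Radeon VII": (3840, 1800, 1 / 4),
    "Radeon Instinct MI50": (3840, 1746, 1 / 2),
    "Radeon Instinct MI25": (4096, 1500, 1 / 16),
    "FirePro S9170": (2816, 930, 1 / 2),
}

for name, (sps, mhz, fp64_rate) in cards.items():
    fp32 = peak_tflops(sps, mhz)
    fp64 = peak_tflops(sps, mhz, fp64_rate)
    print(f"{name}: FP32 {fp32:.1f} TFLOPS, FP64 {fp64:.2f} TFLOPS")
```

Under these assumptions the results line up with the table's single and double precision rows for the MI50, MI25 and S9170. At the listed 1750MHz boost clock, the same formula would put the Radeon VII at roughly 13.4 TFLOPS FP32 rather than 13.8 TFLOPS.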
Speaking of AI, it should be noted that machine learning performance is another area where AMD is throttling the card. Unfortunately, more details aren’t available at this time. But given the unique needs of the ML market, I wouldn’t be surprised to find that INT8/INT4 performance is held back a bit on the Radeon VII. Or for that matter certain FP16 dot products.
Also on the chopping block is full-chip ECC support. Thanks to the innate functionality of HBM2, all Vega cards already have free ECC for their DRAM. However Vega 20 takes this one step further with ECC protection for its internal caches, and this is something that the Radeon VII doesn’t get access to.
Finally, Radeon VII also cuts back a bit on Vega 20’s off-chip I/O features. Though AMD hasn’t made a big deal of it up to now, Vega 20 is actually their first PCI-Express 4.0-capable GPU, and this functionality is enabled on the Radeon Instinct cards. However for Radeon VII, this isn’t being enabled, and the card is being limited to PCIe 3.0 speeds (so future Zen 2 buyers won’t quite have a PCIe 4.0 card to pair with their new CPU). Similarly, the external Infinity Fabric links for multi-GPU support have been disabled, so the Radeon VII will only be a solo act.
On the whole, there’s nothing very surprising about AMD’s choices here, especially given Radeon VII’s target market and target price. But these are notable exclusions that are going to matter to certain users. And if they don’t drive those users towards a Radeon Instinct, then they’re sure to drive them towards the inevitable Vega 20-powered Radeon Pro.
289 Comments
zodiacfml - Friday, February 8, 2019
The first part of your conclusion describes what this product is. It is surprising to see this card's existence at 7nm, a Vega with 16GB of HBM2.
It appears to me that AMD/TSMC are learning the 7nm process for GPUs/CPUs, and the few chips they produce are being sold as a high-end part (while the volume/yields are being improved).
AMD really shot high with its power consumption (clocks) and memory to reach the pricing of the RTX 2080.
However, I haven't seen a publication show undervolting results. Most Vegas perform better with this tweak.
Samus - Saturday, February 9, 2019
I think you are being a little too critical of this card. Considering it’s an older architecture, it’s impressive it’s in the 2080’s ballpark.
And for those like me that only care about Frostbite Engine-based games, this card is obviously the better option between the two cards at the same price.
You also ignored the overclocking potential of the headroom given by moving to 7nm
D. Lister - Saturday, February 9, 2019
"You also ignored the overclocking potential of the headroom given by moving to 7nm"
Unfortunately it seems to be already overclocked to the max on the core. VRAM has some headroom but another couple of hundred MHz isn't going to do wonders considering the already exorbitant amount available.
D. Lister - Saturday, February 9, 2019
*...considering the already exorbitant amount [of bandwidth] available.
Oxford Guy - Saturday, February 9, 2019
"I think you are being a little too critical of this card."
Unless someone can take advantage of the non-gaming aspects of it, it is dead in the water at the current price point. There is zero reason to purchase a card, for gaming only, that uses more power and creates vastly more noise at the same price point of one that is much more efficient for gaming purposes. And, the only way to tame the noise problem is to either massively undervolt it or give it water. Proponents of this GPU are going to have to show that it's possible to buy a 3 slot model and massively undervolt it to get noise under control with air. Otherwise, the claim is vaporware.
Remember this information? Fiji: 596 mm2 for $650. Vega 10 495 mm2 for $500. Vega 20 331 mm2 for $700.
Yes, the 16 GB of RAM costs AMD money but it's irrelevant for gaming. AMD not only gave the community nearly 600 mm2 of chip, it also paired it with an AIO to tame the noise. All the talk from Su about improving AMD's margins seems to be something that gamers need to stop lauding AMD for and start thinking critically about. If a company only has an inferior product to offer and wants to improve margins that's going to require that buyers be particularly foolish.
Samus - Sunday, February 10, 2019
I wouldn't call the 16GB irrelevant. It trumps the 2080 in the two most demanding 4K titles, and comes relatively close in other ultra high resolution benchmarks.
It could be assumed that's a sign of things to come as resolutions continue to increase.
Oxford Guy - Sunday, February 10, 2019
"It could be assumed that's a sign of things to come as resolutions continue to increase."
Developers adapt to Nvidia, not to AMD. That appears to be why, for instance, the visuals in Witcher 3 were watered-down at the last minute, to fit the VRAM of the then standard 970. Particularly in the context of VRAMgate there was an incentive on the part of Nvidia to be certain that the 970's VRAM would be able to handle a game like that one.
AMD could switch all of its discrete cards to 32 GB tomorrow and no developers would bite unless AMD pays them to, which means a paucity of usefulness of that 32 GB.
BenSkywalker - Saturday, February 9, 2019
This offering is truly a milestone in engineering.
The Radeon VII has none of the RTX or tensor cores of the competition, uses markedly more power *and* is built with a half node process advantage and still, inexplicably, is slower than their direct competitor?
I've gone back and looked, I can't find another example that's close to this.
Either TSMC has *massive* problems with 7 nm or AMD has redefined terrible engineering in this segment. One of those, at least, has to be at play here.
Oxford Guy - Saturday, February 9, 2019
The RTX and Tensor die area may help with power dissipation when it's shut down, in terms of hot spot reduction for instance. Vega 20 is only 331 mm2. However, it does seem clear enough that Fiji/Vega is only to be considered a gaming-centric architecture in the context of developers creating engines that take advantage of it, à la DOOM.
Since developers don't have an incentive to do that (even DOOM's engine is apparently a one-off), here we are with what looks like a card designed for compute and given to gamers as an expensive and excessively loud afterthought.
Oxford Guy - Saturday, February 9, 2019
There is also the issue of blasting clocks to compensate for the small die. Rip out all of the irrelevant bits and add more gaming hardware. Drop the VRAM to 8 GB. Make a few small tweaks to improve efficiency rather than just shrink Vega. With those things done I wonder how much better the efficiency/performance would be.