22 Comments

  • invinciblegod - Tuesday, March 27, 2018 - link

    Is it not $1.5 million?
  • olde94 - Tuesday, March 27, 2018 - link

    Nope, he spent quite a lot of time saying: only $400k!
    Which is EXTREMELY awesome!
  • mode_13h - Wednesday, March 28, 2018 - link

    Well, it's more than double what a DGX-1 costs. Of course the GFLOPS/$ won't scale linearly, but it's not exactly awesome. You'd better have workloads that won't fit 8x V100's, if you're buying this.
  • olde94 - Tuesday, March 27, 2018 - link

    will this technology only work in the DGX-2 enclosure or can we see consumer cards connected as we had with the sli bridge?
  • Ryan Smith - Tuesday, March 27, 2018 - link

    You wouldn't need an NVSwitch to link consumer cards. You could just directly bridge them using their native NVLinks, like the Quadro GP100/GV100 does. That said, there's currently no sign of NVLink coming to consumer graphics GPUs (it's a lot of pins and a lot of real estate).
  • mode_13h - Wednesday, March 28, 2018 - link

    Titan V has the connector, but it's not enabled.
  • iteratix - Tuesday, March 27, 2018 - link

    ...But how does it mine crypto?
  • Yojimbo - Tuesday, March 27, 2018 - link

    Current algorithms don't seem to need fast GPU-to-GPU communications. I guess if someone designed an algorithm that needed hundreds of GB of memory to be loaded at a time, in such a way that it couldn't be broken up into separate chunks during processing...
  • mode_13h - Wednesday, March 28, 2018 - link

    In the Q&A someone asked him about crypto. He said crypto isn't as dependent on system architecture as AI, which is true. I mean, you just have to look at how well rigs do with x1-lane connected GPUs to see that it's not about GPU-GPU connectivity.

    He also poured a decent helping of scorn on the whole crypto craze, but I don't know if that was just pandering to their other customers. By now, miners are probably used to all the trash talk.
  • Yojimbo - Wednesday, March 28, 2018 - link

    "In the Q&A someone asked him about crypto. He said crypto isn't as dependent on system architecture as AI, which is true"

    I think what he meant by that is that crypto does not benefit from software and system ecosystem development, so NVIDIA is not interested in going after the market. In AI, for instance, the development of cuDNN, TensorRT, their GPU Cloud registry, etc., enables NVIDIA to add value to the sale of their hardware, differentiating their offerings from the competition and creating a platform. That enables more predictable demand and higher margins. The opportunity for that does not exist in crypto. It is just a chip business, and NVIDIA doesn't want to be a chip company.

    "He also poured a decent helping of scorn on the whole crypto craze, but I don't know if that was just pandering to their other customers."

    I don't think he's just pandering to other customers with his negative statements about crypto. He probably sees crypto as more of a nuisance than anything at the moment. First of all, it doesn't lend itself to product differentiation which would lead to a competitive advantage for NVIDIA if they execute well, as mentioned in my above paragraph. Secondly, it is helping their competition more than it is helping them. Thirdly, it is volatile and unpredictable. Fourthly, that volatility interferes with NVIDIA's ability to address what they view as their long-term and high value markets.
  • mode_13h - Friday, March 30, 2018 - link

    Ah, but for all those negatives, it's doing two things for them.

    1. Moving a lot of GPUs. This translates into revenue they can re-invest in their product stack.

    2. Pushing customers towards higher-margin Quadro and Tesla solutions, which turns into even more revenue.

    I think any tears he may shed over crypto are most definitely crocodile tears. They might view it as problematic, but it's sure a high-class problem to have.
  • Yojimbo - Friday, March 30, 2018 - link

    1. NVIDIA has plenty of cash and profits. Their long-term markets are more important.
    2. How many people are buying a Quadro or a Tesla instead of a GeForce?

    Certainly NVIDIA loves more demand, but they don't want volatile demand. They also don't want current-generation cards flooding back into the market if there is a crypto crash. Another headache. NVIDIA is doing just fine without crypto. They really don't need it strengthening AMD's research budget, disrupting their market visibility, and especially they don't need it pushing gamers away from PC gaming.
  • Dug - Tuesday, March 27, 2018 - link

    I don't know anything about the GPU, but I want to know more about how it handles the 30TB of NVMe-based solid state storage. Is it a custom controller or interface, and does it run off of PCIe?
  • Ryan Smith - Tuesday, March 27, 2018 - link

    If it's anything like the DGX-1, then it's just multiple SSDs in a single system.
  • Elstar - Tuesday, March 27, 2018 - link

    30 TB is not a lot for a box this expensive and this complicated. It is probably just plain NVMe.
  • The Hardcard - Tuesday, March 27, 2018 - link

    What is the TDP for pumping 900 GB/s? Liquid cooling, I presume?
  • mode_13h - Wednesday, March 28, 2018 - link

    One of the slides says "3,200 Watts of maximum system power" in 3U.
  • Yojimbo - Wednesday, March 28, 2018 - link

    I don't think it has liquid cooling. Here's a picture of it pulled apart: https://images.anandtech.com/doci/12587/screenshot...

    Each NVSwitch is 2 billion transistors, so I imagine the power draw of the switch fabric is not insignificant. I really have no idea how it works, but I'm just going to throw out a guess anyway. I'd imagine that a switch fabric would be less efficient than NVLink. NVLink 2 allows 300 GB/s bi-directional throughput, and NVLink-enabled GPUs have 50 W higher TDP. I have no idea if it makes sense to figure it this way, but if NVSwitch is less efficient and allows 900 GB/s bi-directional throughput, then the TDP of one switch should be over 150 W. The DGX-2 has 12 NVSwitches, so that would be 1,800 W of TDP for the NVSwitch fabric in the DGX-2. Of course, maybe there are some power savings, as some of the NVLink power usage on the GPUs is shifted to the switch when the fabric is used? Real-world power usage would depend a lot on how much cross-GPU communication can be kept down, of course.
  • Yojimbo - Wednesday, March 28, 2018 - link

    Or maybe the switch itself doesn't see that sort of power usage because a lot of the power overhead is still taken up by the NVLink on the GPU. Either way, the power cost of using all that bandwidth is significant compared to if you had 16 GPUs that couldn't talk to each other, or only did so slowly.
  • Yojimbo - Wednesday, March 28, 2018 - link

    The TDP of a non-NVLink V100 is 250 W, BTW. So 16 of them would be 4,000 W TDP. Could the use of this switch really raise TDP by 45% to 5,800 W? Got me... You also have to consider the power savings from the PCI Express switches that you are replacing.
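    For what it's worth, the back-of-envelope estimate in the comments above can be written out explicitly. This is just a sketch of that same guesswork, not an NVIDIA spec: it assumes NVLink power scales linearly with bandwidth (50 W of extra TDP per 300 GB/s) and applies that ratio to a 900 GB/s NVSwitch.

    ```python
    # Back-of-envelope NVSwitch fabric power estimate, following the
    # assumptions in the comment thread (not official figures):
    # - an NVLink-enabled V100 has ~50 W higher TDP than the PCIe part
    #   for 300 GB/s of bi-directional NVLink bandwidth
    # - switch power scales linearly with bandwidth

    NVLINK_TDP_DELTA_W = 50     # assumed extra TDP of NVLink V100 vs PCIe V100
    NVLINK_BW_GBS = 300         # NVLink 2 bi-directional bandwidth per GPU
    NVSWITCH_BW_GBS = 900       # per-NVSwitch bi-directional throughput
    NUM_SWITCHES = 12           # NVSwitch count in the DGX-2
    V100_TDP_W = 250            # PCIe (non-NVLink) V100 TDP
    NUM_GPUS = 16               # GPUs in the DGX-2

    # Scale the NVLink power delta up to NVSwitch bandwidth.
    per_switch_w = NVLINK_TDP_DELTA_W * NVSWITCH_BW_GBS / NVLINK_BW_GBS  # 150 W floor
    fabric_w = per_switch_w * NUM_SWITCHES                               # 1,800 W
    gpus_w = V100_TDP_W * NUM_GPUS                                       # 4,000 W
    increase_pct = 100 * fabric_w / gpus_w                               # +45%

    print(per_switch_w, fabric_w, gpus_w, round(increase_pct))
    # → 150.0 1800.0 4000 45
    ```

    The ~150 W-per-switch guess is noticeably higher than the ~100 W figure Rick Merritt later cited at EE Times, which fits the intuition that some of the per-GPU NVLink power overhead doesn't simply duplicate onto the switch.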
  • Yojimbo - Thursday, March 29, 2018 - link

    According to Rick Merritt at EE Times each switch chip is 100 W.
  • mode_13h - Wednesday, March 28, 2018 - link

    If you see the slide showing the V100 GPUs each connected to 6x NVSwitches, then the two banks of switches are connected across to each other, it reminds me of the hemispheres of a brain.

    Coincidence, I'm sure, but cute.
