NVLink

NVLink is NVIDIA’s high-speed, direct interconnect for linking graphics processors (GPUs) to each other and to CPUs. Training and serving large AI models almost always requires splitting the work across many GPUs, and those GPUs must constantly exchange data. The standard PCI Express (PCIe) bus that connects expansion cards in a computer is too slow for this, so NVLink provides a dedicated, much faster path for GPU-to-GPU communication. A companion chip, NVLink Switch (also called NVSwitch), extends those connections so that every GPU in a server or rack can talk to every other GPU at full speed.

Per-GPU bandwidth has doubled with each recent generation. Fourth-generation NVLink, used with the Hopper H100, provided 900 gigabytes per second per GPU; the fifth generation, in Blackwell, raised that to 1.8 terabytes per second; and NVIDIA’s sixth generation reaches 3.6 terabytes per second per GPU, more than 14 times the bandwidth of contemporary PCIe. At rack scale, an NVLink Switch fabric can knit dozens of GPUs into what behaves like one enormous accelerator.
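To make these figures concrete, the short Python sketch below works through the arithmetic. It uses only the per-GPU bandwidths quoted above; the 100 GB payload and the helper names transfer_seconds and pcie_upper_bound_gbps are assumptions introduced for illustration. The times are idealized, since they assume peak bandwidth with no latency or protocol overhead, and the PCIe value is only the ceiling implied by the "more than 14 times" ratio, not a published specification.

```python
# Per-GPU NVLink bandwidth figures quoted in the text, in gigabytes per second.
NVLINK_GBPS = {
    "4th gen (Hopper H100)": 900,
    "5th gen (Blackwell)": 1800,
    "6th gen": 3600,
}

# Hypothetical payload, chosen only for illustration.
PAYLOAD_GB = 100


def transfer_seconds(payload_gb: float, bandwidth_gbps: float) -> float:
    """Idealized transfer time at peak bandwidth (no latency or overhead)."""
    return payload_gb / bandwidth_gbps


def pcie_upper_bound_gbps(nvlink_gbps: float, ratio: float = 14.0) -> float:
    """Largest PCIe bandwidth consistent with NVLink being more than
    `ratio` times faster. Derived from the text's ratio, not from a spec."""
    return nvlink_gbps / ratio


if __name__ == "__main__":
    for name, bandwidth in NVLINK_GBPS.items():
        millis = transfer_seconds(PAYLOAD_GB, bandwidth) * 1000
        print(f"{name}: {bandwidth} GB/s, about {millis:.1f} ms per {PAYLOAD_GB} GB")

    ceiling = pcie_upper_bound_gbps(NVLINK_GBPS["6th gen"])
    print(f"Implied PCIe ceiling: under {ceiling:.0f} GB/s")
```

Because bandwidth doubles at each step, the idealized time to move the same payload halves from one generation to the next.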

NVLink matters because, as models grow, performance is increasingly limited by how fast data moves rather than how fast each chip computes. For a business reader, NVLink is a large part of why NVIDIA sells systems and racks rather than just chips: the interconnect that ties many GPUs together has become a competitive moat as important as the processors themselves.
