GPU Computing

A graphics processing unit, or GPU, was originally built to draw images on a screen. Rendering 3D graphics means doing the same simple arithmetic over and over across millions of pixels at once, so GPUs were designed with thousands of small cores running in parallel rather than the handful of powerful cores in a CPU. Training a neural network turns out to require much the same kind of work: huge numbers of multiply-and-add operations performed on grids of numbers (matrices), most of which are independent of one another and can therefore run at the same time. The chip built for graphics proved to be very well suited to neural networks.
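To make the "multiply-and-add on matrices" idea concrete, here is a minimal sketch in plain Python. It runs on the CPU, one cell at a time, and the function names `matmul_cell` and `matmul` are illustrative choices rather than anything from a GPU library. The point it tries to show is that each output cell depends only on the inputs, never on another output cell, which is the property that lets a GPU hand different cells to different cores.

```python
def matmul_cell(a, b, row, col):
    """Compute one output cell as a running sum of multiply-and-add steps."""
    total = 0.0
    for k in range(len(b)):
        total += a[row][k] * b[k][col]
    return total


def matmul(a, b):
    """Multiply two matrices given as lists of rows.

    Every (row, col) cell is computed from the inputs alone, so in
    principle all cells could be worked out at the same time.
    This sketch computes them one after another.
    """
    rows, cols = len(a), len(b[0])
    return [[matmul_cell(a, b, r, c) for c in range(cols)] for r in range(rows)]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
result = matmul(a, b)
print(result)  # [[19.0, 22.0], [43.0, 50.0]]
```

A 2-by-2 product has only four independent cells. The matrices in a neural network layer can have millions, which is where thousands of parallel cores start to pay off.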

What made this usable was making the GPU programmable for general math. NVIDIA’s CUDA platform, whose first public toolkit shipped in 2007, exposed the GPU as a straightforward parallel computer rather than a graphics-only device. Before CUDA, using a GPU for non-graphics calculation meant awkwardly disguising the math as drawing operations. Afterward, researchers could write ordinary parallel programs that ran across the GPU’s many cores, and the matrix operations at the heart of neural networks were a natural fit.
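The programming style this opened up can be sketched roughly as follows. This is an emulation in plain Python, not CUDA code and not any real GPU API: the names `saxpy_kernel` and `launch` are hypothetical, and the "threads" here run one after another on the CPU. What it tries to illustrate is the general shape of such programs: the programmer writes the work for a single thread, identified by an index, and then launches many copies of it.

```python
def saxpy_kernel(thread_id, alpha, x, y, out):
    """Work done by ONE hypothetical thread: a single multiply-and-add."""
    if thread_id < len(out):  # guard: surplus threads past the end do nothing
        out[thread_id] = alpha * x[thread_id] + y[thread_id]


def launch(kernel, num_threads, *args):
    """Stand-in for a GPU launch.

    A real GPU would run many of these threads simultaneously;
    this emulation simply loops over them in order.
    """
    for thread_id in range(num_threads):
        kernel(thread_id, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)

# Launch 8 threads for 4 elements; the guard in the kernel ignores the extras.
launch(saxpy_kernel, 8, 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0, 48.0]
```

Because no thread reads what another thread writes, the order in which they run does not matter, and that is what makes it safe to run them all at once on real hardware.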

The decisive demonstration came in 2012 with AlexNet. Its authors trained the network on NVIDIA GPUs and wrote plainly in their paper that “to make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation.” The network had 60 million parameters, a scale that would have been impractically slow to train on the CPUs of the day. GPUs are what turned deep neural networks from an interesting idea into a practical tool, and the resulting demand helped make NVIDIA one of the most valuable companies in the world.

Why business readers should care: GPUs are a central physical bottleneck of the AI economy. The cost, availability, and performance of these chips set the pace at which AI products can be trained and run, which is why GPU supply, cloud GPU pricing, and chip export policy now show up directly in business and geopolitical strategy.
