Dennard Scaling

Dennard scaling is the principle that explains why, for several decades, shrinking transistors not only packed more logic onto a chip but also made each generation faster and more power-efficient. The insight is that if a MOSFET’s dimensions and voltages are scaled down together by the same factor, the power consumed per unit of chip area stays roughly constant. Each transistor switches faster and draws less power, while the chip as a whole runs no hotter per unit area, so designers could raise clock frequencies generation after generation without melting the chip.
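The constant power density follows from the standard first-order model of dynamic switching power, P ≈ C · V² · f. The sketch below works through that arithmetic for an idealized scaling factor k. It is a simplified illustration, not the derivation from the original paper: the function name `dennard_scale` and the dictionary keys are made up for this example, and the model assumes ideal constant-field scaling with no leakage.

```python
def dennard_scale(k):
    """Relative changes per transistor under ideal constant-field scaling.

    k > 1 is the scaling factor: linear dimensions and voltage shrink by 1/k.
    All returned values are ratios of the new generation to the old one.
    Assumes dynamic power only (P ~ C * V^2 * f) and ignores leakage.
    """
    capacitance = 1 / k   # C ~ gate area / oxide thickness = (1/k^2) / (1/k)
    voltage = 1 / k       # supply voltage scales with dimensions
    frequency = k         # gate delay shrinks by 1/k, so clock can rise by k
    area = 1 / k**2       # both length and width shrink by 1/k

    power = capacitance * voltage**2 * frequency  # works out to 1/k^2
    return {
        "power_per_transistor": power,
        "area_per_transistor": area,
        "power_density": power / area,
    }


if __name__ == "__main__":
    # A linear shrink of about 0.7x per generation corresponds to k of about 1.4.
    for name, ratio in dennard_scale(1.4).items():
        print(f"{name}: {ratio:.3f}")
```

Power per transistor and area per transistor both fall by the same factor of 1/k², so their ratio, the power density, is unchanged no matter which k is chosen.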

The result was published in 1974 by Robert H. Dennard and his co-authors at IBM in the IEEE Journal of Solid-State Circuits, in the paper “Design of Ion-Implanted MOSFETs with Very Small Physical Dimensions.” The paper set out a systematic scaling methodology, describing how device dimensions, doping levels, and supply voltages should be reduced together so that a smaller transistor would behave like a scaled copy of a larger one with predictable, improved electrical characteristics.

For roughly thirty years the scaling rules held. Each new process node delivered transistors that were smaller, switched faster, and drew less power apiece while power per unit area stayed about the same, so processor clock speeds climbed steadily and the gains arrived almost for free from the manufacturing side. Dennard scaling, working alongside Moore’s Law, was the engine behind the steady rise in single-thread performance through the 1980s and 1990s.

Around 2005 to 2006 the scaling broke down. As feature sizes approached the deep submicron and nanometer range, supply voltages could no longer be reduced in step with dimensions, because leakage currents and threshold voltage limits set a floor. Power density stopped staying constant and began to rise, which meant that simply cranking up the clock frequency would have produced chips that consumed and dissipated too much power. The free lunch of ever-higher clock speeds was over.
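The effect of a voltage floor can be illustrated with the same first-order dynamic power model, P ≈ C · V² · f, divided by transistor area. The sketch below is a rough illustration under stated assumptions: the function name `power_density_ratio` and its parameters are hypothetical, capacitance and area are assumed to keep scaling ideally with dimensions, and leakage power is left out entirely, so real chips fared worse than these numbers suggest.

```python
def power_density_ratio(k, voltage_ratio, frequency_ratio):
    """Power density of a scaled node relative to the previous one.

    k               -- dimensional scaling factor (> 1); lengths shrink by 1/k
    voltage_ratio   -- new supply voltage / old supply voltage
    frequency_ratio -- new clock frequency / old clock frequency

    Assumes capacitance scales as 1/k and area as 1/k^2, and counts
    dynamic switching power only (no leakage).
    """
    capacitance_ratio = 1 / k
    area_ratio = 1 / k**2
    power_ratio = capacitance_ratio * voltage_ratio**2 * frequency_ratio
    return power_ratio / area_ratio


if __name__ == "__main__":
    k = 2.0
    # Ideal Dennard scaling: voltage drops by 1/k, frequency rises by k.
    print("ideal scaling:", power_density_ratio(k, 1 / k, k))
    # Voltage stuck at its floor, clock still pushed up by k.
    print("fixed voltage, faster clock:", power_density_ratio(k, 1.0, k))
    # Voltage stuck at its floor, clock held flat.
    print("fixed voltage, flat clock:", power_density_ratio(k, 1.0, 1.0))
```

In this simplified model, once the voltage stops scaling the power density rises even if the clock frequency is held flat, and it rises much faster if the clock keeps climbing. That is consistent with clock speeds levelling off once voltage reductions stalled.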

The end of Dennard scaling, more than the slowing of Moore’s Law itself, is what forced the industry’s pivot. Unable to make a single core much faster without overheating it, designers turned to putting multiple cores on a die and to specialized accelerators, shifting the burden of performance gains onto parallelism and architecture rather than raw frequency.
