The ASIC

An ASIC, or application-specific integrated circuit, is a chip designed and fabricated to do one particular job rather than to run arbitrary software. Where a general-purpose CPU is built to execute any program and a field-programmable gate array can be reconfigured after manufacture, an ASIC is etched into silicon for a fixed function and cannot be changed afterward. That permanence is exactly the point: by specializing the hardware to a single workload, an ASIC can be far faster and far more power-efficient at that workload than a general processor.

The trade-off is economic and practical. Designing and fabricating a custom chip costs a great deal up front and takes many months, and once the masks are made the design is frozen. ASICs therefore make sense when a task is both well-defined and high-volume enough to amortize that cost, or when its performance and power demands simply cannot be met by general-purpose parts. Inside that envelope, nothing beats a chip built for exactly one thing.

Google’s Tensor Processing Unit is a clear modern example, and the engineers describe its rationale in their own words. In “In-Datacenter Performance Analysis of a Tensor Processing Unit,” presented at the International Symposium on Computer Architecture in 2017, the authors state that “many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware,” and they evaluate “a custom ASIC — called a Tensor Processing Unit (TPU)” built to accelerate neural-network inference. By dropping the caches and out-of-order machinery a general CPU needs and committing the silicon to matrix multiplication, the TPU reported large gains in performance per watt over contemporary CPUs and GPUs.

Bitcoin mining is the other widely cited case. The proof-of-work computation is a single, fixed hashing operation repeated astronomically often, which is the ideal target for specialization. Mining migrated from CPUs to GPUs to FPGAs and finally to dedicated ASICs that do nothing but compute that one hash, at orders of magnitude better speed and efficiency than general hardware, and which are useless for anything else.

The ASIC anchors one end of the spectrum of how computation is realized in hardware. A CPU runs any software but pays for that generality; an FPGA offers reconfigurable hardware at the cost of speed and efficiency; an ASIC delivers the best possible performance and efficiency for its one task while giving up all flexibility. The choice among them is a choice about how fixed the problem is and how much volume justifies committing it permanently to silicon.

As specialization has become the main route to better performance now that general-purpose scaling has slowed, ASICs have grown more central rather than less. The same logic that produced the TPU drives custom accelerators across machine learning, networking, video, and cryptography, each a chip built for one job and unbeatable at it.