The Intelligence Processing Unit, or IPU, is an AI processor designed by the British company Graphcore as an alternative to GPUs for machine learning. Its architecture takes a different approach to parallelism: rather than a smaller number of large cores, each IPU contains many small, independent cores, each with its own local memory, that together run thousands of parallel program threads. Graphcore’s second-generation Colossus MK2 chip, the GC200, holds 1,472 cores running 8,832 threads, built from 59.4 billion transistors on a 7-nanometer process and delivering up to 250 teraflops of 16-bit floating-point AI compute.
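To put the headline figures in proportion, the short Python sketch below divides them by the core count. It assumes an exact thread count of 8,832; everything else is taken from the paragraph above. The function name `per_core_breakdown` is purely illustrative, and the per-core values are simple averages of peak figures, not measurements.

```python
# Headline GC200 figures. THREADS is an assumed exact count;
# the remaining values are the ones quoted in the text.
CORES = 1_472
THREADS = 8_832
TRANSISTORS = 59.4e9
PEAK_TFLOPS = 250.0


def per_core_breakdown(cores=CORES, threads=THREADS,
                       transistors=TRANSISTORS, peak_tflops=PEAK_TFLOPS):
    """Average each chip-level figure over the number of cores."""
    return {
        "threads_per_core": threads / cores,
        "transistors_per_core_millions": transistors / cores / 1e6,
        "peak_gflops_per_core": peak_tflops * 1e3 / cores,
    }


if __name__ == "__main__":
    for name, value in per_core_breakdown().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions each core runs six threads and accounts for roughly 40 million transistors, a figure that includes the memory sitting beside each core as well as the logic.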
The IPU’s distinguishing feature is its memory strategy. Instead of relying mainly on external high-bandwidth memory, each MK2 IPU carries about 900 megabytes of memory directly on the chip, distributed among its cores, which Graphcore calls In-Processor Memory. Keeping model state close to the cores is meant to sidestep the bandwidth bottleneck that limits performance when data must travel to and from off-chip memory. Multiple IPUs are packaged into systems such as the IPU-M2000, which combines four MK2 IPUs to provide about a petaflop of AI compute.
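The memory and system figures can be checked with the same kind of back-of-the-envelope arithmetic. The sketch below is an illustration only: it assumes decimal megabytes, an even split of memory across cores, and 16-bit (2-byte) values, and its function names are hypothetical. Because the quoted figures are themselves approximate, the results are rough guides.

```python
import math

# Figures quoted in the text for one MK2 IPU.
ON_CHIP_MB = 900           # "about 900 megabytes"; decimal MB assumed
CORES = 1_472
PEAK_TFLOPS_PER_IPU = 250
BYTES_PER_FP16 = 2         # assumption: values stored as 16-bit floats


def memory_per_core_kb(on_chip_mb=ON_CHIP_MB, cores=CORES):
    """Average on-chip memory per core, in kilobytes (even split assumed)."""
    return on_chip_mb * 1_000 / cores


def fp16_values_on_chip(on_chip_mb=ON_CHIP_MB):
    """Upper bound on 16-bit values that fit if memory held nothing else."""
    return on_chip_mb * 1_000_000 // BYTES_PER_FP16


def ipus_needed(target_tflops, per_ipu=PEAK_TFLOPS_PER_IPU):
    """Smallest number of IPUs whose combined peak reaches the target."""
    return math.ceil(target_tflops / per_ipu)


if __name__ == "__main__":
    print(f"memory per core: {memory_per_core_kb():.0f} KB")
    print(f"FP16 values on chip: {fp16_values_on_chip():,}")
    print(f"IPUs for one petaflop: {ipus_needed(1_000)}")
```

The per-core share comes out at a few hundred kilobytes, which shows why the design spreads a model across many cores and many chips instead of holding it in one large pool of memory.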
Graphcore’s IPU was one of the most prominent attempts by a startup to break NVIDIA’s grip on AI hardware with a fundamentally different design. For a general reader, the IPU illustrates the broader industry search for architectures that fit the memory-bound, highly parallel shape of neural networks better than chips originally built for graphics.