AlexNet wins ImageNet 2012

In 2012, Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton at the University of Toronto entered a deep convolutional neural network into the ImageNet Large Scale Visual Recognition Challenge (ILSVRC-2012) and won by a wide margin: their entry reached a top-5 test error rate of 15.3 percent, compared with 26.2 percent for the second-best entry. Their NIPS 2012 paper, “ImageNet Classification with Deep Convolutional Neural Networks,” is widely seen as the spark that ignited the modern deep learning boom.

The paper reports that the network achieved top-1 and top-5 error rates of 37.5 percent and 17.0 percent on the ILSVRC-2010 test data, results the authors describe as considerably better than the previous state of the art. The model contained roughly 60 million parameters across five convolutional layers followed by three fully connected layers, and was trained on two graphics processors (GPUs).
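
For readers unfamiliar with the metrics, top-1 error is the share of images for which the model's single best guess is wrong, and top-5 error is the share for which the correct label is not among its five highest-scoring guesses. The following is a minimal sketch of that calculation in Python. The function name top_k_error and the scores and labels are made up for illustration; they are not taken from the paper or from any ImageNet data.

```python
def top_k_error(scores, labels, k):
    """Fraction of examples whose true label is NOT among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the k best guesses.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Hypothetical scores for 4 images over 6 classes (illustrative numbers only).
scores = [
    [0.10, 0.60, 0.20, 0.05, 0.03, 0.02],  # true class 1: best guess is right
    [0.30, 0.10, 0.40, 0.12, 0.05, 0.03],  # true class 0: second-best guess
    [0.02, 0.08, 0.10, 0.20, 0.25, 0.35],  # true class 0: ranked last of 6
    [0.50, 0.20, 0.15, 0.08, 0.04, 0.03],  # true class 2: third-best guess
]
labels = [1, 0, 0, 2]

print("top-1 error:", top_k_error(scores, labels, 1))
print("top-5 error:", top_k_error(scores, labels, 5))
```

In this toy set the best guess is wrong for three of the four images, but the correct label is missing from the top five only once, which is why top-5 error is always less than or equal to top-1 error.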

The result mattered because it showed that, given enough data and computing power, a neural network could classify images more accurately than systems built on decades of hand-engineered computer vision techniques. Within a few years nearly every serious image-recognition system was built on deep neural networks. This victory set the stage for deeper architectures such as ResNet and for the broad industrial adoption of deep learning.