Deep Q-Networks learn Atari from pixels

In December 2013, a team at DeepMind led by Volodymyr Mnih published “Playing Atari with Deep Reinforcement Learning” on arXiv. The paper introduced the Deep Q-Network (DQN), a system that learned to play Atari 2600 video games using nothing but the raw screen pixels as input.

The abstract calls this the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The approach combined a convolutional neural network with a variant of Q-learning, a classic reinforcement learning method: the network takes raw pixels as input and outputs an estimate of future reward for each possible action. Tested on seven Atari games with no game-specific tuning, the same architecture outperformed all previous approaches on six of them and surpassed a human expert on three.
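To make the Q-learning idea concrete, here is a minimal sketch of the one-step update that the method builds on. It is an illustration under simplifying assumptions, not the paper's implementation: the function names and default values are hypothetical, and plain numbers stand in for the action values that DQN estimates with a convolutional network.

```python
def q_learning_target(reward, next_q_values, gamma=0.99, terminal=False):
    """One-step Q-learning target: r + gamma * max over next-state action values.

    next_q_values is a list of estimated values, one per action, for the
    next state. In DQN these would come from the network; here they are
    plain floats (an assumption made for illustration).
    """
    if terminal:
        # No future reward after the episode ends.
        return reward
    return reward + gamma * max(next_q_values)


def td_update(q_value, target, learning_rate=0.1):
    """Move the current estimate a fraction of the way toward the target."""
    return q_value + learning_rate * (target - q_value)


# Usage: reward 1.0, best next-state value 2.0, discount 0.5.
target = q_learning_target(1.0, [0.5, 2.0, -1.0], gamma=0.5)
updated = td_update(0.0, target, learning_rate=0.5)
```

With a neural network, the same target is typically used as a regression label: the network's prediction for the chosen action is pushed toward it by gradient descent instead of by the direct update shown in `td_update`.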

The significance was the generality: one learning system, given only pixels and a score to maximize, could learn to play very different games without game-specific engineering. This blend of deep neural networks with reinforcement learning became a core DeepMind theme and a direct stepping stone toward later milestones like AlphaGo. (A more famous, expanded version appeared in the journal Nature in 2015 under the title “Human-level control through deep reinforcement learning.”)

