In January 2025, DeepSeek released DeepSeek-R1. The arXiv paper, titled “DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning” and authored by DeepSeek-AI, was submitted on January 22, 2025.
The abstract argues that reasoning ability can be developed through pure reinforcement learning, without relying on human-annotated demonstrations. The authors report the “emergent development of advanced reasoning patterns, such as self-reflection, verification, and dynamic strategy adaptation,” and claim the model “achieves superior performance on verifiable tasks such as mathematics, coding competitions, and STEM fields” relative to counterparts trained with conventional supervised learning on human demonstrations. They also note that the reasoning patterns that emerge in large models can be used to improve the reasoning of smaller models.
DeepSeek-R1 mattered because it delivered strong reasoning performance as an openly available model from a Chinese lab, at training costs reported to be far lower than those assumed for leading Western systems. Its release prompted broad discussion about the economics of frontier AI, open versus closed models, and the global competitive landscape.