The AI Library

A knowledge base about artificial intelligence where every fact links back to its original source - the paper, the announcement, the talk. No articles about articles.

422 entries, all primary-sourced
Coming Soon: The Story of AI - a multi-part series (blog, podcast, and YouTube) walking through this entire timeline, openly produced with AI.

Milestones

The events that shaped AI, in order.

milestone September 1, 1843

Lovelace's Notes and the first published algorithm

Ada Lovelace's 1843 translation of Menabrea's memoir on the Analytical Engine came with her own Notes, which included the first published algorithm and the warning that the machine could only do what it was ordered to do.

milestone January 1, 1854

Boole's Laws of Thought

George Boole's 1854 book recast logic as algebra, giving the world the symbolic system that digital hardware would later implement directly.

milestone November 12, 1936

Turing's On Computable Numbers

Alan Turing's 1936 paper defined the abstract machine that bears his name and proved that no general procedure can decide every mathematical question.

milestone March 1, 1942

Asimov's Three Laws of Robotics

Isaac Asimov's 1942 story Runaround introduced the Three Laws of Robotics, the first widely known attempt to state safety rules for autonomous machines.

milestone January 1, 1943

The first mathematical neuron model

McCulloch and Pitts showed that simple all-or-none neurons wired into networks can compute any logical proposition.

milestone July 1, 1945

As We May Think

Vannevar Bush's 1945 essay imagined the memex, a personal machine that would store and cross-link a person's knowledge, anticipating hypertext and machine-augmented thought.

milestone January 1, 1948

Wiener publishes Cybernetics

Norbert Wiener's 1948 book Cybernetics unified feedback and control across machines and living things, supplying a vocabulary that shaped early thinking about intelligent systems.

milestone January 1, 1949

Neurons that fire together, wire together

Donald Hebb proposed that learning happens when connections between co-active neurons strengthen, giving neural networks a rule for learning.

milestone January 1, 1959

Arthur Samuel's checkers program

Arthur Samuel's self-improving checkers program popularized the term machine learning and showed computers could learn from experience.

milestone January 1, 1965

I.J. Good and the Intelligence Explosion

Mathematician I.J. Good's 1965 paper introduced the idea of an ultraintelligent machine that designs ever-better machines, producing an intelligence explosion.

milestone April 19, 1965

Gordon Moore predicts the doubling of components on a chip

Gordon Moore's 1965 magazine article observed that the number of components on an integrated circuit was doubling roughly every year, the forecast that became Moore's Law and the metronome of cheap computing.

milestone January 1, 1966

1966 ALPAC report ends a decade of machine-translation funding

The 1966 ALPAC report concluded there was no immediate or predictable prospect of useful machine translation and recommended cutting funding, collapsing US support for the field for roughly two decades - before the term 'AI winter' existed.

milestone January 1, 1966

ELIZA, the first famous chatbot

Joseph Weizenbaum's ELIZA was an early conversational program that mimicked a psychotherapist using simple pattern matching.

milestone January 1, 1971

SHRDLU understands English in a world of blocks

Terry Winograd's SHRDLU let a person hold a typed English conversation with a computer that moved blocks in a simulated world, a landmark in early natural language understanding.

milestone January 1, 1976

MYCIN, the medical expert system

MYCIN was a Stanford expert system that advised doctors on diagnosing and treating blood infections using a few hundred rules and a way of handling uncertainty.

milestone August 1, 1980

XCON/R1, the first big commercial expert system

R1, later called XCON, was a rule-based expert system that configured Digital Equipment Corporation's VAX computer orders and became one of the first expert systems to pay off in business.

milestone January 1, 1987

The second AI winter

In the late 1980s the commercial AI boom collapsed as the specialized Lisp-machine market crashed and expert systems failed to live up to their hype, ushering in a long downturn.

milestone January 1, 1992

TD-Gammon teaches itself backgammon

Gerald Tesauro's neural network learned to play backgammon at near-champion level by playing millions of games against itself using temporal-difference learning.

milestone October 8, 2005

Stanley wins the DARPA Grand Challenge

Stanford's self-driving car Stanley completed a 132-mile desert course to win the 2005 DARPA Grand Challenge, a turning point for autonomous vehicles.

milestone June 23, 2007

NVIDIA releases CUDA

NVIDIA introduced CUDA, a platform that let developers run general-purpose code on GPUs, providing the compute engine that later powered deep learning.

milestone July 19, 2007

Checkers is solved

After 18 years of computation, Jonathan Schaeffer's team proved that checkers is a draw with perfect play, the most complex game solved at that time.

milestone March 1, 2009

The unreasonable effectiveness of data

Three Google researchers argue that simple models trained on enormous amounts of data beat elaborate models trained on less, a data-first creed that predates the scaling laws.

milestone June 20, 2009

ImageNet, a large-scale image database

Fei-Fei Li and colleagues introduce ImageNet, a labeled image database that became the proving ground for modern computer vision.

milestone April 1, 2010

Kaggle launches

Kaggle turned machine learning into a competitive discipline, hosting public prediction contests that became both a benchmark and a talent pipeline for the field.

milestone February 16, 2011

IBM Watson wins Jeopardy!

IBM's Watson question-answering system beat champions Ken Jennings and Brad Rutter on Jeopardy!, showing computers could understand and answer natural-language questions.

milestone December 3, 2012

AlexNet wins ImageNet 2012

A deep convolutional neural network crushed the ImageNet contest, proving deep learning could outperform hand-built computer vision.

milestone December 19, 2013

Deep Q-Networks learn Atari from pixels

DeepMind trained a single neural network to play Atari video games directly from raw pixels, beating humans on several titles.

milestone January 26, 2014

Google acquires DeepMind

Google buys the London AI lab DeepMind in January 2014, placing a leading research lab inside a big-tech company.

milestone July 3, 2014

Bostrom's Superintelligence

Nick Bostrom's 2014 book Superintelligence brought existential risk from advanced AI to a wide audience and shaped how lab founders talk about safety.

milestone September 10, 2014

Sequence to sequence learning

Sutskever, Vinyals, and Le showed neural networks could map input sequences to output sequences, enabling end-to-end translation.

milestone November 9, 2015

Google open-sources TensorFlow

Google released TensorFlow, its internal machine-learning system, as free open-source software, putting an industrial-grade deep learning framework in everyone's hands.

milestone December 10, 2015

ResNet trains ultra-deep networks

Residual connections let neural networks grow to 152 layers and win ImageNet 2015, unlocking much deeper architectures.

milestone December 11, 2015

OpenAI is founded

OpenAI launched in December 2015 as a nonprofit AI research company with a $1 billion commitment and a mission to ensure artificial general intelligence benefits all of humanity.

milestone March 15, 2016

AlphaGo defeats Lee Sedol

DeepMind's AlphaGo beat Go world champion Lee Sedol 4-1, a feat experts had thought was a decade away.

milestone May 18, 2016

Google reveals the Tensor Processing Unit

Google disclosed the Tensor Processing Unit, a custom chip built specifically for neural networks that had already been running in its data centers for over a year, marking the start of the custom-silicon era for AI.

milestone January 19, 2017

PyTorch released by Facebook AI Research

Facebook AI Research released PyTorch, a Python-first, define-by-run deep learning framework that became the dominant tool for AI research.

milestone January 30, 2017

Libratus beats poker professionals

Carnegie Mellon's Libratus defeated four top professionals at heads-up no-limit poker, the first AI to master a major game of imperfect information.

milestone June 12, 2017

The Transformer is introduced

Google researchers publish 'Attention Is All You Need', introducing the Transformer architecture that underpins modern AI.

milestone December 5, 2017

AlphaZero masters chess, shogi and Go

DeepMind's AlphaZero taught itself chess, shogi and Go from scratch by self-play, reaching superhuman strength in each and defeating the best existing programs.

milestone May 16, 2018

OpenAI's AI and Compute analysis

OpenAI's 2018 analysis found that the compute used in the largest AI training runs had been doubling every 3.4 months since 2012, making the empirical case that raw computation drives AI capability.

milestone June 11, 2018

OpenAI's first GPT

OpenAI's first GPT showed that pre-training a Transformer on large amounts of unlabeled text and then fine-tuning it set a new bar across many language tasks.

milestone July 11, 2019

Pluribus masters six-player poker

Pluribus, built by Carnegie Mellon and Facebook AI, beat elite professionals at six-player no-limit poker, the first superhuman result in a multiplayer imperfect-information game.

milestone July 22, 2019

OpenAI LP and the Microsoft partnership

In 2019 OpenAI restructures as a capped-profit company and Microsoft invests 1 billion dollars, creating the financial structure behind much of the AI boom that followed.

milestone October 9, 2019

The Transformers library

Hugging Face's open-source Transformers library put state-of-the-art language models one pip install away, becoming the distribution layer of the LLM era.

milestone September 16, 2020

NumPy documented in Nature

The foundational scientific-computing library underpinning the Python data and AI stack was formally documented in a Nature paper, 'Array Programming with NumPy.'

milestone January 1, 2021

The Pile: An Openly Documented LLM Training Dataset

EleutherAI released The Pile, an 825 GiB English text dataset built from 22 documented sources, making the contents of a large language model training corpus openly inspectable.

milestone January 5, 2021

OpenAI introduces DALL-E

OpenAI unveils the original DALL-E, a 12-billion-parameter version of GPT-3 that generates images from natural-language text descriptions - the start of the generative image era.

milestone January 5, 2021

OpenAI releases CLIP

CLIP learns visual concepts directly from natural-language supervision, becoming the bridge that made modern multimodal AI possible.

milestone March 1, 2021

On the Dangers of Stochastic Parrots

A 2021 FAccT paper by Bender, Gebru, McMillan-Major, and Mitchell that became one of the most-cited critiques of the large-language-model paradigm.

milestone May 28, 2021

Anthropic founded as an AI safety company

Anthropic is founded by siblings Dario and Daniela Amodei and colleagues as an AI safety and research company, surfacing publicly with a 124 million dollar Series A.

milestone January 27, 2022

InstructGPT brings RLHF to GPT-3

OpenAI's InstructGPT used reinforcement learning from human feedback to make GPT-3 follow instructions, the direct precursor of ChatGPT.

milestone March 29, 2022

DeepMind's Chinchilla scaling result

DeepMind's Chinchilla paper showed that most large language models were undertrained and that model size and training data should grow together for compute-optimal results.

milestone April 13, 2022

OpenAI announces DALL-E 2

OpenAI unveils DALL-E 2, a system that generates and edits high-resolution images from natural-language text prompts.

milestone July 12, 2022

Midjourney opens public beta

Midjourney, an independent research lab's text-to-image generator run through Discord, opened to the public in July 2022 and became one of the dominant image AI tools.

milestone November 30, 2022

OpenAI launches ChatGPT

OpenAI releases ChatGPT, a conversational AI that reaches mass adoption and ignites the modern generative-AI boom.

milestone February 24, 2023

Meta releases LLaMA

Meta releases LLaMA, a family of foundational large language models, accelerating the open-weights movement.

milestone March 14, 2023

Anthropic introduces Claude

Anthropic launches Claude, a next-generation AI assistant built around helpful, honest, and harmless training.

milestone March 14, 2023

OpenAI releases GPT-4

OpenAI releases GPT-4, a multimodal model that scored around the top 10% of test takers on a simulated bar exam, marking a major capability jump.

milestone March 22, 2023

The Pause Giant AI Experiments Letter

A March 2023 open letter from the Future of Life Institute called on all AI labs to pause training systems more powerful than GPT-4 for at least six months.

milestone May 30, 2023

The Statement on AI Risk

A one-sentence May 2023 statement from the Center for AI Safety put extinction risk from AI alongside pandemics and nuclear war, signed by leading researchers and lab CEOs.

milestone November 1, 2023

The Bletchley Declaration

At the first global AI Safety Summit in November 2023, 28 countries and the European Union signed the Bletchley Declaration on managing frontier AI risk.

milestone December 6, 2023

Google launches Gemini

Google launches Gemini, a natively multimodal model family built by Google DeepMind in collaboration with teams across Google, in three sizes - Ultra, Pro, and Nano.

milestone December 27, 2023

The New York Times Sues OpenAI and Microsoft

The New York Times filed a copyright lawsuit against OpenAI and Microsoft in federal court over the use of its articles in training AI systems, putting the training-data question before the courts.

milestone March 4, 2024

Anthropic releases the Claude 3 model family

Anthropic launches Claude 3 in three models - Opus, Sonnet, and Haiku - which it says set new industry benchmarks across cognitive tasks and add vision capabilities.

milestone May 13, 2024

OpenAI releases GPT-4o

OpenAI releases GPT-4o, an 'omni' model trained end-to-end across text, vision, and audio that can hold real-time spoken conversations.

milestone July 12, 2024

The EU Adopts the AI Act (2024)

The European Union adopts Regulation (EU) 2024/1689, the world's first comprehensive law governing artificial intelligence through a risk-based framework.

milestone January 21, 2025

The Stargate Project

OpenAI, SoftBank, Oracle, and MGX announce Stargate, a new company intending to invest 500 billion dollars over four years in AI infrastructure in the United States.

milestone January 22, 2025

DeepSeek releases DeepSeek-R1

DeepSeek releases DeepSeek-R1, an open reasoning model trained largely via reinforcement learning, intensifying global LLM competition.

milestone May 22, 2025

Anthropic releases Claude Opus 4 and Sonnet 4

Anthropic releases Claude Opus 4 and Sonnet 4, models built to sustain focused work over thousands of steps, marking the shift to long-running agentic coding.

milestone August 7, 2025

OpenAI releases GPT-5

OpenAI releases GPT-5 as a unified system that automatically routes between fast answers and deeper reasoning, becoming the default model for ChatGPT users.

milestone November 18, 2025

Google releases Gemini 3

Google releases Gemini 3, its most capable model, launched across all its products on day one and topping major reasoning and multimodal benchmarks.

Cautionary Tales and Curiosities

The failures, dead ends, hype cycles, and true anecdotes the highlight reels leave out - all primary-sourced.

story April 1, 1836

The Mechanical Turk: the original AI fraud

A 1770 chess 'automaton' toured Europe and America for decades fooling audiences into believing a machine could play chess; in fact a human chess master was hidden inside the cabinet, and Edgar Allan Poe reasoned his way to that conclusion in 1836.

story January 1, 1950

The Three Laws of Robotics, and how the stories actually go

Isaac Asimov's Three Laws of Robotics became the public's default frame for robot ethics, but the stories that introduced them are mostly about the Laws producing strange, harmful, or paradoxical behavior - the failures are the point.

story August 31, 1955

The two-month, ten-man plan to crack AI

The 1955 proposal for the Dartmouth summer project, which named the field 'artificial intelligence', also estimated that a 2-month, 10-man study in the summer of 1956 could make significant progress on it.

story July 7, 1966

Computer vision as a summer project

In 1966 MIT assigned 'the construction of a significant part of a visual system' as a summer project for students; making computers see well took roughly fifty more years.

story January 1, 1968

HAL 9000 and the public image of AI

The murderous shipboard computer HAL 9000 in Arthur C. Clarke's 1968 novel '2001: A Space Odyssey' became one of the most durable popular images of artificial intelligence, fixing the idea of a calm, capable machine that turns on its makers.

story January 1, 1969

The 17-year neural network freeze

Minsky and Papert's 1969 book Perceptrons exposed what single-layer networks could not do, helping freeze neural-network research and funding until backpropagation revived the field in 1986.

story January 1, 1976

Weizenbaum's secretary and the ELIZA effect

Joseph Weizenbaum built ELIZA to show how shallow a chatbot was, then watched people - including his own secretary - treat it as if it understood them, an experience that turned him into a critic of AI.

story January 1, 1980

The rise and fall of the Lisp machine industry

Around 1980 the MIT AI Lab spun out two companies, Symbolics and LMI, to sell purpose-built Lisp computers for AI work; both collapsed within a decade as cheap general-purpose workstations made specialized hardware obsolete.

story May 1, 1983

The expert systems boom and bust

In the early 1980s 'knowledge engineering' was promoted as the future of AI, sparking a wave of expert-systems companies; by the end of the decade the market for specialized expert-system hardware and tools had collapsed.

story January 1, 1995

'The vodka is good but the meat is rotten' never happened

The famous tale of a machine translating 'the spirit is willing but the flesh is weak' into 'the vodka is good but the meat is rotten' is an undocumented legend; the historian John Hutchins traced it and found no original occurrence.

story November 1, 1995

Cyc: forty years of hand-coding common sense

Doug Lenat's Cyc project has been hand-encoding human common-sense knowledge since 1984; decades and a person-century of effort later, it never became the general intelligence the approach promised.

story June 8, 2014

The chatbot that 'passed the Turing test'

In 2014 the University of Reading announced that a chatbot posing as a 13-year-old boy had passed the Turing test for the first time; the claim said more about the rules chosen than about machine intelligence.

story March 10, 2016

Move 37 and the move that answered it

In AlphaGo's 2016 match against Lee Sedol, the machine played a move so unlikely commentators thought it was a mistake; two games later Lee answered with a brilliancy of his own, and both moves became legend.

story April 12, 2016

When bots were going to be the new apps

In 2016 Facebook opened its Messenger Platform to bots and declared conversation the next computing interface; the bot wave fizzled and Facebook's own M assistant was wound down by 2018.

story October 19, 2016

Full self-driving, next year, every year

Tesla has repeatedly told the public that full self-driving was nearly ready, starting with its 2016 claim that every car it built already had the hardware for it - a milestone that kept moving.

story October 26, 2017

Capsule networks: a celebrated idea that stayed quiet

In 2017 Geoffrey Hinton and colleagues published capsule networks as a richer alternative to convolutional neural networks; the paper drew wide attention but capsules never displaced CNNs in practice.

story February 14, 2019

The model that was 'too dangerous to release'

In 2019 OpenAI declined to release the full GPT-2 model, citing fears of malicious use, then released it in stages over nine months after the feared harms did not materialize.

story June 11, 2022

The engineer who said LaMDA was sentient

In 2022 a Google engineer publicly argued that the LaMDA chatbot was sentient and published an edited interview as evidence; Google disagreed, said its review found no support, and later dismissed him.

story August 14, 2023

Books3 and the Data Reckoning

A rights-holder group had the Books3 dataset, roughly 200,000 pirated e-books used to train AI models, taken offline, exposing how unlicensed material had been baked into training corpora.

story November 17, 2023

The OpenAI board crisis

OpenAI's board fired CEO Sam Altman on a Friday in November 2023; within days he was reinstated and most of the board that fired him was gone.

story February 23, 2024

Gemini image generation gets history wrong

In February 2024 Google paused Gemini's ability to generate images of people after it produced historically inaccurate results, and published an explanation of what went wrong.

story March 12, 2024

Devin and the AI-agent hype

Cognition's March 2024 launch of Devin, marketed as 'the first AI software engineer,' ignited a wave of autonomous-agent hype built around its own performance claims.

story March 19, 2024

The Inflection acqui-hire

Inflection AI raised 1.3 billion dollars for its Pi assistant, then in March 2024 its founders and many staff moved to Microsoft and the consumer product was set aside - the era's defining acqui-hire.

story February 18, 2025

The 2024 AI gadget wave that flopped

In 2024 standalone AI devices like the Humane Ai Pin launched to enormous hype; the Ai Pin was discontinued within a year and Humane's assets were sold to HP in February 2025.

Concepts

Plain-language explanations of the ideas behind modern AI.

concept

AI Agents

AI systems that use a language model to plan, take actions through tools, observe results, and iterate toward a goal.

concept

AI Alignment

The effort to ensure AI systems pursue the goals and values their designers and users actually intend, and avoid harmful behavior.

concept

AI Compute

Compute, the total amount of calculation spent training a model, has become the field's key input, growing far faster than Moore's Law and forecastable enough to plan capability around.

concept

AI Regulation

The emerging body of laws, executive actions, and international declarations that governments use to govern the development and use of artificial intelligence.

concept

AI winter

A period when disappointment in artificial intelligence causes a sharp drop in funding, interest, and progress, following a stretch of inflated expectations.

concept

Algorithmic bias

The tendency of machine-learning systems to encode and amplify biases present in their training data, producing systematically unequal outcomes across groups.

concept

Artificial General Intelligence (AGI)

The idea of an AI system with broad, human-level competence across most tasks - a goal that is real but deliberately ill-defined and contested.

concept

Attention Mechanism

A technique that lets a model focus on the most relevant parts of its input when producing each piece of output.

concept

Backpropagation

The learning algorithm that efficiently trains multi-layer neural networks by propagating error signals backward through the network.

concept

Benchmark Contamination

When test questions leak into a model's training data, inflating its benchmark scores and making them an unreliable measure of true ability.

concept

Chain-of-Thought Prompting

Prompting a model to spell out intermediate reasoning steps, which improves its accuracy on complex, multi-step problems.

concept

Computer Vision

The field of getting computers to interpret images and video, which advanced from hand-crafted features through the data-driven ImageNet era to today's multimodal models.

concept

Constitutional AI

Anthropic's method for training a model to be harmless by having it critique and revise its own outputs against a written set of principles.

concept

Context Window

The maximum amount of text - measured in tokens - that a model can take in and work with at one time.

concept

Convolutional Neural Network

A neural network designed for images that detects visual features regardless of their position, foundational to modern computer vision.

concept

Cybernetics

The study of control and communication through feedback in machines and living things, an interdisciplinary framework that preceded and shaped early artificial intelligence.

concept

Deep Learning

Machine learning with many-layered neural networks that learn features directly from data, the approach behind the modern AI boom.

concept

Deep learning frameworks

Software libraries like TensorFlow and PyTorch that handle automatic differentiation and GPU computation, making it practical to build and train neural networks.

concept

Deepfakes

Synthetic media in which a person's likeness or voice is generated or swapped using deep-learning techniques, along with the detection research built to identify such media.

concept

Diffusion Models

A generative method that learns to create images by reversing a step-by-step noising process, powering tools like DALL-E 2 and Stable Diffusion.

concept

Embeddings

Representations that turn words, items, or data into lists of numbers so that similar things sit close together.

concept

Expert system

A type of AI program that captures a human expert's knowledge as a set of if-then rules and applies them to give advice or make decisions in a narrow domain.

concept

Fine-tuning

Adapting a pre-trained model to a specific task or behavior by training it further on a smaller, targeted dataset.

concept

Generative Adversarial Network (GAN)

A pair of neural networks that compete, one generating fake data and one judging it, producing strikingly realistic synthetic images and media.

concept

GPU Computing

Graphics chips turned out to be ideal neural network engines because both rendering and neural networks are built on massively parallel matrix math, and programmable GPUs made that power available for AI.

concept

Gradient Descent

An optimization method that repeatedly nudges a model's settings in the direction that most reduces its error.

concept

Hallucination

When an AI language model produces fluent, confident text that is factually wrong or unsupported by its inputs.

concept

In-Context Learning

A model's ability to learn a new task from examples placed in the prompt, without any change to its trained weights.

concept

Interpretability

Research aimed at understanding what is actually happening inside an AI model, so its behavior can be explained, trusted, and corrected.

concept

Large Language Model

An AI system trained on vast text to predict and generate language, able to perform many tasks from a single model.

concept

Long Short-Term Memory (LSTM)

A recurrent neural network design that remembers information over long sequences, enabling early advances in speech and language processing.

concept

Loss Function

A formula that scores how wrong a model's predictions are, giving training a single number to minimize.

concept

Machine Learning

A field where computer programs improve at a task by learning patterns from data and experience rather than being explicitly programmed.

concept

Mixture of Experts (MoE)

An architecture that routes each input to a few specialized sub-networks, growing total capacity without a proportional increase in per-query cost.

concept

Model Context Protocol (MCP)

An open standard from Anthropic for connecting AI applications to external data sources, tools, and workflows in a uniform way.

concept

Model Distillation

Training a smaller, cheaper model to mimic a larger one, transferring much of its capability at a fraction of the cost.

concept

Multimodality

AI systems that work across more than one type of data - for example understanding images and text together rather than text alone.

concept

Natural Language Processing

The field of getting computers to work with human language, which evolved from hand-written rules to statistics to neural networks and finally to large language models.

concept

Neural Network

A computing model built from layers of simple interconnected units that adjust their connections to learn patterns from data.

concept

Open Weights vs Closed Models

The distinction between models whose trained parameters you can download and run yourself and those accessible only through a vendor's API - and why 'open source AI' is contested.

concept

Overfitting

When a model memorizes its training data, including noise, and fails to perform well on new, unseen data.

concept

Perceptron

The first trainable artificial neuron, introduced by Frank Rosenblatt in 1958, that learns to classify patterns from examples.

concept

Pre-training

Training a model on huge amounts of general text first, so it learns broad language skills before any task-specific tuning.

concept

Prompt Engineering

The practice of crafting the instructions and examples you give a model to steer it toward better, more reliable outputs.

concept

Quantization

Compressing a model by storing its numbers at lower precision, so it uses less memory and runs faster with little loss in quality.

concept

Reasoning Models

Models trained to spend extra computation thinking step by step before answering, trading speed for accuracy on hard problems.

concept

Reinforcement Learning

Training an agent to make decisions by rewarding good actions and penalizing bad ones through trial and error.

concept

Robotics

Robotics is the field of building machines that sense, plan, and act in the physical world, and it has historically lagged behind disembodied AI because moving atoms is harder than moving bits.

concept

Scaling Laws

The empirical finding that model performance improves predictably as you increase model size, training data, and computation.

concept

Supervised Learning

Training a model on labeled examples so it learns to predict the correct output for new, unseen inputs.

concept

Support Vector Machine

A classification method that finds the widest-margin boundary between two classes and uses kernels to draw curved boundaries, one of the leading machine learning methods before deep learning.

concept

Symbolic AI (GOFAI)

The approach to AI that represents knowledge as symbols and produces intelligence by manipulating and searching through those symbols, the dominant paradigm from the 1950s into the 1980s.

concept

The Turing Test

Alan Turing's proposal to replace the question 'can machines think?' with a practical test of whether a machine's conversation is indistinguishable from a human's.

concept

Tokenization

The process of breaking text into small units (tokens) that a model can read, often using subword pieces to handle any word.

concept

Training Data

The collection of examples a machine learning model learns from, whose quality and coverage largely determine model performance.

concept

Training Data Pipeline

The provenance chain that turns raw web crawls into a model's training corpus: collection, filtering, deduplication, and weighted mixing of diverse sources.

concept

Transformer

The neural network architecture, built entirely on attention, that powers modern large language models.

concept

Unsupervised Learning

Finding structure, groupings, or patterns in data that has no labels or predefined correct answers.

concept May 8, 2024

AI for science

A pattern in which learned models replace or accelerate physical simulation and brute-force search across biology, weather, materials and mathematics.

concept January 21, 2025

Frontier lab economics

How frontier AI gets financed: capped-profit structures, big-tech compute partnerships, mega-scale infrastructure ventures, and acqui-hires, anchored to the labs' own governance and deal documents.

Landmark Papers

What the papers actually said - linked to the originals.

paper October 1, 1950

Computing Machinery and Intelligence

Alan Turing's 1950 paper that asked whether machines can think and replaced the question with a practical test - the imitation game, now called the Turing test.

paper June 10, 2014

Generative Adversarial Networks

The 2014 paper by Goodfellow and colleagues that pitted two neural networks against each other to generate realistic images, launching the GAN era.

paper June 12, 2017

Attention Is All You Need

The 2017 Google paper that introduced the Transformer architecture, the foundation of virtually all modern large language models.

People

The researchers and builders behind the breakthroughs.

person

Ada Lovelace

Nineteenth-century mathematician whose 1843 Notes on the Analytical Engine contained the first published algorithm and an early statement of what machines can and cannot do.

person

Alan Turing

British mathematician who proposed the 1950 imitation game (Turing test) and asked whether machines can think.

person

Allen Newell

American researcher who co-founded artificial intelligence, co-built the Logic Theorist, and proposed the physical symbol system hypothesis with Herbert Simon.

person

Andrej Karpathy

AI researcher and educator, a founding member of OpenAI and former Director of AI at Tesla.

person

Andrew Ng

Machine learning pioneer and AI educator; founder of DeepLearning.AI, founding lead of Google Brain, and adjunct professor at Stanford.

person

Arthur Samuel

IBM researcher who built one of the first self-improving game programs and is credited with coining the term 'machine learning' in his 1959 checkers paper.

person

Charles Babbage

Nineteenth-century English mathematician and inventor who designed the Difference Engine and the programmable Analytical Engine, the conceptual ancestor of the computer.

person

Claude Shannon

Founder of information theory whose 1950 paper on computer chess was among the first serious proposals for machine intelligence.

person

Dario Amodei

AI researcher and Anthropic co-founder and CEO, previously a research leader at OpenAI during the GPT-2 and GPT-3 era.

person

Demis Hassabis

Co-founder and CEO of DeepMind, behind AlphaGo and AlphaFold, and a 2024 Nobel laureate in Chemistry.

person

Emily M. Bender

Computational linguist at the University of Washington, lead author of the Stochastic Parrots paper and a prominent skeptic of large-language-model hype.

person

Fei-Fei Li

Stanford professor who created ImageNet and co-directs the Stanford Human-Centered AI Institute.

person

Frank Rosenblatt

American psychologist who built the perceptron, an early trainable neural network, at Cornell Aeronautical Laboratory.

person

Geoffrey Hinton

Neural network pioneer at the University of Toronto and 2024 Nobel laureate in Physics for work enabling machine learning.

person

Grant Sanderson

Math educator and creator of 3Blue1Brown, known for visual explanations of mathematics and deep learning.

person

Herbert Simon

American polymath who co-founded artificial intelligence, won the Turing Award and the Nobel Prize in Economics, and is known for bounded rationality and satisficing.

person

Ian Goodfellow

Machine learning researcher who invented generative adversarial networks and co-authored the Deep Learning textbook.

person

Ilya Sutskever

Deep learning researcher, AlexNet and seq2seq co-author, OpenAI co-founder, and founder of Safe Superintelligence Inc.

person

Jensen Huang

Co-founder, president and CEO of NVIDIA, the company whose GPUs and CUDA software became the hardware engine of the deep learning revolution.

person

John Hopfield

Physicist who invented the Hopfield network in 1982 and shared the 2024 Nobel Prize in Physics for foundational work enabling machine learning.

person

John Jumper

Scientist who led development of AlphaFold and shared the 2024 Nobel Prize in Chemistry for AI-driven protein structure prediction.

person

John McCarthy

Computer scientist who coined the term artificial intelligence, co-proposed the 1956 Dartmouth project, and invented the Lisp programming language.

person

Joseph Weizenbaum

MIT computer scientist who created the ELIZA chatbot and then became one of AI's most prominent ethical critics.

person

Judea Pearl

Computer scientist who invented Bayesian networks and built the modern mathematics of causality; winner of the 2011 ACM Turing Award.

person

Juergen Schmidhuber

AI researcher whose lab co-developed the LSTM recurrent neural network and pioneered many deep learning ideas; director of the KAUST AI Initiative and IDSIA.

person

Liang Wenfeng

Founder of the Chinese AI lab DeepSeek and of the quantitative hedge fund High-Flyer; he is the submitting and corresponding author behind DeepSeek's headline research papers.

person

Marvin Minsky

AI pioneer at MIT, co-originator of the 1956 Dartmouth project that named the field of artificial intelligence.

person

Mira Murati

OpenAI's chief technology officer through the ChatGPT era, who in 2025 founded the AI research and product company Thinking Machines Lab.

person

Mustafa Suleyman

A co-founder of DeepMind and of Inflection AI who in 2024 became EVP and CEO of Microsoft AI, leading the company's consumer AI products including Copilot.

person

Nick Bostrom

Philosopher and founding director of Oxford's Future of Humanity Institute whose 2014 book Superintelligence shaped the debate on existential risk from AI.

person

Noam Shazeer

A co-author of the Transformer paper and a pioneer of the mixture-of-experts approach, who co-founded Character.AI and then returned to Google to help lead its Gemini models.

person

Norbert Wiener

American mathematician who founded cybernetics, the study of control and communication through feedback in machines and living things.

person

Richard Sutton

A founder of modern reinforcement learning and author of the influential 2019 essay 'The Bitter Lesson' on the primacy of computation in AI.

person

Rodney Brooks

Robotics pioneer behind behavior-based robotics, former director of the MIT AI Lab and CSAIL, co-founder of iRobot and Robust.AI, known for his annually graded AI prediction scorecards.

person

Sam Altman

Technology executive who was named a co-chair of OpenAI, the research organization behind ChatGPT, at its founding.

person

Sebastian Thrun

Sebastian Thrun led Stanford's winning DARPA Grand Challenge team, founded Google's self-driving car project, and co-founded the online-education company Udacity.

person

Stuart Russell

Berkeley computer scientist, co-author of the standard AI textbook AIMA, and a leading advocate for rethinking AI around human control.

person

Timnit Gebru

Computer scientist known for the Gender Shades and Stochastic Parrots papers and founder of the Distributed AI Research Institute (DAIR).

person

Vladimir Vapnik

Co-creator of statistical learning theory, the VC dimension, and support vector machines, the methods that dominated machine learning before deep learning.

person

Yann LeCun

Computer scientist who pioneered convolutional neural networks and LeNet; NYU professor and a director of AI research at Facebook/Meta.

person

Yoshua Bengio

Deep learning pioneer at the Université de Montréal and founder of Mila, the Quebec AI institute.

Talks

Firsthand talks and lectures worth your time.

talk

AlphaGo - The Movie

The full documentary on DeepMind's AlphaGo and its historic 2016 match against Go champion Lee Sedol.

talk

Intro to Large Language Models

A one-hour, general-audience tour of how large language models are trained, what they can do, and where their risks lie.

talk

Opportunities in AI

Andrew Ng's Stanford talk on where the real opportunities in AI are and how to build with them.

talk

Software Is Changing (Again)

Karpathy's argument that LLMs are a new kind of computer programmed in English, ushering in Software 3.0.

Organizations

The labs and companies driving the field.

organization

Alibaba Qwen team

The Qwen team at Alibaba Cloud, which develops the widely used open Qwen family of language, multimodal, and coding models.

organization

Anthropic

AI safety and research company, maker of the Claude AI models, founded by former OpenAI researchers including Dario and Daniela Amodei.

organization

Baidu

Chinese technology company that built ERNIE Bot, China's first mover in the ChatGPT wave of generative AI chatbots.

organization

Common Crawl

The nonprofit that has crawled the open web since 2008 and gives away the resulting corpus, which underlies much of the text used to train large language models.

organization

DeepMind (Google DeepMind)

AI research lab founded in 2010, acquired by Google in 2014, known for AlphaGo and AlphaFold, now part of Google DeepMind.

organization

DeepSeek

Chinese AI lab behind the open DeepSeek model lines, known for reasoning models trained with reinforcement learning.

organization

Google Brain

Google's deep learning research team, started in 2011, merged with DeepMind in 2023 to form Google DeepMind.

organization

Hugging Face

The open AI platform and community hub for sharing machine-learning models, datasets, and applications.

organization

IBM

The company behind Deep Blue and Watson, now building open enterprise AI through its Granite models and watsonx platform.

organization

LAION

The volunteer-driven nonprofit that released the open image-text datasets, including LAION-5B, used to train Stable Diffusion and other generative image models.

organization

Meta AI

Meta's AI division, heir to its FAIR research lab and developer of the open-weight Llama models.

organization

Microsoft

The software giant whose AI arc runs from the 2016 Tay chatbot through its multibillion-dollar OpenAI partnership to the Copilot product line.

organization

Mistral AI

French AI lab building frontier models, known for open-weight releases and enterprise deployment flexibility.

organization

NVIDIA

The hardware and software company whose GPUs and CUDA platform power most modern AI training and inference.

organization

OpenAI

AI research and deployment company founded in December 2015, creator of the GPT model series and ChatGPT.

organization

Stability AI

The company behind Stable Diffusion, the open text-to-image model that opened generative imaging to the public; the company went through a turbulent leadership change in 2024.

organization

xAI

Elon Musk's AI company, founded in 2023, maker of the Grok models with a mission to understand the universe.

organization January 1, 1992

Boston Dynamics

Boston Dynamics is a robotics company spun out of MIT in 1992, known for three decades of legged robots and viral videos of machines that walk, run, and do backflips.

organization January 1, 2014

Allen Institute for AI (Ai2)

The Seattle non-profit founded by Paul Allen, known for Semantic Scholar and the fully open OLMo language models.

organization December 13, 2016

Waymo

Waymo is Alphabet's autonomous-driving company, spun out of the Google self-driving car project that began in 2009.

organization July 1, 2020

EleutherAI

The volunteer collective that grew from a Discord server into a non-profit lab and replicated GPT-class models in the open, including GPT-Neo, GPT-J, and Pythia.

Model Families

The major AI model families, from the developers' own pages.

model

Claude (Anthropic model family)

Anthropic's family of Claude assistant models, organized into Opus, Sonnet, and Haiku tiers and delivered via API and apps.

model

GPT (OpenAI model family)

OpenAI's family of general-purpose generative pre-trained transformer models, delivered mainly through the OpenAI API and ChatGPT.

model

Grok (xAI model family)

xAI's family of Grok models, the AI assistants developed by Elon Musk's xAI and integrated with the X platform.

model

Mistral (Mistral AI model family)

French lab Mistral AI's models, mixing open-weight releases with commercial API models across general, code, and audio tasks.

model

Qwen (Alibaba model family)

Alibaba's Qwen family of language, vision, and image models, many released publicly through Hugging Face and GitHub.

Benchmarks

How AI is measured - each tied to the paper or site that defined it.

benchmark

GSM8K (Grade School Math 8K)

A dataset of about 8,500 grade-school math word problems that tests a model's multi-step arithmetic reasoning.

benchmark

HumanEval

A 164-problem test that checks whether a model can write working Python code from a natural-language description, graded by unit tests.

benchmark

Humanity's Last Exam (HLE)

A 2,500-question expert-level exam across many subjects, built to stay hard for frontier AI as easier benchmarks get saturated.

benchmark

LMArena (Chatbot Arena)

A live leaderboard that ranks AI chatbots by anonymous head-to-head human preference votes.

benchmark

MATH (Competition Mathematics Dataset)

A dataset of 12,500 challenging competition math problems, each with a full worked solution, used to measure mathematical problem-solving in AI.

benchmark

MLPerf

An industry benchmark suite from MLCommons that measures how fast computing systems can train and run AI models.

benchmark

SWE-bench

A benchmark that tests whether AI systems can resolve real GitHub issues by editing real codebases, graded by the projects' own tests.

Facts

Atomic, verifiable facts - every one tied to a primary source.

fact April 1, 1836

The Mechanical Turk hid a human chess player inside

The 18th-century chess 'automaton' known as the Mechanical Turk was operated by a human chess player concealed in its cabinet; Edgar Allan Poe argued in 1836 that 'some person is concealed in the box during the whole time of exhibiting the interior.'

fact January 1, 1950

Asimov's Three Laws of Robotics appear in 'I, Robot'

Isaac Asimov's Three Laws of Robotics are stated in his 1950 collection 'I, Robot,' attributed in-story to a fictional Handbook of Robotics, and the stories repeatedly show the Laws producing paradoxical or harmful behavior.

fact

AlexNet was trained on two consumer GPUs

The 2012 AlexNet network that launched the deep learning era was trained on two NVIDIA GTX 580 GPUs, hardware also sold to gamers, using an efficient GPU implementation of convolution.

fact

Chinchilla: double the model, double the data

DeepMind's Chinchilla paper found that for compute-optimal training, model size and training tokens should scale equally - each doubling of model size needs a doubling of data.
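
The arithmetic behind that rule can be sketched as follows, assuming the common approximation that training compute is about 6 x parameters x tokens (the constant 6, the function names, and the example sizes are assumptions for illustration, not figures from this entry). If model size and data grow by the same factor, each grows with the square root of the compute budget, so four times the compute buys a model twice as large trained on twice the data.

```python
import math


def training_flops(n_params, n_tokens):
    """Approximate training compute, assuming FLOPs ~= 6 * N * D."""
    return 6 * n_params * n_tokens


def equal_scaling_factor(compute_multiplier):
    """Factor by which both model size and tokens grow when compute is
    multiplied by `compute_multiplier` and the two are scaled equally."""
    return math.sqrt(compute_multiplier)


# Hypothetical baseline: 1 billion parameters trained on 20 billion tokens.
base_params, base_tokens = 1e9, 20e9
base_flops = training_flops(base_params, base_tokens)

# With 4x the compute, scale both model and data by the same factor.
factor = equal_scaling_factor(4)
new_flops = training_flops(base_params * factor, base_tokens * factor)

print(factor)                  # 2.0
print(new_flops / base_flops)  # 4.0
```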

fact

Watson won $77,147 on Jeopardy!

IBM Watson finished its 2011 Jeopardy! match with $77,147, well ahead of human champions Ken Jennings and Brad Rutter; the winnings went to charity.

fact March 11, 2019

OpenAI LP capped first-round returns at 100x

When OpenAI announced its capped-profit structure in March 2019, it said returns for first-round investors were capped at 100 times their investment, with any excess owned by the nonprofit.

fact November 22, 2023

Sam Altman was fired and back within days

OpenAI's board removed Sam Altman as CEO on November 17, 2023, and OpenAI announced his return on November 22, 2023 - a span of less than a week.

fact January 21, 2025

Stargate intends to invest 500 billion dollars

The Stargate Project, announced January 21, 2025, is a new company that intends to invest 500 billion dollars over four years in US AI infrastructure, beginning with 100 billion deployed immediately.

fact July 21, 2025

AI scored 35 of 42 for IMO gold in 2025

At the 2025 International Mathematical Olympiad, Google DeepMind's Gemini with Deep Think solved five of six problems for 35 of 42 points - a certified gold-medal score.