Deep Dive into LLMs like ChatGPT

This is Andrej Karpathy’s long-form successor to his 2023 “Intro to Large Language Models” talk, posted to his own channel. Where the earlier video was an hour, this one runs about three and a half hours and is described by Karpathy as a general-audience deep dive into the full training stack behind ChatGPT-style models.

The talk follows a model from raw text to a finished assistant: pretraining data and tokenization, the neural network that predicts the next token, the supervised finetuning that turns a base model into a chatbot, and the reinforcement learning stage that shapes its behavior and reasoning. Along the way Karpathy builds intuition for why these systems hallucinate, where their reasoning ability comes from, and how to think about their “psychology” when using them as tools.

For a practitioner or an engaged decision-maker willing to invest the time, this is one of the most thorough firsthand explanations available. It is taught by someone who has built and trained large neural networks at OpenAI and Tesla, and it ties the headline behaviors of chat assistants back to concrete steps in how they are made.
