Literate Programming

Literate programming is a methodology introduced by Donald Knuth in his paper “Literate Programming,” published in The Computer Journal in May 1984 (Volume 27, Issue 2, pages 97-111). Its central conviction is that a program should be treated as a piece of literature, addressed to human beings rather than to a computer. Knuth describes the approach as one that “combines a programming language with a documentation language, thereby making programs more robust, more portable, more easily maintained.”

In a literate program the author does not write source code with comments bolted on. Instead the author writes an explanation, in natural language, of what the program does and why, and embeds fragments of code at the points in the narrative where they are introduced. The pieces of code may be presented in whatever order best explains the design to a reader, even if that differs sharply from the order a compiler requires. Each fragment is given a name, and named fragments can be referenced inside other fragments, letting the program be assembled from human-sized chunks of thought.
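The way named fragments nest inside one another can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `<<name>>` reference notation, the `expand` function, and the sample fragment names are hypothetical and are not the notation of Knuth's own tool.

```python
import re

# Hypothetical reference syntax: a line consisting only of <<fragment name>>.
REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")


def expand(name, fragments, _stack=()):
    """Return the lines of fragment `name` with every reference expanded."""
    if name in _stack:
        chain = " -> ".join(_stack + (name,))
        raise ValueError(f"cyclic reference: {chain}")
    if name not in fragments:
        raise KeyError(f"undefined fragment: {name}")
    out = []
    for line in fragments[name].splitlines():
        match = REF.match(line)
        if match:
            indent, ref = match.groups()
            # Splice in the referenced fragment, keeping the reference's indentation.
            nested = expand(ref, fragments, _stack + (name,))
            out.extend(indent + nested_line for nested_line in nested)
        else:
            out.append(line)
    return out


# Fragments listed in the order a narrative might introduce them,
# which is not the order the interpreter needs.
fragments = {
    "compute the mean": "mean = sum(values) / len(values)",
    "report the result": "print(f'mean = {mean}')",
    "main program": (
        "values = [2, 4, 9]\n"
        "<<compute the mean>>\n"
        "<<report the result>>"
    ),
}

program = "\n".join(expand("main program", fragments))
```

Here `program` holds three lines of ordinary Python in executable order, even though the dictionary defines the details before the fragment that uses them.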

To make this practical, two automatic processes extract the two faces of the document. One, traditionally called “weave,” produces a typeset version: a publishable, cross-indexed document containing the author’s prose, prettyprinted code, and an index of identifiers. The other, called “tangle,” rearranges the same fragments into the order the compiler needs and emits a plain source file ready to build. Because both outputs derive from one master file, the printed documentation always shows exactly the code that is compiled, and there is only ever one source of truth.
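As a rough illustration of how one master file can yield both outputs, the following Python sketch parses a tiny document and derives a woven and a tangled view from it. It rests on several assumptions: the chunk notation (`<<name>>=` to open a code chunk, `@` to close it) is loosely modeled on the later noweb tool and is not the syntax of Knuth's original system, the names `MASTER`, `parse`, `weave`, and `tangle` are hypothetical, and the "woven" output is a plain-text listing standing in for real typesetting and indexing.

```python
import re

DEF = re.compile(r"^<<(.+)>>=\s*$")       # opens a named code chunk
REF = re.compile(r"^(\s*)<<(.+)>>\s*$")   # refers to another chunk

MASTER = """\
We first say what the program does as a whole.
<<main>>=
<<read input>>
print(total)
@
Reading input is a detail, so it is explained afterwards.
<<read input>>=
total = sum([1, 2, 3])
@
"""


def parse(text):
    """Split the master text into ('prose', line) and ('code', name, lines)."""
    blocks, name, body = [], None, []
    for line in text.splitlines():
        opening = DEF.match(line)
        if name is None and opening:
            name, body = opening.group(1), []
        elif name is not None and line.strip() == "@":
            blocks.append(("code", name, body))
            name = None
        elif name is not None:
            body.append(line)
        else:
            blocks.append(("prose", line))
    return blocks


def weave(blocks):
    """Reader's view: prose and code in the order the author wrote them."""
    out = []
    for block in blocks:
        if block[0] == "prose":
            out.append(block[1])
        else:
            out.append(f"[{block[1]}]")
            out.extend("    " + line for line in block[2])
    return "\n".join(out)


def tangle(blocks, root):
    """Machine's view: code only, references expanded (no cycle check here)."""
    chunks = {}
    for block in blocks:
        if block[0] == "code":
            chunks.setdefault(block[1], []).extend(block[2])

    def expand(name, indent=""):
        out = []
        for line in chunks[name]:
            ref = REF.match(line)
            if ref:
                out.extend(expand(ref.group(2), indent + ref.group(1)))
            else:
                out.append(indent + line)
        return out

    return "\n".join(expand(root))


blocks = parse(MASTER)
woven = weave(blocks)
tangled = tangle(blocks, "main")
```

Both `woven` and `tangled` are computed from the same `blocks`, so a change to `MASTER` shows up in the documentation view and the source view together.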

Knuth devised the technique while building TeX and METAFONT, and those large systems were themselves written as literate programs. His original tool was named WEB, and he has wryly noted that he “used the word WEB for this purpose long before CERN grabbed it” for the World Wide Web. He found the discipline so valuable that he published the full annotated source of both systems as books, TeX: The Program and METAFONT: The Program (Volumes B and D of Computers and Typesetting), as demonstrations that a program of such size could be presented clearly and completely.

Knuth collected his essays on the subject in the book Literate Programming (Stanford, Center for the Study of Language and Information, 1992), which his Stanford page calls an anthology treating “a program as a piece of literature, addressed to human beings rather than to a computer.” The idea influenced later documentation-driven tools and is often cited as an ancestor of computational notebooks such as Jupyter, though literate programming in Knuth’s strict sense, with reordering and tangling, remains a specialized practice.