Diff and Patch

A “diff” is a compact description of how one version of a file differs from another: which lines were added, which were removed, and where. Instead of sending an entire updated file, you can send just the diff, which is usually tiny by comparison. The GNU diffutils manual puts the everyday use plainly: “You can use the set of differences produced by diff to distribute updates to text files (such as program source code) to other people.”
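
To make the idea concrete, here is a minimal sketch that produces a diff in Python using the standard library's difflib module. The file names and the helper make_unified_diff are made up for illustration; the output is in the widely used "unified" format, where lines starting with "-" were removed and lines starting with "+" were added.

```python
import difflib

# Two versions of a small file, as lists of lines (hypothetical content).
old = ["apple\n", "banana\n", "cherry\n"]
new = ["apple\n", "blueberry\n", "cherry\n", "date\n"]


def make_unified_diff(old_lines, new_lines):
    """Return the unified diff between two line lists as a list of strings."""
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="fruits.txt (old)",  # labels only; no files are read
            tofile="fruits.txt (new)",
        )
    )


diff_lines = make_unified_diff(old, new)
print("".join(diff_lines), end="")
```

Unchanged lines such as "apple" appear in the output only as context, prefixed with a space. On a large file with a small edit, that is why the diff stays small.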

A “patch” is the other half of the pair: a program that takes a diff and applies it to a copy of the original file, reproducing the changed version. The GNU manual gives a clean way to picture it: “If you think of diff as subtracting one file from another to produce their difference, you can think of patch as adding the difference to one file to reproduce the other.” The output of the Unix diff command was designed to be mechanically re-applied from the start: the 7th Edition manual documents a -e option that emits a script for the ed editor, which recreates the second file from the first.
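
The subtract-and-add picture can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the format or algorithm the real diff and patch programs use: the helpers make_diff and apply_diff are hypothetical, and the "diff" here is just a list of changed regions found with difflib.SequenceMatcher.

```python
import difflib


def make_diff(old_lines, new_lines):
    """'Subtract' old from new: record only the regions that changed.

    Each entry is (start, end, replacement): lines old_lines[start:end]
    are to be replaced by the replacement lines.
    """
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return [
        (start, end, new_lines[j1:j2])
        for tag, start, end, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_diff(old_lines, diff):
    """'Add' the diff back onto the old lines to reproduce the new ones."""
    result = []
    position = 0
    for start, end, replacement in diff:
        result.extend(old_lines[position:start])  # copy untouched lines
        result.extend(replacement)                # splice in the change
        position = end                            # skip the replaced lines
    result.extend(old_lines[position:])           # copy the untouched tail
    return result


old = ["apple", "banana", "cherry"]
new = ["apple", "blueberry", "cherry", "date"]

diff = make_diff(old, new)
assert apply_diff(old, diff) == new
```

Unlike this sketch, which relies on exact line positions, the real patch program also uses the context lines in a diff to locate each change, so it can cope with a target file that is not byte-for-byte the original.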

The patch tool that gave this workflow its name was written by Larry Wall in 1985, two years before he released the Perl language. His patch program could take the diffs people posted to Usenet newsgroups and apply them even when the recipient’s copy had drifted slightly, making it practical to ship software fixes as small text snippets to strangers.

This diff-and-patch convention became the lingua franca of open collaboration. For decades the Linux kernel has been developed largely by mailing diffs to public mailing lists, where maintainers review and apply them. The same idea underlies modern code review and the pull request: a change is still, at its core, a diff that someone proposes and someone else applies.