Deepfakes

Deepfakes are synthetic images, video, or audio in which a person’s likeness or voice is generated, or swapped onto another person, using deep-learning methods. The term covers face-swap techniques, generated faces, and voice cloning. The generative lineage runs through generative adversarial networks (GANs), introduced in “Generative Adversarial Nets” (Goodfellow et al., 2014), in which two networks are trained against each other: a generator learns to produce data that a discriminator cannot distinguish from real samples. More recently, that lineage extends to diffusion models.
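
The adversarial setup described above can be made concrete with the value function from Goodfellow et al. (2014), V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator D tries to maximize and the generator G tries to minimize. The sketch below only evaluates that quantity on hand-picked probabilities; it is not a training loop. The function name `gan_value` and the toy numbers are illustrative assumptions, not taken from any dataset or library.

```python
import math


def gan_value(d_real, d_fake):
    """Sample estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator probabilities (0 < p < 1) assigned to real samples.
    d_fake: discriminator probabilities (0 < p < 1) assigned to generated samples.
    Both lists are hypothetical inputs chosen for illustration.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Toy case 1: the discriminator separates real from generated samples well.
confident = gan_value(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])

# Toy case 2: the discriminator cannot tell them apart and outputs 0.5 everywhere.
fooled = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(f"well-separated: {confident:.3f}")
print(f"indistinguishable: {fooled:.3f}")
```

When the discriminator outputs 0.5 for every sample, the value equals -log 4 (about -1.386), which the paper identifies as the value at the point where generated data matches the real distribution. A discriminator that still separates the two classes scores higher, which is the pressure that pushes the generator toward more realistic output.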

The same generative progress that enables creative tools also enables convincing impersonation, which drove a parallel effort in detection. Meta (then Facebook) AI built and released the DeepFake Detection Challenge dataset, described in “The DeepFake Detection Challenge (DFDC) Dataset” (Dolhansky, Bitton, Pflaum, Lu, Howes, Wang, and Canton Ferrer, 2020). The authors describe it as “an extremely large face swap video dataset” with over 100,000 clips sourced from 3,426 paid actors, generated using “several Deepfake, GAN-based, and non-learned methods.” They report that “a Deepfake detection model trained only on the DFDC can generalize to real ‘in-the-wild’ Deepfake videos,” and note that all subjects “agreed to participate.”

Deepfakes sit at the intersection of generative capability and authenticity: the harder synthetic media is to distinguish from real recordings, the more weight falls on detection research, provenance signals, and policy. This entry anchors that arc to the primary generative and detection research; platform and government policies are not reproduced here unless quoted from primary sources.
