On September 11, 2024, Google added a feature called Audio Overviews to NotebookLM, its document-grounded research assistant. The premise was unusually concrete: upload your own sources, and the tool would “turn your documents into engaging audio discussions.” Two AI hosts would start up “a lively ‘deep dive’ discussion based on your sources,” in which they “summarize your material, make connections between topics, and banter back and forth,” and the user could “download the conversation and take it on the go.”
What made Audio Overviews break out was how natural the result sounded. The two AI voices interrupted each other, reacted, and bantered with a conversational texture that felt closer to a real podcast than to a text-to-speech reading. People fed it everything from research papers and legal contracts to their own resumes and LinkedIn profiles, and the novelty of hearing two AIs earnestly discuss your material spread quickly online. It was a rare example of a research tool going viral on the strength of a single feature.
Google was explicit about the limits at launch. It cautioned that the generated discussions are “not a comprehensive or objective view of a topic,” that the hosts “only speak English, sometimes introduce inaccuracies, and you can’t interrupt them yet,” and that generating the audio could take several minutes. The hosts were, in effect, confident narrators of whatever you gave them, which is useful for review and dangerous if mistaken for fact-checking.
For business readers, Audio Overviews pointed at a broader shift: generative AI moving from producing text you read to producing media you consume passively. Turning a stack of documents into a listenable briefing is a genuinely new format for internal training, onboarding, and research digestion, and NotebookLM showed that the format had immediate, almost playful appeal well before the accuracy questions were resolved.