word2vec makes word embeddings practical

In January 2013, Tomas Mikolov, Kai Chen, Greg Corrado, and Jeff Dean of Google published “Efficient Estimation of Word Representations in Vector Space” on arXiv (1301.3781). The paper introduced the two model architectures behind word2vec, Continuous Bag-of-Words (CBOW) and Skip-gram, which learn high-quality vector representations of words from very large text corpora at low computational cost.
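
The two architectures differ mainly in the direction of prediction: Continuous Bag-of-Words (CBOW) predicts a word from its surrounding context, while Skip-gram predicts the surrounding words from the word itself. The sketch below only illustrates how training examples could be drawn from a token sequence under each scheme. The function names and the fixed `window` parameter are assumptions made for illustration and do not come from the paper or the released tool, and no model is trained here.

```python
def context_window(tokens, i, window):
    """Return up to `window` tokens on each side of position i (excluding tokens[i])."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def skipgram_pairs(tokens, window=2):
    """Skip-gram style examples: (center, context) pairs, one per context word."""
    return [
        (tokens[i], ctx)
        for i in range(len(tokens))
        for ctx in context_window(tokens, i, window)
    ]


def cbow_examples(tokens, window=2):
    """CBOW style examples: (context_words, center), one per position."""
    return [
        (context_window(tokens, i, window), tokens[i])
        for i in range(len(tokens))
    ]


if __name__ == "__main__":
    sentence = ["the", "quick", "brown", "fox"]
    print(skipgram_pairs(sentence, window=1))
    print(cbow_examples(sentence, window=1))
```

With a window of 1, the word "quick" yields the Skip-gram pairs ("quick", "the") and ("quick", "brown"), and the single CBOW example (["the", "brown"], "quick").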

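The learned vectors placed semantically and syntactically similar words near one another and famously captured relationships through vector arithmetic: in the paper's example, vector(“King”) - vector(“Man”) + vector(“Woman”) yields a vector closest to that of “Queen”, so operations on word vectors could solve analogies like king is to queen as man is to woman. The paper reported state-of-the-art accuracy on its test set of syntactic and semantic word similarities, and that high-quality vectors could be learned from a 1.6-billion-word data set in less than a day.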
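
The analogy arithmetic can be sketched in a few lines. The example below uses hand-made three-dimensional toy vectors, which are invented for illustration and are not learned embeddings. `solve_analogy` is a hypothetical helper name. It forms the offset vector and returns the nearest remaining word by cosine similarity, which is one common way to run this kind of query.

```python
import math

# Toy vectors with made-up values (NOT real word2vec output).
# The dimensions loosely stand for: royalty, maleness, femaleness.
TOY_VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' via the nearest word to vec(b) - vec(a) + vec(c)."""
    target = [
        vb - va + vc
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    ]
    # Exclude the three query words so the answer is a different word.
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


if __name__ == "__main__":
    # vector("king") - vector("man") + vector("woman")
    print(solve_analogy("man", "king", "woman", TOY_VECTORS))
```

On these toy values the query returns "queen". With real embeddings the same procedure runs over the whole vocabulary, and the result depends on the training corpus and settings.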

word2vec was a turning point for natural language processing: it made dense word embeddings practical to train and led to their widespread use. It laid the groundwork for the representation-learning ideas that power today’s large language models.

Last verified June 6, 2026