Embeddings

An embedding turns something that is hard for computers to compare, such as a word or a product, into a list of numbers (a vector), arranged so that similar items end up near each other in that numeric space. A foundational example is the word2vec method from Mikolov, Chen, Corrado, and Dean’s 2013 paper “Efficient Estimation of Word Representations in Vector Space,” which learns high-quality word vectors from large text corpora and captures both syntactic and semantic relationships between words.
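To make "near each other" concrete, here is a minimal sketch that scores closeness with cosine similarity, one common choice of measure. The three-dimensional vectors are hand-written, hypothetical values chosen for illustration; a real model such as word2vec learns vectors with many more dimensions from data.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, written by hand for illustration only.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(toy_embeddings["cat"], toy_embeddings["dog"])
cat_car = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])

print(f"cat vs dog: {cat_dog:.3f}")
print(f"cat vs car: {cat_car:.3f}")
```

With these made-up values, "cat" scores much closer to "dog" than to "car", which is the kind of ordering a trained embedding is meant to produce.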

A striking property the paper demonstrated is that these vectors capture relationships through arithmetic, so that the vector for “king” minus “man” plus “woman” lands near “queen.” This showed that meaning could be encoded geometrically, with direction and distance carrying information.
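The arithmetic can be sketched in a few lines. The vectors below are hypothetical, hand-picked values whose dimensions loosely stand for "royalty", "maleness", and "food"; learned embeddings do not come with labeled dimensions like this, so treat it as an illustration of the idea rather than of word2vec itself. The query words are skipped when searching for the nearest neighbor, a common convention for this kind of analogy test.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors: (royalty, maleness, food). Illustration only.
toy_vectors = {
    "king":   [0.90, 0.90, 0.00],
    "queen":  [0.85, 0.15, 0.05],
    "man":    [0.10, 0.90, 0.00],
    "woman":  [0.10, 0.10, 0.00],
    "prince": [0.70, 0.90, 0.00],
    "apple":  [0.00, 0.10, 0.90],
}

def analogy(positive_a, negative, positive_b, vectors):
    """Return the word nearest to positive_a - negative + positive_b."""
    target = [
        a - n + b
        for a, n, b in zip(vectors[positive_a], vectors[negative], vectors[positive_b])
    ]
    # Skip the query words so the answer is a different word.
    candidates = [w for w in vectors if w not in (positive_a, negative, positive_b)]
    return max(candidates, key=lambda w: cosine_similarity(vectors[w], target))

best = analogy("king", "man", "woman", toy_vectors)
print(best)
```

Subtracting "man" removes the "maleness" component and adding "woman" leaves the "royalty" component in place, so the result points in nearly the same direction as the toy "queen" vector.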

Embeddings now underpin search, recommendation, and the inner workings of large language models, which represent words and tokens as learned vectors. They let systems measure similarity and meaning numerically rather than matching exact text.

Why business readers should care: Embeddings are what make semantic search, recommendation engines, and retrieval-augmented AI possible. They let software find things that are conceptually related, not just identically worded, which is the basis for many practical AI features in products today.
