
Embeddings

Turning meaning into coordinates a machine can compare.

The analogy

Imagine a giant map where every word or phrase has an address: “dog” and “puppy” live on the same street, “car” is in another neighborhood and “invoice” in another city. An embedding is that address: numbers that place each text on the map of meaning, so that “close” means “similar”.

In detail

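An embedding is a vector, typically hundreds or thousands of numbers, that represents the meaning of a text, image or audio clip. Similar inputs produce nearby vectors, which enables semantic search: results are ranked by the distance between vectors (cosine similarity is a common measure) rather than by exact word matches. Embeddings power the retrieval step in RAG (retrieval-augmented generation), as well as recommenders and duplicate detection.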
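To make "nearby vectors" concrete, here is a minimal sketch that compares three hand-made vectors with cosine similarity. The 3-dimensional vectors and the helper name `cosine_similarity` are assumptions made up for illustration; real embeddings come from a trained model and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors invented for illustration, not produced by any model.
dog = [0.9, 0.8, 0.1]
puppy = [0.85, 0.75, 0.2]
invoice = [0.1, 0.05, 0.95]

print("dog vs puppy:  ", cosine_similarity(dog, puppy))
print("dog vs invoice:", cosine_similarity(dog, invoice))
```

With these toy values, the "dog" and "puppy" vectors score much higher than "dog" and "invoice", which is the numeric version of living on the same street.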

An example

You search “how do I return an order?” and the system finds the document titled “Refund policy” even though they don't share a single word: their embeddings are close on the map.
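The search in this example could be sketched roughly as follows. The document titles other than "Refund policy", all vector values, and the `search` helper are hypothetical; in a real system both the documents and the query would be embedded by the same model, and the vectors would usually live in a vector index rather than a dictionary.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def search(query_vector, index, top_k=2):
    """Return the top_k (title, score) pairs, most similar first."""
    scored = [
        (title, cosine_similarity(query_vector, vector))
        for title, vector in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical pre-computed document embeddings (toy values, not model output).
index = {
    "Refund policy": [0.9, 0.1, 0.2],
    "Shipping times": [0.3, 0.9, 0.1],
    "Careers": [0.05, 0.1, 0.95],
}

# Stands in for the embedding of "how do I return an order?".
query_vector = [0.85, 0.2, 0.15]

results = search(query_vector, index)
for title, score in results:
    print(f"{title}: {score:.3f}")
```

Note that no text comparison happens anywhere in `search`: the ranking depends only on the vectors, which is why a query can match a title without sharing any words with it.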
