Embedding
An embedding is a numerical vector representation of text, images, or other data that lets AI systems compute semantic similarity.
An embedding model turns each piece of text into a vector of hundreds to thousands of floating-point numbers; texts with similar meaning end up close together in that vector space. Typical applications include semantic search, clustering, classification, and retrieval for RAG. Popular embedding models include OpenAI's text-embedding-3, Voyage AI, Cohere, and the open-source BGE-M3.
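"Close together in vector space" is usually measured with cosine similarity. A minimal sketch, using tiny made-up 4-dimensional vectors (real embeddings have hundreds to thousands of dimensions, and would come from an embedding model API):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors:
    # near 1.0 = similar meaning, near 0.0 = unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors for illustration only; real values are produced by a model.
car = [0.8, 0.1, 0.3, 0.5]
vehicle = [0.7, 0.2, 0.4, 0.5]
banana = [0.1, 0.9, 0.0, 0.2]

print(cosine_similarity(car, vehicle))  # high: related concepts
print(cosine_similarity(car, banana))   # much lower: unrelated
```

The same distance function underlies semantic search: the query vector is compared against every stored document vector.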
Example
A knowledge base of 10,000 documents is embedded. The query 'how does open banking work' is turned into a query embedding, and the vector database returns the most semantically similar documents, even ones that never literally mention 'open banking' but instead refer to the 'PSD2 API'.
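The retrieval step above can be sketched as ranking pre-computed document vectors by similarity to the query vector. The vectors and document titles below are made up for illustration; in practice they would come from an embedding model and live in a vector database:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Hypothetical pre-computed document embeddings (toy 3-dimensional vectors).
documents = {
    "PSD2 API guide":        [0.9, 0.1, 0.2],
    "Open banking overview": [0.8, 0.2, 0.3],
    "Office lunch menu":     [0.1, 0.9, 0.1],
}

def search(query_vector, top_k=2):
    # Rank every document by cosine similarity to the query embedding
    # and return the titles of the top_k closest matches.
    ranked = sorted(documents.items(),
                    key=lambda item: cosine(query_vector, item[1]),
                    reverse=True)
    return [title for title, _ in ranked[:top_k]]

# Made-up embedding for the query "how does open banking work".
query = [0.85, 0.15, 0.25]
print(search(query))  # the two banking documents outrank the lunch menu
```

Note that the 'PSD2 API guide' can rank highly even though its title shares no words with the query; the match happens in vector space, not on keywords.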
Frequently asked questions
Embeddings vs keywords?
Embeddings match on meaning ('car' also matches 'vehicle'); keywords match on exact lexical form. Modern search is usually hybrid, combining both for the best results.
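One common way to combine a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF). A sketch, with hypothetical result lists standing in for the output of a keyword engine and a vector search:

```python
def reciprocal_rank_fusion(rankings, k=60):
    # Merge several ranked result lists into one: each document earns
    # 1 / (k + rank) per list it appears in, and scores are summed.
    # k=60 is the constant commonly used with RRF.
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results for the query "car insurance":
keyword_results  = ["doc_car_insurance", "doc_car_repair", "doc_insurance_law"]
semantic_results = ["doc_vehicle_cover", "doc_car_insurance", "doc_insurance_law"]

fused = reciprocal_rank_fusion([keyword_results, semantic_results])
print(fused[0])  # doc_car_insurance: ranked highly in both lists
```

Documents that appear high in both lists float to the top, which is exactly the "best of both" behavior hybrid search aims for.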