Embedding
An embedding is a numerical vector representation of text, images, or other data that lets AI systems compute semantic similarity.
An embedding model turns each piece of text into a vector of hundreds to thousands of floating-point numbers; texts with similar meaning end up close together in that vector space. Typical applications include semantic search, clustering, classification, and retrieval for RAG. Popular embedding models include OpenAI's text-embedding-3, Voyage AI, Cohere, and the open-source BGE-M3.
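"Close together in vector space" is usually measured with cosine similarity. A minimal sketch, using tiny made-up 4-dimensional vectors (real embeddings have hundreds to thousands of dimensions, and would come from an embedding model API):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors:
    # near 1.0 = similar meaning, near 0.0 = unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors for illustration only; real values are produced by a model.
car = [0.8, 0.1, 0.3, 0.5]
vehicle = [0.7, 0.2, 0.4, 0.5]
banana = [0.1, 0.9, 0.0, 0.2]

print(cosine_similarity(car, vehicle))  # high: related concepts
print(cosine_similarity(car, banana))   # much lower: unrelated
```

The same distance function underlies semantic search: the query vector is compared against every stored document vector.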
Example
A knowledge base of 10,000 documents is embedded. The query 'how does open banking work' is turned into a query embedding, and the vector database returns the most semantically similar documents, even ones that never literally mention 'open banking' but instead refer to the 'PSD2 API'.
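The retrieval step above can be sketched as ranking pre-computed document vectors by similarity to the query vector. The vectors and document titles below are made up for illustration; in practice they would come from an embedding model and live in a vector database:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Hypothetical pre-computed document embeddings (toy 3-dimensional vectors).
documents = {
    "PSD2 API guide":        [0.9, 0.1, 0.2],
    "Open banking overview": [0.8, 0.2, 0.3],
    "Office lunch menu":     [0.1, 0.9, 0.1],
}

def search(query_vector, top_k=2):
    # Rank every document by cosine similarity to the query embedding
    # and return the titles of the top_k closest matches.
    ranked = sorted(documents.items(),
                    key=lambda item: cosine(query_vector, item[1]),
                    reverse=True)
    return [title for title, _ in ranked[:top_k]]

# Made-up embedding for the query "how does open banking work".
query = [0.85, 0.15, 0.25]
print(search(query))  # the two banking documents outrank the lunch menu
```

Note that the 'PSD2 API guide' can rank highly even though its title shares no words with the query; the match happens in vector space, not on keywords.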
Frequently asked questions
Embeddings vs keywords?
Embeddings match on meaning ('car' also matches 'vehicle'); keywords match on exact lexical form. Modern search is usually hybrid, combining both for the best results.
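One common way to combine a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF). A sketch, with hypothetical result lists standing in for the output of a keyword engine and a vector search:

```python
def reciprocal_rank_fusion(rankings, k=60):
    # Merge several ranked result lists into one: each document earns
    # 1 / (k + rank) per list it appears in, and scores are summed.
    # k=60 is the constant commonly used with RRF.
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results for the query "car insurance":
keyword_results  = ["doc_car_insurance", "doc_car_repair", "doc_insurance_law"]
semantic_results = ["doc_vehicle_cover", "doc_car_insurance", "doc_insurance_law"]

fused = reciprocal_rank_fusion([keyword_results, semantic_results])
print(fused[0])  # doc_car_insurance: ranked highly in both lists
```

Documents that appear high in both lists float to the top, which is exactly the "best of both" behavior hybrid search aims for.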