Implement Dense Vector Similarity Search in Elasticsearch (2026)

Elasticsearch doesn’t just store text; it can also index embedding vectors that encode meaning and compare them directly, letting you find documents that are semantically similar to a query, not just keyword matches.

Let’s see this in action. Imagine you have product descriptions and want to find items that are conceptually alike, even if they use different words.

PUT /products
{
  "mappings": {
    "properties": {
      "description": {
        "type": "text"
      },
      "description_vector": {
        "type": "dense_vector",
        "dims": 3,
        "index": true,
        "similarity": "cosine"
      }
    }
  }
}

POST /products/_doc/1
{
  "description": "A comfortable cotton t-shirt for everyday wear.",
  "description_vector": [0.1, 0.5, 0.2]
}

POST /products/_doc/2
{
  "description": "Soft linen trousers, perfect for summer.",
  "description_vector": [0.3, 0.4, 0.1]
}

POST /products/_doc/3
{
  "description": "A breathable crewneck shirt made from organic cotton.",
  "description_vector": [0.15, 0.55, 0.25]
}

Now, let’s search for items similar to "a soft, light shirt". We’ll represent this query as a vector. The exact vector values depend on your embedding model, but for demonstration, let’s use [0.12, 0.52, 0.23].

GET /products/_search
{
  "knn": {
    "field": "description_vector",
    "query_vector": [0.12, 0.52, 0.23],
    "k": 2,
    "num_candidates": 5
  },
  "_source": ["description"]
}

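The knn (k-nearest neighbors) option is the core of the request. field names the dense_vector field to search. query_vector is your search term’s vector representation, and it must have the same number of dimensions as the indexed vectors. k is how many top results you want. num_candidates is a trade-off for performance: it’s how many candidate neighbors Elasticsearch gathers on each shard before keeping the top k, so it must be at least k. A higher num_candidates generally means better accuracy but slower search.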
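The same request body can be assembled programmatically. The following is a minimal Python sketch, not an official client API: build_knn_body is a hypothetical helper introduced here for illustration, and the only rules it enforces are the documented constraints that num_candidates must be at least k and cannot exceed 10,000.

```python
import json

MAX_NUM_CANDIDATES = 10_000  # documented upper bound for num_candidates


def build_knn_body(field, query_vector, k, num_candidates, source_fields=None):
    """Hypothetical helper: return a kNN search body after basic sanity checks."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not k <= num_candidates <= MAX_NUM_CANDIDATES:
        raise ValueError("num_candidates must be between k and 10000")
    body = {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        }
    }
    if source_fields is not None:
        # Limit the returned _source to the listed fields.
        body["_source"] = list(source_fields)
    return body


if __name__ == "__main__":
    body = build_knn_body(
        "description_vector",
        [0.12, 0.52, 0.23],
        k=2,
        num_candidates=5,
        source_fields=["description"],
    )
    # Prints JSON equivalent to the GET /products/_search body above.
    print(json.dumps(body, indent=2))
```

Validating on the client side is a design choice: it surfaces a bad k or num_candidates before the request reaches the cluster.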

The problem this solves is moving beyond keyword matching to semantic understanding. Traditional full-text search matches the terms in the query against the terms in the documents. If you search for "running shoes," you won’t get "sneakers for jogging" unless the documents share those words or you have configured synonyms. Dense vector search relies on embeddings produced by pre-trained machine learning models (in a real deployment, such a model would generate the description_vector values), which capture the meaning or intent behind words. Documents with vectors close to the query vector are considered semantically similar.

Internally, Elasticsearch uses a specialized indexing algorithm, HNSW (Hierarchical Navigable Small World), for indexed dense_vector fields. HNSW builds a layered graph of your vectors, allowing for efficient approximate nearest neighbor search. Instead of comparing your query vector to every single vector in the index (which would be O(N)), it navigates this graph to find close neighbors much faster, at the cost of occasionally missing a true nearest neighbor. The similarity setting (cosine, dot_product, l2_norm, or max_inner_product) defines how "closeness" is calculated between vectors. Cosine similarity measures the angle between vectors, ignoring their magnitude, which is often ideal for semantic similarity where direction matters more than scale.
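To make the similarity arithmetic concrete, here is a minimal Python sketch that scores the three sample documents against the query vector by exhaustive comparison, which is the O(N) baseline that HNSW approximates. The helper names cosine and exact_knn are illustrative and not part of any Elasticsearch API. The last printed column applies (1 + cosine) / 2, the formula Elasticsearch documents for the kNN _score under cosine similarity.

```python
import math

# Sample vectors from the documents indexed above, keyed by document ID.
DOCS = {
    "1": [0.1, 0.5, 0.2],
    "2": [0.3, 0.4, 0.1],
    "3": [0.15, 0.55, 0.25],
}


def cosine(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_knn(query, docs, k):
    """Brute-force kNN: score every document, then keep the k best."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    for doc_id, cos in exact_knn([0.12, 0.52, 0.23], DOCS, k=2):
        # Columns: document ID, raw cosine, cosine rescaled to (1 + cosine) / 2.
        print(doc_id, round(cos, 4), round((1 + cos) / 2, 4))
```

Run as a script, this lists document 3 first and document 1 second (both cosines are above 0.999), while document 2, the linen trousers, is left out with a cosine of roughly 0.90.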

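When you define a dense_vector field, dims is crucial. It must match the dimensionality of the vectors generated by your embedding model. If your model outputs 768-dimensional vectors, dims must be 768. Elasticsearch rejects any document or query whose vector length differs from dims, so a mismatch shows up as an indexing or search error rather than as silently wrong results. index: true is the default in recent versions (8.11 and later) and tells Elasticsearch to build the HNSW structure that makes kNN search fast; with index: false the vectors are stored but cannot be used in the knn option.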
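Because a length mismatch only surfaces when Elasticsearch rejects the request, it can help to validate vectors before sending them. The sketch below is one possible client-side check, assuming the 3-dimensional mapping from this example; check_vector and EXPECTED_DIMS are hypothetical names, not part of Elasticsearch.

```python
import math

EXPECTED_DIMS = 3  # assumption: must equal the "dims" value in the index mapping


def check_vector(vector, dims=EXPECTED_DIMS):
    """Return the vector unchanged, or raise ValueError if it is unusable."""
    if len(vector) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(vector)}")
    if not all(math.isfinite(x) for x in vector):
        raise ValueError("vector contains NaN or infinite values")
    if all(x == 0 for x in vector):
        # Cosine similarity is undefined for a zero-magnitude vector.
        raise ValueError("vector has zero magnitude")
    return vector


if __name__ == "__main__":
    print(check_vector([0.1, 0.5, 0.2]))  # passes and is returned unchanged
    try:
        check_vector([0.1, 0.5])  # too short for a 3-dimensional field
    except ValueError as err:
        print("rejected:", err)
```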

The num_candidates parameter in the knn query directly influences the trade-off between search speed and accuracy. It sets how many candidate neighbors each shard collects while traversing the HNSW graph. With a value of 50, every shard gathers up to 50 candidates, and Elasticsearch then merges the per-shard results and returns the best k overall. Increasing this value (e.g., to 100 or 200) can improve recall (the share of true nearest neighbors that are actually returned) at the cost of increased latency. Conversely, reducing it speeds up queries but might miss some highly relevant documents. The value must be at least k and cannot exceed 10,000. It’s a tunable parameter that depends heavily on your specific dataset size, query load, and acceptable latency.
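One way to tune num_candidates is to measure recall directly: run the same query approximately and exhaustively, then compare the returned IDs. The sketch below shows only the comparison step. recall_at_k is a hypothetical helper, and the ID lists are made-up placeholders rather than real search output.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k IDs that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)


if __name__ == "__main__":
    # Placeholder result lists for a single query; substitute real search output.
    exact = ["3", "1", "7", "9"]   # from an exhaustive (brute-force) comparison
    approx = ["3", "1", "9", "4"]  # from a knn search at some num_candidates
    print(recall_at_k(approx, exact))  # 3 of the 4 exact neighbors found: 0.75
```

Averaging this value over a representative set of queries for several num_candidates settings gives a recall-versus-latency curve from which to pick a value.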

The next hurdle is often dealing with very large datasets where even HNSW might become too slow or memory-intensive.