GSI Technology

Scalable Semantic Vector Search with Elasticsearch

Author: Pat Lasserre

Elasticsearch is a popular open-source full-text search engine that can search many types of documents, and it recently added a dense_vector field type that stores dense vectors of float values. These dense_vectors can be used for document scoring, which helps determine the relevance of a document.
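To make the field type concrete, here is a minimal sketch of an index mapping that pairs a text field with a dense_vector field, written as a Python dictionary. The index name, field names, and the four-dimension size are assumptions chosen for illustration. Real sentence embeddings typically have hundreds of dimensions.

```python
import json

# Hypothetical names and sizes, chosen for illustration only.
INDEX_NAME = "articles"
EMBEDDING_DIMS = 4


def build_mapping(dims):
    """Return an index mapping with a text field and a dense_vector field."""
    return {
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                # dense_vector stores a fixed-length array of floats per document.
                "title_vector": {"type": "dense_vector", "dims": dims},
            }
        }
    }


def build_document(title, vector, dims):
    """Build a document body, checking the vector length against the mapping."""
    if len(vector) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(vector)}")
    return {"title": title, "title_vector": [float(x) for x in vector]}


if __name__ == "__main__":
    print(json.dumps(build_mapping(EMBEDDING_DIMS), indent=2))
    print(json.dumps(build_document("Resetting a password", [0.9, 0.1, 0.0, 0.2],
                                    EMBEDDING_DIMS)))
```

The mapping body would be sent when creating the index, and each document body when indexing. How you send them (REST call or client library) depends on your setup.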

There has been a lot of excitement in the natural language processing (NLP) community surrounding the Elasticsearch dense_vector field because it helps open the door for a move away from traditional keyword search to semantic search. Semantic search allows for a better understanding of the searcher’s intent and thus can help provide better-quality search results.

Along with this excitement, however, the community has also expressed some disappointment about the lack of a scalable approximate nearest neighbor (ANN) solution in Elasticsearch that leverages the dense_vector field.

In this post we will introduce an Elasticsearch plugin that uses the dense_vector field to provide high-performance, billion-scale ANN semantic vector search.

Understanding User Intent

Traditional search engines used keywords to find literal matches in documents. One problem with this approach is that it can miss relevant documents that share few keywords with the query.

One way that today’s search engines are trying to address this problem is by using dense vector embeddings that allow for search with meaning. These vector embeddings are a representation of text (words, sentences, paragraphs) that encode semantic understanding into them. This new approach is often referred to as semantic search, and it tries to better understand the searcher’s intent and the contextual meaning of the query. Instead of simply matching the keywords, it takes into consideration what the words mean and not just the words themselves.
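As a toy illustration of the idea, the sketch below compares sentences by the cosine similarity of their embeddings. The three-dimensional vectors are made up by hand for this example and do not come from any real embedding model. The point is only that two sentences with no keywords in common can still have nearby vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hand-made vectors (an assumption for illustration, not model output).
EMBEDDINGS = {
    "how do I reset my password": [0.9, 0.1, 0.0],
    "I forgot my login credentials": [0.8, 0.3, 0.1],
    "best pizza in town": [0.0, 0.2, 0.9],
}

if __name__ == "__main__":
    query = "how do I reset my password"
    for text, vector in EMBEDDINGS.items():
        score = cosine_similarity(EMBEDDINGS[query], vector)
        print(f"{score:.3f}  {text}")
```

The first two sentences share no keywords, yet their vectors point in nearly the same direction, so a keyword match would miss the pairing that a vector comparison finds.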

Elasticsearch Dense_Vector Field Limitations

Elastic wrote a good blog post in which they explained how their dense_vector field type can be used to support semantic search in applications such as question answering, article search, and image search (using an image’s caption to search on).

In that post, Elastic also discussed their script_score query, which uses the dense_vector field to “rank questions based on their similarity to the user’s query.”

Elastic then went on to introduce a key limitation of the script_score query: “The script_score query is designed to wrap a restrictive query, and modify the scores of the documents it returns. However, we’ve provided a match_all query, which means the script will be run over all documents in the index. This is a current limitation of vector similarity in Elasticsearch — vectors can be used for scoring documents, but not in the initial retrieval step. Support for retrieval based on vector similarity is an important area of ongoing work.”

The script_score query can’t be used for the initial retrieval step (also commonly referred to as candidate generation) because it doesn’t support approximate nearest neighbor (ANN) search. With a “match_all” query, it performs an exhaustive search, computing cosine similarity between the query vector and every document in the index. The cost of that scan grows linearly with the number of documents, which is why a restrictive query normally has to run first to limit the number of vectors that are scored. Elastic recognizes that not supporting the initial retrieval step is a key limitation, so they mention that there is “ongoing work” to investigate ANN methods.
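For reference, here is a sketch of what such a script_score request body looks like, built as a Python dictionary. The field name `title_vector` and the helper function are assumptions for illustration. The script line follows the form used in Elastic’s blog post. Later Elasticsearch releases changed how the vector field is referenced inside the script, so check the documentation for your version.

```python
import json


def build_script_score_query(query_vector, inner_query=None, size=10):
    """Build a script_score request body that re-scores the documents matched
    by inner_query using cosine similarity against a dense_vector field.

    With the default match_all inner query, the script runs on every document
    in the index, which is the exhaustive case described above.
    """
    if inner_query is None:
        inner_query = {"match_all": {}}
    return {
        "size": size,
        "query": {
            "script_score": {
                "query": inner_query,
                "script": {
                    # Adding 1.0 keeps scores non-negative, since cosine
                    # similarity ranges from -1 to 1.
                    # 'title_vector' is a hypothetical field name.
                    "source": "cosineSimilarity(params.query_vector, "
                              "doc['title_vector']) + 1.0",
                    "params": {"query_vector": list(query_vector)},
                },
            }
        },
    }


if __name__ == "__main__":
    # Exhaustive: every document is scored.
    print(json.dumps(build_script_score_query([0.9, 0.1, 0.0, 0.2]), indent=2))
    # Restricted: only documents matching the keyword query are scored.
    restricted = build_script_score_query(
        [0.9, 0.1, 0.0, 0.2], inner_query={"match": {"title": "password"}}
    )
    print(json.dumps(restricted, indent=2))
```

The second call shows the intended usage pattern: a keyword query narrows the candidate set and the vector script only re-ranks what survives, which means documents without matching keywords are never scored.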

We have previously written about how ANN is the key to scaling similarity search and how it is used in the retrieval/candidate generation step of the similarity search pipeline to quickly funnel down from billions of candidates to the most relevant few thousand or so.
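The funnel can be sketched in a few lines of Python. The example below uses random-hyperplane hashing for candidate generation, followed by an exact cosine re-rank of the candidates. This technique was picked only because it fits in a short example. It is not a description of how the GSI plugin or the APU works, and the class name and parameters are hypothetical.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class HyperplaneIndex:
    """Toy ANN index: one hash table keyed by random-hyperplane signatures."""

    def __init__(self, dims, n_planes=8, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dims)]
                       for _ in range(n_planes)]
        self.buckets = defaultdict(list)
        self.vectors = {}

    def _signature(self, v):
        # One bit per hyperplane: which side of the plane the vector falls on.
        return tuple(sum(p * x for p, x in zip(plane, v)) >= 0
                     for plane in self.planes)

    def add(self, doc_id, v):
        self.vectors[doc_id] = v
        self.buckets[self._signature(v)].append(doc_id)

    def candidates(self, q):
        """Stage 1: cheap candidate generation from a single bucket."""
        return list(self.buckets.get(self._signature(q), []))

    def search(self, q, k):
        """Stage 2: exact cosine re-rank of the candidates only."""
        scored = sorted(((cosine(q, self.vectors[d]), d)
                         for d in self.candidates(q)), reverse=True)
        return [(d, s) for s, d in scored[:k]]


if __name__ == "__main__":
    rng = random.Random(1)
    index = HyperplaneIndex(dims=16, n_planes=6)
    data = {i: [rng.gauss(0, 1) for _ in range(16)] for i in range(2000)}
    for doc_id, vec in data.items():
        index.add(doc_id, vec)
    query = data[42]
    print("scored", len(index.candidates(query)), "of", len(data), "vectors")
    print(index.search(query, k=5))
```

Only the vectors that share the query’s bucket are scored exactly, which is where the speedup comes from. The trade-off is that a true neighbor hashed into a different bucket is missed, which is why the search is approximate. Production systems reduce that risk with multiple tables or other index structures.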

The Call for Elasticsearch ANN Support

Support for ANN in Elasticsearch is something that Elasticsearch users are actively requesting on Elastic’s GitHub page. One user commented that “a scalable solution to nearest vector search within Elasticsearch would be very useful. Much of the rest of our search stack is Elasticsearch, so moving ANN into Elasticsearch is more attractive than monolithic ANN systems (e.g., Faiss, Annoy, etc.).”

Another user commented that “with more searches either completely or partially shifting to semantic-based approaches, it’s really important to have at least some implementations of ANN search. Since Elasticsearch is already integrated with our systems it would be great if this could be done, instead of us having to move to something like FAISS or Annoy.”

Since many companies already use Elasticsearch, having support for ANN in Elasticsearch would be very useful to them, as the two comments above show. It would be much easier for them to simply use an ANN extension in Elasticsearch instead of having to try to integrate popular search libraries such as Faiss or Annoy into their similarity search pipeline.

A Scalable Elasticsearch ANN Plugin

In addition to reading comments such as the two presented above, we have also received direct input that people are looking for a high-performance Elasticsearch solution that scales to billion-scale vector similarity search. That is why we developed an Elasticsearch ANN plugin. Plugins, as Elastic states, “are a way to enhance the core Elasticsearch functionality in a custom manner.”

The GSI Elasticsearch ANN plugin uses the standard Elasticsearch dense_vector field to perform an ANN vector similarity search on GSI Technology’s Gemini® Associative Processing Unit (APU). Since the plugin performs an ANN search and not an exhaustive “match_all” search, it can be used for the initial retrieval/candidate generation step and allows for billion-scale vector similarity search.

Installing the plugin is easy, and it allows a vector similarity search to be run as simply as any standard Elasticsearch query. It returns similarity search results in the standard Elasticsearch format, and since the plugin uses the core Elasticsearch dense_vector field type and index mapping, there is no need to reindex documents.

The plugin also allows for multimodal search, where vectors (images) and text are combined to form a more tailored search. A good example of how one could combine a text filter with a vector search was given by Walmart Labs in a paper.

In that paper, Walmart Labs showed how a user can add text filters to refine the results of a vector similarity search. In their example, as seen in Figure 1 below, they first did a vector similarity search to find similar images to a query image and then filtered those results using color and price filters.
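The search-then-filter order described above can be sketched as follows. The product catalog, two-dimensional vectors, and function names are invented for illustration and are not taken from the Walmart Labs paper or from the plugin.

```python
import math
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    color: str
    price: float
    vector: list  # image embedding (hand-made here, not model output)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def similar_then_filter(query_vec, products, k, color=None, max_price=None):
    """Take the k most similar products, then apply the text/price filters."""
    ranked = sorted(products, key=lambda p: cosine(query_vec, p.vector),
                    reverse=True)[:k]
    return [p for p in ranked
            if (color is None or p.color == color)
            and (max_price is None or p.price <= max_price)]


# Hypothetical catalog for illustration.
CATALOG = [
    Product("red dress A", "red", 40.0, [1.0, 0.0]),
    Product("red dress B", "red", 120.0, [0.9, 0.1]),
    Product("blue dress", "blue", 35.0, [0.95, 0.05]),
    Product("red sneakers", "red", 30.0, [0.0, 1.0]),
]

if __name__ == "__main__":
    query_image_vector = [1.0, 0.0]  # stands in for the query image's embedding
    for p in similar_then_filter(query_image_vector, CATALOG, k=3,
                                 color="red", max_price=50.0):
        print(p.name, p.price)
```

Note that the red sneakers satisfy both filters but are never returned, because they do not make it through the similarity step. Filtering after the vector search keeps the visual match as the primary criterion.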

Figure 1. (Source: Walmart Labs) Combining vector image search with text filters for a more refined search.

Conclusion

The addition of the dense_vector field type to Elasticsearch helps open the world of embeddings and semantic vector search to Elasticsearch.

Elastic’s current semantic vector search offering, however, is limited because with their implementation, “vectors can be used for scoring documents, but not in the initial retrieval step.” This is because their solution does an exhaustive search, which is computationally expensive and thus doesn’t scale well enough to handle the initial retrieval step in a vector similarity search pipeline.

To address this issue, GSI Technology has created a simple-to-use Elasticsearch ANN plugin that supports approximate nearest neighbor (ANN) vector similarity search using the Elasticsearch dense_vector field type. This enables Elasticsearch to support the initial retrieval step and paves the way for billion-scale semantic vector similarity search using Elasticsearch.

We presented the plugin at a recent Elastic Meetup. Here is a video of the presentation.

For more information about the plugin, contact us at [email protected].
