txtai Similarity really slow with Elasticsearch
I've noticed that when running Elasticsearch with txtai.pipeline's Similarity, the re-ranking search (ranksearch) is very slow. Searching for a single result can take up to 10 seconds.
The code I’m using is:
from txtai.pipeline import Similarity
from elasticsearch import Elasticsearch

# Connect to ES instance
es = Elasticsearch(hosts=["http://localhost:9200"], timeout=60, retry_on_timeout=True)

def ranksearch(query, limit):
    # Pull 10x the requested number of hits from Elasticsearch,
    # re-rank them with the similarity model and keep the top "limit"
    results = [text for _, text in search(query, limit * 10)]
    return [(score, results[x]) for x, score in similarity(query, results)][:limit]

def search(query, limit):
    # Standard query string search against the articles index
    query = {
        "size": limit,
        "query": {
            "query_string": {"query": query}
        }
    }

    results = []
    for result in es.search(index="articles", body=query)["hits"]["hits"]:
        source = result["_source"]
        # Normalize the BM25 score into [0, 1], capping at 18
        results.append((min(result["_score"], 18) / 18, source["title"]))

    return results

similarity = Similarity("valhalla/distilbart-mnli-12-3")

limit = 1
query = "Bad News"
print(ranksearch(query, limit))
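To see where the time goes, here is a small timing sketch using the search and similarity objects defined above; it separates the Elasticsearch round trip from the model scoring (per the discussion below, on CPU the zero-shot similarity call is the part that takes seconds):

import time

# Time the Elasticsearch retrieval on its own
start = time.perf_counter()
candidates = [text for _, text in search(query, limit * 10)]
print(f"Elasticsearch search: {time.perf_counter() - start:.2f}s")

# Time the similarity re-ranking on its own
start = time.perf_counter()
scores = similarity(query, candidates)
print(f"Similarity scoring: {time.perf_counter() - start:.2f}s")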
Issue Analytics
- Created a year ago
- Comments: 7 (4 by maintainers)
Top Results From Across the Web
- txtai ElasticSearch Similarity slow (Stack Overflow): In using txtai, I've noticed that it is abysmally slow. Requesting for one result and my response time is almost 10 seconds vs...
- Add semantic search to Elasticsearch (neuml/txtai, GitHub): txtai has a similarity function that works on lists of text. This method can be integrated with any external search service, such as...
- Similarity (txtai docs): Computes the similarity between query and list of text. Returns a list of (id, score) sorted by highest score, where id is the...
- Slow cosine similarity script (Elastic Discuss): Hi, in a query, I am executing a cosine similarity script. It takes multiple seconds, but the top command shows CPU and memory...
- Introducing txtai, AI-powered semantic search built ... (Medium): txtai builds sentence embeddings to perform similarity searches. txtai takes each text record entry, tokenizes it and builds an embeddings representation of...
No problem, glad I could help.
For reference, I have a laptop that is 5+ years old, with a quad-core CPU and an 8 GB GPU with 1,920 CUDA cores. Modest specs compared to the most recent hardware. I've run benchmarks on this hardware below just to give you an idea of what you should expect.
GPU prices have really come down lately. An RTX 3060 is ~$500 and there are RTX 3090s out there for around $1,100. A year ago those were 2.5-3x more expensive.
An RTX 3060 has 3,584 CUDA cores and 12 GB of memory; an RTX 3090 has 10,496 CUDA cores and 24 GB. The elapsed time per call would be much lower on either of those. Server-class NVIDIA GPUs are typically Quadro, V100 or A100 cards.
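Before upgrading anything, it's worth checking whether the current setup is even using a GPU. A quick sanity check with PyTorch (which txtai runs on):

import torch

# False here means the similarity model is running on CPU, which
# explains multi-second scoring times for a model of this size
print(torch.cuda.is_available())

if torch.cuda.is_available():
    # e.g. "NVIDIA GeForce RTX 3060"
    print(torch.cuda.get_device_name(0))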
That does make quite the difference. Setting the limit to ten yields results around 4.3 seconds.
I will look into setting up something with respect to GPU processing. I greatly appreciate your help.
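For anyone who lands here later, a minimal sketch of what that GPU setup might look like. txtai's Hugging Face-based pipelines accept a gpu argument (they default to using a GPU when one is available); treat the exact keyword as an assumption and confirm against the txtai docs for your version:

from txtai.pipeline import Similarity

# Assumption: the gpu flag follows txtai's HFPipeline signature; when True
# and a CUDA device is present, the model is loaded onto the GPU
similarity = Similarity("valhalla/distilbart-mnli-12-3", gpu=True)

With the model on a GPU, the per-call latency of the same ranksearch call should drop substantially.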