
txtai Similarity really slow with ElasticSearch

See original GitHub issue

I’ve noticed that when running Elasticsearch with the txtai.pipeline Similarity pipeline, the re-ranked search (ranksearch) is very slow. Searching for even a single item can take up to 10 seconds.

The code I’m using is:

from txtai.pipeline import Similarity
from elasticsearch import Elasticsearch, helpers

# Connect to ES instance
es = Elasticsearch(hosts=["http://localhost:9200"], timeout=60, retry_on_timeout=True)

def ranksearch(query, limit):
  # Pull 10x the requested limit from Elasticsearch, then re-rank with the model
  results = [text for _, text in search(query, limit * 10)]
  return [(score, results[x]) for x, score in similarity(query, results)][:limit]

def search(query, limit):
  # Run a keyword query against the articles index
  query = {
      "size": limit,
      "query": {
          "query_string": {"query": query}
      }
  }

  results = []
  for result in es.search(index="articles", body=query)["hits"]["hits"]:
    source = result["_source"]
    # Clamp the BM25 score at 18 and normalize to [0, 1]
    results.append((min(result["_score"], 18) / 18, source["title"]))
  return results

similarity = Similarity("valhalla/distilbart-mnli-12-3")

limit = 1
query = "Bad News"
print(ranksearch(query, limit))
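The control flow above can be exercised without Elasticsearch or a model. The sketch below uses hypothetical stand-ins (`fake_search`, `fake_similarity` — not part of txtai) to show the shape of `ranksearch`: it fetches `limit * 10` candidates and re-scores every one of them per query, which is where the model time goes.

```python
# Stand-ins for es.search and the txtai Similarity pipeline, used only to
# illustrate the control flow of ranksearch without external services.
def fake_search(query, limit):
    corpus = ["Bad News Bears", "Good News", "Weather Report", "Stock Market News"]
    # Return (score, text) tuples, like the search() function above
    return [(1.0, text) for text in corpus[:limit]]

def fake_similarity(query, texts):
    # txtai's Similarity returns (index, score) sorted by descending score;
    # here we score by naive word overlap instead of a model
    scores = []
    for i, text in enumerate(texts):
        overlap = len(set(query.lower().split()) & set(text.lower().split()))
        scores.append((i, overlap / max(len(query.split()), 1)))
    return sorted(scores, key=lambda x: x[1], reverse=True)

def ranksearch(query, limit):
    # Fetch 10x the limit, re-score all candidates, keep the top `limit`
    results = [text for _, text in fake_search(query, limit * 10)]
    return [(score, results[x]) for x, score in fake_similarity(query, results)][:limit]

print(ranksearch("Bad News", 1))
```

Note that with `limit = 1` the model still scores 10 candidates; the per-query cost scales with `limit * 10`, not `limit`.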

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
davidmezzetti commented, Aug 5, 2022

No problem, glad I could help.

For reference, one of my laptops is 5+ years old, with a quad-core CPU and an 8 GB GPU with 1,920 CUDA cores. Those are modest specs compared to the most recent hardware. I’ve run the benchmarks below on it to give you an idea of what to expect.

import timeit
from txtai.embeddings import Embeddings
from txtai.pipeline import Similarity

# d is a list of 100 text elements (same count as code above when limit=10)

# GPU
similarity = Similarity("valhalla/distilbart-mnli-12-3")
timeit.timeit(lambda: similarity("query", d), number=25) / 25
# 1.55s per call

embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2"})
timeit.timeit(lambda: embeddings.similarity("query", d), number=25) / 25
# 0.45s per call

embeddings = Embeddings({"path": "sentence-transformers/paraphrase-MiniLM-L3-v2"})
timeit.timeit(lambda: embeddings.similarity("query", d), number=25) / 25
# 0.30s per call

# CPU only
similarity = Similarity("valhalla/distilbart-mnli-12-3")
timeit.timeit(lambda: similarity("query", d), number=25) / 25
# 19.16s per call

embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2"})
timeit.timeit(lambda: embeddings.similarity("query", d), number=25) / 25
# 1.18s per call

embeddings = Embeddings({"path": "sentence-transformers/paraphrase-MiniLM-L3-v2"})
timeit.timeit(lambda: embeddings.similarity("query", d), number=25) / 25
# 0.66s per call
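The gap between the Similarity pipeline and Embeddings.similarity in the numbers above comes down to architecture: a cross-encoder runs one full model forward pass per (query, text) pair on every call, while a bi-encoder lets document vectors be computed once and compared cheaply per query. A toy sketch of that cost asymmetry (counting "encodes" in plain Python rather than running a real model, so the counts are illustrative, not benchmarks):

```python
# Toy illustration of cross-encoder vs bi-encoder cost, counting model
# "encodes" instead of running a real transformer.
encodes = 0

def encode(text):
    # Stand-in for one transformer forward pass
    global encodes
    encodes += 1
    return hash(text)

docs = [f"doc {i}" for i in range(100)]

# Cross-encoder style: every query re-scores all pairs -> N encodes per query
encodes = 0
for query in ["q1", "q2", "q3"]:
    for d in docs:
        encode(query + " " + d)
cross_cost = encodes  # 3 queries * 100 docs = 300 encodes

# Bi-encoder style: encode docs once, then one encode per query
encodes = 0
doc_vectors = [encode(d) for d in docs]  # 100 encodes, done once up front
for query in ["q1", "q2", "q3"]:
    encode(query)  # compare the query vector against the cached doc vectors
bi_cost = encodes  # 100 + 3 = 103 encodes

print(cross_cost, bi_cost)
```

This is why the smaller MiniLM bi-encoders above are 10-30x faster per call than the distilbart cross-encoder, at some cost in ranking quality.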

GPU prices have really come down lately. An RTX 3060 is ~$500 and there are RTX 3090s out there for around $1,100. A year ago those were 2.5-3x more expensive.

An RTX 3060 has 3,584 CUDA cores with 12 GB of memory, and an RTX 3090 has 10,496 with 24 GB. The elapsed time per call would be much lower on either of those. Server-class NVIDIA GPUs are typically Quadro, V100 or A100 cards.
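Before investing in hardware, it is worth confirming whether the current machine already has a usable GPU. A hypothetical pre-flight check (not from the issue): txtai runs its models through PyTorch, so checking CUDA visibility tells you whether the pipeline will land on the GPU or fall back to the much slower CPU path.

```python
import importlib.util

# Check whether PyTorch is installed and, if so, whether it can see a CUDA GPU.
spec = importlib.util.find_spec("torch")
if spec is not None:
    import torch
    print("CUDA available:", torch.cuda.is_available())
else:
    print("PyTorch is not installed")
```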

1 reaction
Xyphius commented, Aug 5, 2022

That does make quite the difference. Setting the limit to ten yields results in around 4.3 seconds.

I will look into setting up something with respect to GPU processing. I greatly appreciate your help.


Top Results From Across the Web

python 3.x - txtai ElasticSearch Similarity slow - Stack Overflow
In using txtai, I've noticed that it is abysmally slow. Requesting for one result and my response time is almost 10 seconds vs...

Add semantic search to Elasticsearch - neuml/txtai - GitHub
txtai has a similarity function that works on lists of text. This method can be integrated with any external search service, such as...

Similarity - txtai
Computes the similarity between query and list of text. Returns a list of (id, score) sorted by highest score, where id is the...

Slow cosine similarity script - Elasticsearch - Elastic Discuss
Hi, in a query, I am executing a cosine similarity script. It takes multiple seconds, but the top command shows CPU and memory...

Introducing txtai, AI-powered semantic search built ... - Medium
txtai builds sentence embeddings to perform similarity searches. txtai takes each text record entry, tokenizes it and builds an embeddings representation of ...
