Rerun benchmark with Elasticsearch 7.5 or above

In ES 7.5, we made some improvements to the performance of Elasticsearch dense_vector operations (https://github.com/elastic/elasticsearch/pull/46294). Although I still expect the QPS to be significantly worse than Vespa’s, it would be helpful to rerun the benchmarks against ES 7.5 to get an up-to-date comparison.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 14 (9 by maintainers)

Top GitHub Comments

1 reaction
jobergum commented, Mar 23, 2020

@jtibshirani the vector is not returned with the result; if it were, I would have spotted it.

Sample response from ES

{"took":604,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":10000,"relation":"gte"},"max_score":0.005666477,"hits":[{"_index":"doc","_type":"_doc","_id":"669835","_score":0.005666477},{"_index":"doc","_type":"_doc","_id":"408764","_score":0.0056393184},{"_index":"doc","_type":"_doc","_id":"408462","_score":0.0054252045},{"_index":"doc","_type":"_doc","_id":"408855","_score":0.0053858217},{"_index":"doc","_type":"_doc","_id":"551661","_score":0.0053397696},{"_index":"doc","_type":"_doc","_id":"861882","_score":0.005264404},{"_index":"doc","_type":"_doc","_id":"406273","_score":0.0052393572},{"_index":"doc","_type":"_doc","_id":"406324","_score":0.0052266084},{"_index":"doc","_type":"_doc","_id":"551743","_score":0.005219447},{"_index":"doc","_type":"_doc","_id":"861530","_score":0.0052178036}]}}
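One way to check a response like this programmatically is to parse it and confirm that no hit carries a `_source` payload. The following is a minimal sketch: the `summarize_hits` helper is a hypothetical name, not part of any Elasticsearch client, and the input is an abbreviated copy of the sample above (first two hits only).

```python
import json

# Abbreviated copy of the sample response above (first two hits only).
raw = (
    '{"took":604,"timed_out":false,'
    '"hits":{"total":{"value":10000,"relation":"gte"},"max_score":0.005666477,'
    '"hits":[{"_index":"doc","_type":"_doc","_id":"669835","_score":0.005666477},'
    '{"_index":"doc","_type":"_doc","_id":"408764","_score":0.0056393184}]}}'
)


def summarize_hits(response_text):
    """Return (id, score) pairs and whether any hit includes a _source payload."""
    response = json.loads(response_text)
    hits = response["hits"]["hits"]
    pairs = [(hit["_id"], hit["_score"]) for hit in hits]
    has_source = any("_source" in hit for hit in hits)
    return pairs, has_source


pairs, has_source = summarize_hits(raw)
print(pairs)
print("source returned:", has_source)
```

If `has_source` is false for every response, the stored vectors are not being serialized into the results.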

On CPU architectures: yes, the difference is explained by our use of AVX-512 instructions. See

I will soon update with results using our HNSW implementation for approximate nearest neighbor search. Some sample data with the GIST data set:

[image]

1 reaction
jtibshirani commented, Mar 23, 2020

@jobergum I’m sorry for the late reply. I’m not sure why your benchmarking results aren’t lining up with @mayya-sharipova’s. The only other difference that comes to mind is that we always omit the full document source from the results by setting _source: false in the search request body: https://www.elastic.co/guide/en/elasticsearch/reference/7.6/search-request-body.html#request-body-search-source-filtering. Otherwise ES will load and return the whole stored vector for each of the top 10 results, whereas we are only interested in the document IDs.
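As a rough illustration of the `_source: false` setting, a request body could be assembled like this. This is only a sketch: `build_search_body` is a hypothetical helper, and `match_all` stands in for whatever vector-scoring query the benchmark actually sends.

```python
import json


def build_search_body(query, size=10):
    """Build a search request body that asks ES not to return document sources."""
    return {
        "size": size,      # only the top `size` hits are needed
        "_source": False,  # skip loading and returning the stored document
        "query": query,
    }


# Placeholder query; the benchmark's real query is not shown in this thread.
body = build_search_body({"match_all": {}})
print(json.dumps(body))
```

The printed JSON is what would be sent as the body of the search request; each hit in the response should then contain only metadata such as `_id` and `_score`.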

@jtibshirani I’ve updated the master branch using 7.6.

Thanks! The ‘Ivy Bridge’ numbers make sense to me, based on the previous results and the performance improvements in ES. However, the Haswell numbers are more surprising: do you know why Vespa shows a latency improvement of ~2x between the Ivy Bridge and Haswell processors?
