Reindex performance degrading logarithmically
Hi! We're re-indexing a 7GB index, and noticed that performance starts out fast then logarithmically degrades over time.
We're using elasticsearch v1.7.3 and elasticsearch-py v1.9.0.
We're following all the recommendations for increasing indexing performance, e.g.:
- index.refresh_interval: -1
- index.store.throttle.type: none
- index.translog.flush_threshold_size: 1g
- index.number_of_replicas: 0
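For reference, a minimal sketch of how settings like these might be applied with elasticsearch-py; the host and index names are illustrative, and the client call itself requires a live cluster, so it is shown in a comment:

```python
# Sketch: the temporary bulk-indexing settings from this issue, expressed
# as the body of an update-settings request. Elasticsearch accepts the
# flattened dot-notation keys used here.
BULK_SETTINGS = {
    "index": {
        "refresh_interval": "-1",
        "store.throttle.type": "none",
        "translog.flush_threshold_size": "1g",
        "number_of_replicas": 0,
    }
}

# With a live cluster (hypothetical host and index name):
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch(["localhost:9200"])
#   es.indices.put_settings(index="my-index", body=BULK_SETTINGS)
```

Remember to restore refresh_interval and number_of_replicas once the bulk load finishes.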
Our cluster is at AWS and is comprised of the following:
- 5× m4.xlarge data nodes
- 3× m3.medium master nodes
- 1× m4.large client node
This cluster should be plenty beefy for indexing a paltry 7GB of data. The original indexing only took a couple of hours to complete, but this re-indexing has been going for nearly 24 hours and is only 70% done. And it only seems to be getting slower as time goes on. At this rate, the re-index will never finish.
We've tried various chunk sizes in the reindex() call, but it doesn't seem to affect performance, so we're using the default of 500.
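A sketch of the reindex call being described, assuming elasticsearch-py 1.x's helpers.reindex; the index names are hypothetical, and the actual call (which needs a live cluster) is shown in a comment. Note that chunk_size controls the bulk-write batch size, not the scroll read size:

```python
# Hypothetical reindex invocation. Building the kwargs separately makes
# it easy to see which knobs the issue is discussing.
def reindex_kwargs(chunk_size=500):
    """Keyword arguments for elasticsearch.helpers.reindex (sketch)."""
    return {
        "source_index": "old-index",  # hypothetical name
        "target_index": "new-index",  # hypothetical name
        "chunk_size": chunk_size,     # bulk batch size; 500 is the default
    }

# With a live cluster:
#   from elasticsearch import Elasticsearch, helpers
#   es = Elasticsearch(["localhost:9200"])
#   helpers.reindex(es, **reindex_kwargs())
```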
The Python script's CPU usage is relatively low, while ES is blasting away at the CPU.
Any ideas on what would cause this behavior? And how to get past it? I’m suspecting that there’s an issue with scan/scroll. It’s almost like the client needs to seek through all the previous chunks to get to the next chunk, so everything is getting slower the further it gets. But that’s just a wild guess.
Fixing this is essential for completing our upgrade to ES 2.3, especially since we have indices that are 10x the size of this 7GB index that we will need to be reindexing as well. Thanks!
Created 7 years ago · 15 comments (2 by maintainers)
Top GitHub Comments
This was solved in https://github.com/elastic/elasticsearch/issues/18253. Basically I needed to explicitly set a much higher size value in the scan_kwargs argument to reindex(). The default of 10 is way too low for reindex operations. I will open another ticket suggesting a higher default value for reindex(), or at least updating the documentation to explain why.
Hi @jsnod, the best way to contact the Elasticsearch core team with a technical issue like this one (a reproducible problem) is to open an issue at the GitHub repository: https://github.com/elastic/elasticsearch