Elasticsearch IndexerBolt not being acked correctly causing failures in spout
Together with @jcruzmartini, we found that the Elasticsearch IndexerBolt acks tuples before emitting them in its afterBulk method, which causes ack failures in the spout once the timeout set in the topology expires.
The proposed solution is to change the order of emit / ack in com.digitalpebble.stormcrawler.elasticsearch.bolt.IndexerBolt:
if (!failed) {
    acked++;
    _collector.emit(StatusStreamName, t, new Values(u,
            metadata, Status.FETCHED));
    _collector.ack(t);
} else {
    ...
    ...
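To illustrate why the order of emit / ack can matter, here is a minimal toy model of XOR-style ack tracking, in the spirit of what the Storm documentation describes for guaranteed message processing. It is only a sketch: ToyAcker, ToyTuple, ToyCollector and the fixed ids are hypothetical names made up for this example, it does not use the Storm API, and it does not claim to reproduce the exact failure we saw in the crawler.

```java
import java.util.HashMap;
import java.util.Map;

public class AckOrderSketch {

    /** Toy acker: one XOR checksum per root tuple; 0 means "fully processed". */
    static final class ToyAcker {
        private final Map<Long, Long> checksums = new HashMap<>();

        void report(long rootId, long value) {
            checksums.merge(rootId, value, (a, b) -> a ^ b);
        }

        long checksum(long rootId) {
            return checksums.getOrDefault(rootId, 0L);
        }
    }

    /** Toy input tuple: remembers the ids of tuples anchored to it. */
    static final class ToyTuple {
        final long rootId;
        final long id;
        long anchoredChildren = 0L;

        ToyTuple(long rootId, long id) {
            this.rootId = rootId;
            this.id = id;
        }
    }

    /** Toy collector (hypothetical, not Storm's OutputCollector). */
    static final class ToyCollector {
        private final ToyAcker acker;

        ToyCollector(ToyAcker acker) {
            this.acker = acker;
        }

        // Anchored emit: the child id is only recorded locally on the anchor.
        void emit(ToyTuple anchor, long childId) {
            anchor.anchoredChildren ^= childId;
        }

        // Ack: report the tuple's own id together with the children recorded so far.
        void ack(ToyTuple t) {
            acker.report(t.rootId, t.id ^ t.anchoredChildren);
        }
    }

    /** Runs one tuple through the toy bolt and returns the final checksum. */
    public static long finalChecksum(boolean emitBeforeAck) {
        final long rootId = 1L;
        final long tupleId = 0x1111L;
        final long childId = 0x2222L;

        ToyAcker acker = new ToyAcker();
        ToyCollector collector = new ToyCollector(acker);

        // Upstream side: the input tuple is registered in the tree.
        acker.report(rootId, tupleId);
        ToyTuple t = new ToyTuple(rootId, tupleId);

        if (emitBeforeAck) {
            collector.emit(t, childId);
            collector.ack(t);
        } else {
            collector.ack(t);
            collector.emit(t, childId); // too late: this child is never reported by the ack
        }

        // Downstream side: the next bolt acks the child tuple.
        acker.report(rootId, childId);
        return acker.checksum(rootId);
    }

    public static void main(String[] args) {
        System.out.println("emit then ack: " + finalChecksum(true));
        System.out.println("ack then emit: " + finalChecksum(false));
    }
}
```

In this toy model, emitting before acking lets the checksum settle back to 0, while acking first leaves the child id behind as a residue that never cancels out.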
After migrating from 1.13 to 1.16 we noticed bad performance in our crawler, as well as a lot of failures in the spout. After adding a copy of the IndexerBolt class with that modification to our project, it started working correctly with great performance.
@jnioche we can create a pull request with a simple change in that class if you want
Thanks! Matias
Issue Analytics
- State:
- Created 3 years ago
- Reactions:1
- Comments:5 (5 by maintainers)
Top GitHub Comments
changed the title a bit as it is not about the anchoring as such
@jcruzmartini it wasn’t an easy one to spot, but you and @matiascrespof have great detective skills 😉
Thanks again for reporting it and submitting a PR. I’ll go through all the acks to see if this happens anywhere else
Fixed by #801