Workers stop processing jobs after Redis reconnect


In production we’re using Amazon ElastiCache with BullMQ ^1.34.2.

We’re finding that in the event of a failover, the workers emit the error "UNBLOCKED force unblock from blocking operation, instance state changed (master -> replica?)" and stop processing jobs. Jobs can still be queued, however.

Currently we have to redeploy our app to rectify this issue. Is there anything we can do to handle this error so that when Redis reconnects it can start processing jobs again? Thanks.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
manast commented, Jul 20, 2021

I do not see any issue with the naked eye; it should work.

1 reaction
manast commented, Jul 20, 2021

Yeah, I think I know why this happens. There is a loop inside BullMQ that throws an exception in this case and stops looping. We have a fix in older Bull that I can port to BullMQ, which should resolve the issue.
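
To illustrate the failure mode described in this comment, here is a minimal sketch of a fetch loop that dies on the first exception next to one that survives it. This is not BullMQ's code and not the actual fix: makeFlakySource, runNaiveLoop and runResilientLoop are hypothetical names, the job source is a synchronous in-memory stand-in for a blocking Redis call, and the error is simulated rather than produced by a real failover.

```typescript
// Hypothetical stand-in for a blocking Redis fetch; this is NOT BullMQ's API.
type FetchJob = () => string | null;

// Simulated job source that throws once, the way a worker's blocking call
// might fail during a failover. Returns null when no jobs are left.
function makeFlakySource(jobs: string[], failAtCall: number): FetchJob {
  let calls = 0;
  let index = 0;
  return () => {
    calls += 1;
    if (calls === failAtCall) {
      throw new Error("UNBLOCKED force unblock from blocking operation");
    }
    return index < jobs.length ? jobs[index++] : null;
  };
}

// Naive loop: the first exception escapes the loop, and nothing restarts it.
function runNaiveLoop(fetchJob: FetchJob): string[] {
  const processed: string[] = [];
  try {
    for (let job = fetchJob(); job !== null; job = fetchJob()) {
      processed.push(job);
    }
  } catch {
    // The loop has already exited; remaining jobs are never picked up.
  }
  return processed;
}

// Resilient loop: a failed iteration is counted and the loop keeps going.
// A real worker would log the error and back off before retrying.
function runResilientLoop(fetchJob: FetchJob, maxErrors = 5): string[] {
  const processed: string[] = [];
  let errors = 0;
  while (errors < maxErrors) {
    try {
      const job = fetchJob();
      if (job === null) break; // queue drained
      processed.push(job);
    } catch {
      errors += 1;
    }
  }
  return processed;
}

const jobs = ["a", "b", "c"];
const naiveResult = runNaiveLoop(makeFlakySource(jobs, 2));
const resilientResult = runResilientLoop(makeFlakySource(jobs, 2));
console.log("naive:", naiveResult, "resilient:", resilientResult);
```

With the simulated error on the second fetch, the naive loop processes only the first job while the resilient loop processes all three. That matches the reported symptom: jobs can still be queued, but nothing consumes them until the process is restarted.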

