Workers stop processing jobs after Redis reconnect
In production we're using Amazon ElastiCache with BullMQ ^1.34.2.
We're finding that in the event of a failover, the workers emit the error "UNBLOCKED force unblock from blocking operation, instance state changed (master -> replica?)" and stop processing jobs, although jobs can still be queued.
Currently we have to redeploy our app to rectify this. Is there anything we can do to handle this error so that the workers resume processing jobs when Redis reconnects? Thanks.
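
Until a proper fix lands, one possible workaround is to listen for the worker's 'error' event and recreate the worker when this specific error appears. This is a minimal sketch only, assuming BullMQ's Worker API from that version line; the queue name and connection details are hypothetical placeholders:

```typescript
import { Worker, Job } from 'bullmq';

// Hypothetical connection details; substitute your ElastiCache endpoint.
const connection = { host: 'my-elasticache-endpoint', port: 6379 };

function startWorker(): Worker {
  const worker = new Worker(
    'my-queue', // hypothetical queue name
    async (job: Job) => {
      // ... process the job
    },
    { connection },
  );

  // If the blocking Redis call is force-unblocked during a failover,
  // close the stalled worker and start a fresh one instead of redeploying.
  worker.on('error', async (err: Error) => {
    if (err.message.includes('UNBLOCKED')) {
      await worker.close();
      startWorker();
    }
  });

  return worker;
}

startWorker();
```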

I do not see any issue with the naked eye; it should work.
Yeah, I think I know why this happens. There is a loop inside BullMQ that throws an exception in this case and stops looping. We have a fix in the older Bull library that I can port to BullMQ, which should resolve the issue.
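
For reference, a minimal sketch of the kind of change described, using hypothetical names rather than the actual BullMQ internals: instead of letting one failed blocking call terminate the loop, the error is caught and the loop keeps iterating.

```typescript
// Hypothetical fetchNextJob/isClosing helpers; not the real BullMQ source.
async function runLoop(
  fetchNextJob: () => Promise<void>,
  isClosing: () => boolean,
): Promise<void> {
  while (!isClosing()) {
    try {
      // Blocking Redis call; throws UNBLOCKED during a failover.
      await fetchNextJob();
    } catch (err) {
      // Previously an error escaping here ended the loop and left the
      // worker idle. Instead, wait briefly and retry on the next iteration.
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
}
```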