RuntimeError: pubsub connection not set.

See original GitHub issue

Version: redis-py=3.1.0 and redis=3.2.10

Platform: Python 2.7.5 / CentOS Linux release 7.4.1708 (Core)

Description:

Infrastructure

  • two machines (worker1, worker2) running Celery worker services with the default concurrency (=8).
  • one dedicated machine (redis1) running the Redis server; a minimal configuration sketch follows below.
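For concreteness, here is a hedged sketch of what the Celery application behind this topology might look like. Only the redis1 hostname and the concurrency come from the issue; the app name, port, and database numbers are assumptions for illustration.

from celery import Celery

# Hypothetical app module; hostname redis1 is from the issue, everything
# else (name, port, db numbers) is assumed.
app = Celery(
    "cip_middleware",
    broker="redis://redis1:6379/0",
    backend="redis://redis1:6379/1",
)

# Each worker machine (worker1, worker2) would then run something like:
#   celery -A cip_middleware worker --concurrency=8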

Issue

After the workers have been running for some time, a worker on machine1 suddenly dies due to a RuntimeError raised after losing its pub/sub connection.

machine1.worker.log

[2019-02-01 13:43:39,477: CRITICAL/MainProcess] Unrecoverable error: RuntimeError(u'pubsub connection not set: did you forget to call subscribe() or psubscribe()?',)
Traceback (most recent call last):
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/worker.py", line 205, in start
    self.blueprint.start(self)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/bootsteps.py", line 369, in start
    return self.obj.start()
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/consumer/consumer.py", line 322, in start
    blueprint.start(self)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/consumer/consumer.py", line 598, in start
    c.loop(*c.loop_args())
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/loops.py", line 91, in asynloop
    next(loop)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/asynchronous/hub.py", line 354, in create_loop
    cb(*cbargs)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/redis.py", line 1047, in on_readable
    self.cycle.on_readable(fileno)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/redis.py", line 344, in on_readable
    chan.handlers[type]()
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/redis.py", line 674, in _receive
    ret.append(self._receive_one(c))
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/redis.py", line 685, in _receive_one
    response = c.parse_response()
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/redis/client.py", line 3032, in parse_response
    'pubsub connection not set: '
RuntimeError: pubsub connection not set: did you forget to call subscribe() or psubscribe()?
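The final frame is a guard in redis-py's PubSub.parse_response: it raises this RuntimeError whenever the PubSub object's connection attribute is None, i.e. no connection has been allocated (nothing was subscribed) or the connection was reset. A minimal, standalone reproduction of the same error, assuming a Redis server on localhost (not the setup from the issue):

import redis

r = redis.Redis(host="localhost", port=6379)
p = r.pubsub()

# get_message() goes through PubSub.parse_response(); with no prior
# subscribe()/psubscribe() the pubsub connection is unset and redis-py
# raises the same RuntimeError seen in the worker log above.
try:
    p.get_message()
except RuntimeError as exc:
    print(exc)

# After subscribing, a connection is allocated and the call succeeds.
p.subscribe("some-channel")
print(p.get_message(timeout=1))

In the Celery traceback above, kombu's _receive_one() hits the same guard, presumably because the broker connection was lost or reset while the event loop was still polling the pub/sub channel.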

At the same time, I noticed that the worker running on machine2 was struggling because it could not connect to Redis. Eventually it managed to recover, reconnect to Redis, and start receiving the queued tasks.

machine2.worker.log

[2019-02-01 14:43:41,722: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/consumer/consumer.py", line 322, in start
    blueprint.start(self)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/bootsteps.py", line 119, in start
    step.start(parent)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/consumer/mingle.py", line 40, in start
    self.sync(c)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/consumer/mingle.py", line 44, in sync
    replies = self.send_hello(c)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/worker/consumer/mingle.py", line 57, in send_hello
    replies = inspect.hello(c.hostname, our_revoked._data) or {}
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/app/control.py", line 143, in hello
    return self._request('hello', from_node=from_node, revoked=revoked)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/app/control.py", line 95, in _request
    timeout=self.timeout, reply=True,
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/celery/app/control.py", line 454, in broadcast
    limit, callback, channel=channel,
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/pidbox.py", line 315, in _broadcast
    serializer=serializer)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/pidbox.py", line 290, in _publish
    serializer=serializer,
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/messaging.py", line 181, in publish
    exchange_name, declare,
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/messaging.py", line 203, in _publish
    mandatory=mandatory, immediate=immediate,
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/virtual/base.py", line 605, in basic_publish
    message, exchange, routing_key, **kwargs
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/virtual/exchange.py", line 151, in deliver
    exchange, message, routing_key, **kwargs)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/kombu/transport/redis.py", line 781, in _put_fanout
    dumps(message),
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/redis/client.py", line 2716, in publish
    return self.execute_command('PUBLISH', channel, message)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/redis/client.py", line 775, in execute_command
    return self.parse_response(connection, command_name, **options)
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/redis/client.py", line 789, in parse_response
    response = connection.read_response()
  File "/opt/c1/cip-middleware/webapp/virtualenv/lib/python2.7/site-packages/redis/connection.py", line 636, in read_response
    raise e
ConnectionError: Error while reading from socket: (u'Connection closed by server.',)
[2019-02-01 14:44:50,237: INFO/MainProcess] Received task: c1_cip_middleware.tasks.validate_purchases.run_purchases_validation[a411dd90-ab50-4101-becb-90adda3663a2]

Questions

Under what circumstances/scenarios is this RuntimeError raised, such that the worker ends up in an unrecoverable state and must be stopped?

What could be the root cause of this issue, especially given that one worker managed to recover while the other one simply died?

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 2
  • Comments: 9 (3 by maintainers)

Top GitHub Comments

3 reactions
stitch commented, Oct 7, 2020

Just experienced this in redis-py 3.5.3, causing a worker to crash.

The suggested health_check_interval option does not exist as a Celery setting according to the documentation: https://docs.celeryproject.org/en/stable/userguide/configuration.html
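For what it's worth, health_check_interval is a redis-py connection option (available since redis-py 3.3) rather than a documented Celery setting. With plain redis-py it looks roughly like this (the host and interval are arbitrary for illustration):

import redis

# health_check_interval asks redis-py to PING the server whenever the
# connection has been idle for longer than the given number of seconds,
# so dead connections are detected before a real command fails.
r = redis.Redis(host="redis1", port=6379, health_check_interval=30)

Whether kombu/Celery forwards such an option from broker_transport_options depends on the versions in use, so check the matching kombu release notes.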

Looking at the changes from that day, there are some possible suspects: we have been experiencing this since switching to gevent (because of serialization issues with Django, https://github.com/celery/celery/issues/5924)

and since changing the Celery config to be more resilient to outages (coming from no resilience at all):

CELERY_BROKER_CONNECTION_RETRY = True
CELERY_BROKER_CONNECTION_MAX_RETRIES = 400
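The same two settings in the lower-case Celery 4+ naming, applied directly on the app object (a sketch; the app definition is assumed, as in the earlier example):

from celery import Celery

app = Celery("cip_middleware", broker="redis://redis1:6379/0")

# Retry the broker connection when it drops, up to 400 attempts,
# instead of failing immediately.
app.conf.broker_connection_retry = True
app.conf.broker_connection_max_retries = 400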

0 reactions
github-actions[bot] commented, Aug 31, 2022

This issue is marked stale. It will be closed in 30 days if it is not updated.


Top Results From Across the Web

  • RuntimeError: pubsub connection not set. · Issue #1132 - GitHub
    After the workers running for some time, suddenly a worker running on machine1 dies due to a RuntimeError raised losing a connection to...
  • Python Redis - RuntimeError pubsub connection not set
  • Source code for redis.client - redis-py's documentation
    It is not safe to pass PubSub or Pipeline objects between threads. ... if conn is None: raise RuntimeError("pubsub connection not set:...
  • redis/asyncio/client.py - Fossies
    ... the response from a publish/subscribe command: conn = self.connection; if conn is None: raise RuntimeError("pubsub connection not...
  • Troubleshooting | Cloud Pub/Sub Documentation
    Learn about troubleshooting steps that you might find helpful if you run into problems using Pub/Sub.
