Stuck on an issue?

Lightrun Answers was designed to reduce the constant Googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Connection Issues with Redis

See original GitHub issue

I have a Celery instance running as a worker process on Heroku with Heroku Redis. About every minute (the amount of time after which Heroku Redis kills idle connections), this shows up in my logs:

Dec 12 14:34:40 app/worker.1:  [2015-12-12 16:34:40,387: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection... 
Dec 12 14:34:40 app/worker.1:  Traceback (most recent call last): 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/celery/worker/consumer.py", line 278, in start 
Dec 12 14:34:40 app/worker.1:      blueprint.start(self) 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/celery/bootsteps.py", line 123, in start 
Dec 12 14:34:40 app/worker.1:      step.start(parent) 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/celery/worker/consumer.py", line 821, in start 
Dec 12 14:34:40 app/worker.1:      c.loop(*c.loop_args()) 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/celery/worker/loops.py", line 76, in asynloop 
Dec 12 14:34:40 app/worker.1:      next(loop) 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/kombu/async/hub.py", line 285, in create_loop 
Dec 12 14:34:40 app/worker.1:      poll_timeout = fire_timers(propagate=propagate) if scheduled else 1 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/kombu/async/hub.py", line 144, in fire_timers 
Dec 12 14:34:40 app/worker.1:      entry() 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/kombu/async/timer.py", line 64, in __call__ 
Dec 12 14:34:40 app/worker.1:      return self.fun(*self.args, **self.kwargs) 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/kombu/async/timer.py", line 132, in _reschedules 
Dec 12 14:34:40 app/worker.1:      return fun(*args, **kwargs) 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/kombu/transport/redis.py", line 322, in maybe_restore_messages 
Dec 12 14:34:40 app/worker.1:      num=channel.unacked_restore_limit, 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/kombu/transport/redis.py", line 188, in restore_visible 
Dec 12 14:34:40 app/worker.1:      self.unacked_mutex_expire): 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/contextlib.py", line 17, in __enter__ 
Dec 12 14:34:40 app/worker.1:      return self.gen.next() 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/kombu/transport/redis.py", line 112, in Mutex 
Dec 12 14:34:40 app/worker.1:      i_won = client.setnx(name, lock_id) 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/redis/client.py", line 1097, in setnx 
Dec 12 14:34:40 app/worker.1:      return self.execute_command('SETNX', name, value) 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/redis/client.py", line 575, in execute_command 
Dec 12 14:34:40 app/worker.1:      connection.disconnect() 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/kombu/transport/redis.py", line 819, in disconnect 
Dec 12 14:34:40 app/worker.1:      channel._on_connection_disconnect(self) 
Dec 12 14:34:40 app/worker.1:    File "/app/.heroku/python/lib/python2.7/site-packages/kombu/transport/redis.py", line 476, in _on_connection_disconnect 
Dec 12 14:34:40 app/worker.1:      raise get_redis_ConnectionError() 
Dec 12 14:34:40 app/worker.1:  ConnectionError 
Dec 12 14:34:40 app/worker.1:  [2015-12-12 16:34:40,452: INFO/MainProcess] Connected to redis://h:**@***************:********// 
Dec 12 14:34:40 app/worker.1:  [2015-12-12 16:34:40,480: INFO/MainProcess] mingle: searching for neighbors 
Dec 12 14:34:41 app/worker.1:  [2015-12-12 16:34:41,535: INFO/MainProcess] mingle: all alone

This is a recent phenomenon: it started last night and isn’t fixed by a process restart. Could this possibly be a result of an update or the like?

Running Python 2.7.11 with amqp==1.4.8 anyjson==0.3.3 billiard==3.3.0.22 celery==3.1.19 kombu==3.0.30 pytz==2015.7 redis==2.10.5 Django==1.8.7 django-redis-cache==1.6.5
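
A mitigation that is often suggested for idle-connection timeouts like this one (it is not part of the original report, so treat it as a sketch) is to pass socket keepalive and timeout options through to redis-py via Celery’s broker transport options. Whether each option is actually forwarded depends on the kombu and redis-py versions installed, so verify against your own setup:

# settings.py - minimal sketch, assuming Celery 3.1-style upper-case settings
# and that your kombu version forwards the socket_* options to redis-py.
import os

BROKER_URL = os.environ.get("REDIS_URL", "redis://localhost:6379/0")

BROKER_TRANSPORT_OPTIONS = {
    "socket_timeout": 30,          # fail fast on a dead socket instead of hanging
    "socket_connect_timeout": 15,  # bound the time spent (re)connecting
    "socket_keepalive": True,      # ask the OS to keep idle connections alive
}

Even with keepalives, a provider that hard-kills idle connections will still force the occasional reconnect; the aim is only to make the drops less frequent and the recovery less noisy.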

Issue Analytics

  • State: closed
  • Created: 8 years ago
  • Comments: 64 (12 by maintainers)

Top GitHub Comments

2 reactions
skoczen commented on May 11, 2016

Just tossing this here in case it helps someone who gets here from Google, as I did. Lots of errors that look like the bugs above have been cropping up right now (for roughly the past three days) if you’re using RedisToGo on Heroku. Some of the workarounds help (.29 works, for instance, but has the UUID problems), but they don’t solve it.

Actual fix: RedisToGo is the problem. Switching to another Redis provider (I went to Heroku Redis) will fix it.

So despite the hair-pulling: if you’re on RedisToGo, it’s most likely not actually a problem with kombu.
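
For anyone following that advice, the switch itself is usually just configuration: point the broker URL at the new provider. A minimal sketch (the REDIS_URL variable and the "myapp" name are assumptions about your setup, not something from this issue):

import os
from celery import Celery

# Read the provider's connection string from one env var; switching providers
# then means changing a Heroku config var rather than touching code.
broker_url = os.environ.get("REDIS_URL", "redis://localhost:6379/0")
app = Celery("myapp", broker=broker_url)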

2 reactions
RafaAguilar commented on Apr 27, 2016

Same issue here with:

kombu==3.0.35 celery==3.1.23 redis==2.10.5

 . . .
  File "path/venv/lib/python3.4/site-packages/kombu/transport/redis.py", line 498, in _on_connection_disconnect
    raise get_redis_ConnectionError()
redis.exceptions.ConnectionError
Read more comments on GitHub >
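
Since both reports end at the same raise in kombu’s redis transport, a quick way to tell whether the fault lies with the connection itself or with kombu’s handling of it is to hit the broker with redis-py directly from the same dyno. A minimal sketch (the REDIS_URL name is an assumption about your environment):

import os
import redis

# If this ping also raises redis.exceptions.ConnectionError, the problem is
# between the dyno and the Redis provider rather than inside kombu/celery.
client = redis.StrictRedis.from_url(os.environ.get("REDIS_URL", "redis://localhost:6379/0"))
print(client.ping())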

Top Results From Across the Web

Troubleshooting Redis
We have a long history of users experiencing crashes with Redis that actually turned out to be servers with broken RAM. Please test...
Read more >
Troubleshooting Redis Connection Failures - Huawei Cloud
The connection fails when you use redis-cli to connect to a Redis Cluster instance. Solution: Check whether -c is added to the connection...
Read more >
Troubleshoot connecting to an ElastiCache for Redis cluster
Verify that the cluster is ready · Verify that the cluster is healthy · Verify network-level connectivity between the cluster and the client ......
Read more >
Could not connect to redis connection refused - Fix it easily
The most common reason for the connection refused error is that the Redis-Server is not started. Redis server should be started to use...
Read more >
Troubleshoot connectivity in Azure Cache for Redis
Learn how to resolve connectivity problems when creating clients with Azure Cache for Redis.
Read more >
