RedisTimeoutException after running for a long time
Expected behavior
Redisson must reconnect to the server if the connection is lost.
Actual behavior
It doesn't even seem to try to reconnect.
Steps to reproduce or test case
We are using AWS to host the Redisson clients and OVH to host the Redis server (with a ping of 10 to 30 ms between the two).
Let the server run for a few hours (3 hours seems to be enough), then make a request to the server.
You’ll get this kind of error:
org.redisson.client.RedisTimeoutException: Unable to send command! Node source: NodeSource [slot=null, addr=null, redisClient=null, redirect=null, entry=MasterSlaveEntry [masterEntry=[freeSubscribeConnectionsAmount=0, freeSubscribeConnectionsCounter=5, freeConnectionsAmount=0, freeConnectionsCounter=4, freezed=false, freezeReason=null, client=[addr=redis://xxxx.xxxx.xxxx.xxxx:6380], nodeType=MASTER, firstFail=0]]], connection: RedisConnection@2119571203 [redisClient=[addr=redis://xxxx.xxxx.xxxx.xxxx:6380], channel=[id: 0xf5e8192b, L:/xxxx.xxxx.xxxx.xxxx:36546 - R:xxxx.xxxx.xxxx.xxxxx/xxxx.xxxx.xxxx.xxxx:6380]], command: (EVAL), command params: [while true do local firstThreadId2 = redis.call('lindex', KEYS[2], 0);if firstThreadId2 == false the..., 3, XXXX:6360:lock, redisson_lock_queue:{XXXX:6360:lock}, redisson_lock_timeout:{XXXX:6360:lock}, 30000, 42785f4b-af2a-4002-aef2-940e5657ac26:57, 1548175178331, 1548175173331] after 3 retry attempts
at org.redisson.command.CommandAsyncService$10.run(CommandAsyncService.java:721)
at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:668)
at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:743)
at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:471)
at java.lang.Thread.run(Thread.java:748)
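For reference, the redisson_lock_queue / redisson_lock_timeout keys in that EVAL are the ones Redisson uses for fair locks, so the failing call on our side is essentially of this shape (a sketch only; the lock name is copied from the trace, the wait/lease values are illustrative):

import java.util.concurrent.TimeUnit;

import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;

public class FairLockCall {

    // Sketch: 'redisson' is a client created from the configuration shown further down.
    static void doWork(RedissonClient redisson) throws InterruptedException {
        RLock lock = redisson.getFairLock("XXXX:6360:lock"); // key name as it appears in the stack trace
        if (lock.tryLock(5, 30, TimeUnit.SECONDS)) {         // wait up to 5 s, hold for at most 30 s
            try {
                // ... critical section ...
            } finally {
                lock.unlock();
            }
        }
    }
}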
Redis version
Waiting for my Sysadmin to come back to tell me our Redis version.
Redisson version
3.10.0
Redisson configuration
{
  "clusterServersConfig": {
    "idleConnectionTimeout": 10000,
    "pingTimeout": 1000,
    "connectTimeout": 10000,
    "timeout": 3000,
    "retryAttempts": 3,
    "retryInterval": 1500,
    "failedSlaveReconnectionInterval": 3000,
    "failedSlaveCheckInterval": 60000,
    "password": "REDIS_PROD",
    "subscriptionsPerConnection": 5,
    "clientName": null,
    "loadBalancer": {
      "class": "org.redisson.connection.balancer.RoundRobinLoadBalancer"
    },
    "subscriptionConnectionMinimumIdleSize": 1,
    "subscriptionConnectionPoolSize": 5,
    "slaveConnectionMinimumIdleSize": 1,
    "slaveConnectionPoolSize": 5,
    "masterConnectionMinimumIdleSize": 1,
    "masterConnectionPoolSize": 5,
    "readMode": "SLAVE",
    "subscriptionMode": "SLAVE",
    "nodeAddresses": [
      "redis://XXXXXX:6379"
    ],
    "scanInterval": 1000
  },
  "threads": 0,
  "nettyThreads": 0,
  "codec": {
    "class": "org.redisson.codec.JsonJacksonCodec"
  },
  "transportMode": "NIO"
}
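For completeness, the client is created from this JSON roughly as follows (the file name below is a placeholder, not our actual path):

import java.io.File;

import org.redisson.Redisson;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;

public class ClientFactory {

    // Reads the cluster configuration shown above and builds the client.
    public static RedissonClient create() throws Exception {
        Config config = Config.fromJSON(new File("redisson.json"));
        return Redisson.create(config);
    }
}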
Please correct me if I am wrong, but it sounds like you have a lot of locks/fair locks running concurrently. In Redisson, each active lock consumes a lock resource (calculated as subscription connection pool size × subscriptions per connection). Based on the stack trace, it looks like Redisson is unable to obtain such a resource within a certain time (retry interval × retry attempts). Can you roughly gauge how many concurrent locks you will be holding and adjust the configuration accordingly?
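If that turns out to be the cause, one way to add headroom is to raise the subscription capacity. A minimal sketch with illustrative values (not a tuned recommendation; node address copied from the configuration above):

import org.redisson.config.Config;

public class SubscriptionSizing {

    public static Config build() {
        Config config = new Config();
        // Illustrative sizing: 10 subscription connections x 50 subscriptions each
        // gives roughly 500 lock "slots" instead of the 5 x 5 = 25 in the posted config.
        config.useClusterServers()
              .addNodeAddress("redis://XXXXXX:6379")
              .setSubscriptionConnectionPoolSize(10)
              .setSubscriptionsPerConnection(50);
        return config;
    }
}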
I know this is old, but I had a similar issue. In our case, our managed Redis server on Azure has a 10min timeout for idle connections. That seems to line up with OP’s experience of disconnects at 10min. The redis client should try to reconnect so I’m not sure why it’s not doing that in their version.
We were using version 3.10.4, which apparently had pingConnectionInterval defaulted to 0. Turning this on (e.g. 30 seconds) solved our issue. In later versions of the Redis client, this setting defaults to 30 seconds.
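For anyone on an older client, it can be enabled explicitly; a minimal sketch (30000 ms mirrors the later default, the node address is a placeholder):

import org.redisson.config.Config;

public class KeepAliveConfig {

    public static Config build() {
        Config config = new Config();
        // pingConnectionInterval is in milliseconds; 0 disables the periodic PING.
        config.useClusterServers()
              .addNodeAddress("redis://XXXXXX:6379")
              .setPingConnectionInterval(30000);
        return config;
    }
}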