Avoid or reduce connection attempts to nodes in fail state
Hi,
We are running Lettuce 4.3.0.Final against a Redis Cluster with 3 nodes. Each node runs 10 Redis processes: the processes on ports 9000…9004 are masters, those on 9005…9009 are slaves. So we have 15 masters and 15 slaves in total.
Lettuce was configured to use the ReadFromSlave read implementation. I have since figured out that it’s a dangerous option, which led to the proposal in https://github.com/mp911de/lettuce/issues/452
Additionally, Lettuce is configured with a very small request timeout (5 ms) and a large connection timeout (around 1 s).
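For context, a minimal sketch of such a setup, assuming the Lettuce 4.3 API (the host name and the exact builder calls are illustrative, not our production code):

```java
import java.util.concurrent.TimeUnit;

import com.lambdaworks.redis.ReadFrom;
import com.lambdaworks.redis.RedisURI;
import com.lambdaworks.redis.SocketOptions;
import com.lambdaworks.redis.cluster.ClusterClientOptions;
import com.lambdaworks.redis.cluster.RedisClusterClient;
import com.lambdaworks.redis.cluster.api.StatefulRedisClusterConnection;

public class ClusterSetup {

    public static void main(String[] args) {
        RedisClusterClient client = RedisClusterClient.create(
                RedisURI.Builder.redis("redis-node-1", 9000).build()); // hypothetical host

        // Large TCP connect timeout (~1 s), as described above.
        client.setOptions(ClusterClientOptions.builder()
                .socketOptions(SocketOptions.builder()
                        .connectTimeout(1, TimeUnit.SECONDS)
                        .build())
                .build());

        StatefulRedisClusterConnection<String, String> connection = client.connect();
        connection.setReadFrom(ReadFrom.SLAVE);          // route reads to slaves only
        connection.setTimeout(5, TimeUnit.MILLISECONDS); // very small request timeout

        System.out.println(connection.sync().get("key"));
    }
}
```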
We lost one node of the cluster. ReadFromSlave forces reads from slaves, which were no longer available. This led to the exceptions provided in the gist; the PartitionsException.txt file there contains the full partitions state.
Problem: A lot of threads get stuck trying to set up a connection, in a blocking manner, to nodes which
- already have a fail state per the Redis cluster, and Lettuce is aware of it
- constantly fail with a ConnectTimeoutException on every connection attempt
So Lettuce keeps allowing many incoming threads that call get() to try to establish a connection to a dead node, each blocking for the full, large connection timeout. In our case this brought the server down.
Proposal:
If Lettuce sees that a node has a fail state in the Redis cluster and the node is actually unavailable based on previous attempts, don’t allow many threads to establish new connections and block on them. Instead, quickly throw “RedisException: Cannot determine a partition to read” in nearly all such cases. For example, allow only one thread at a time to establish a connection to a node that is currently dead.
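A minimal sketch of such a per-node gate (purely illustrative, not Lettuce code; all names are made up):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical gate: at most one thread per node may block on a connection
// attempt; every other thread fails fast instead of piling up.
class NodeConnectionGate {

    private final ConcurrentHashMap<String, AtomicBoolean> inFlight = new ConcurrentHashMap<>();

    <T> T connect(String hostAndPort, Supplier<T> connector) {
        AtomicBoolean gate = inFlight.computeIfAbsent(hostAndPort, k -> new AtomicBoolean());
        if (!gate.compareAndSet(false, true)) {
            // Another thread is already blocked on this node: fail fast.
            throw new IllegalStateException("Cannot determine a partition to read: " + hostAndPort);
        }
        try {
            return connector.get(); // the single blocking connection attempt
        } finally {
            gate.set(false);
        }
    }
}
```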
At the moment, I see roughly one quick “RedisException: connection timed out” for every long, blocking ConnectTimeoutException.
WDYT?
Top GitHub Comments
Removing the failed future must happen after completion, exactly once. I think it’s a synchronization issue. OK, then we’ve found a viable solution.
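A minimal sketch of that exactly-once removal, assuming the connections are cached as futures in a ConcurrentHashMap (the type and method names here are assumptions):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only; K stands in for Lettuce's ConnectionKey, C for the connection type.
class FutureCache<K, C> {

    private final ConcurrentHashMap<K, CompletableFuture<C>> connections = new ConcurrentHashMap<>();

    void removeOnFailure(K key, CompletableFuture<C> future) {
        // Runs after completion; the two-argument remove(key, future) only removes
        // this exact future, so the removal takes effect exactly once and never
        // clobbers a newer future installed by a later retry.
        future.whenComplete((connection, throwable) -> {
            if (throwable != null) {
                connections.remove(key, future);
            }
        });
    }
}
```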
Let’s keep the quiet time after connection failure separate. That’s something which can be built on top of RedisClusterClient; it requires slight adjustments to the visibility of the connectStateful and connectStatefulAsync methods. This way you can keep track of failed connections per host and cache the resulting future. ClusterNodeConnectionFactory.getOrCreateConnection(…) is per ConnectionKey, which also incorporates the connection intent (read/write), but in your case you want to group connections by host/port.

Good catch, I created #460 for the mentioned bug. Thanks @Spikhalskiy for digging into the issue.
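For illustration, per-host failure tracking with a quiet time could look roughly like this (all names are assumptions; a real integration would need the visibility adjustments mentioned above):

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-host bookkeeping: after a connect failure, suppress new
// attempts to that host for a quiet period instead of letting callers block.
class QuietPeriodTracker {

    private static final long QUIET_PERIOD_MS = 1_000;

    private final ConcurrentHashMap<String, Long> lastFailure = new ConcurrentHashMap<>();

    boolean allowAttempt(String host) {
        Long failedAt = lastFailure.get(host);
        return failedAt == null || System.currentTimeMillis() - failedAt > QUIET_PERIOD_MS;
    }

    void recordFailure(String host) {
        lastFailure.put(host, System.currentTimeMillis());
    }

    void recordSuccess(String host) {
        lastFailure.remove(host);
    }
}
```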
I think there are many approaches that would work. I like the approach of synchronizing with a CompletableFuture because it follows non-blocking connection initialization. I see two things here. These are different problems, but still related. For now, I’d like to solve your issue and reduce multiple connection attempts to the same ConnectionKey to at most one.

I think the change isn’t huge:
1. ClusterNodeConnectionFactory would return a CompletableFuture<StatefulRedisConnection<K, V>> so the connection happens asynchronously.
2. A ConcurrentHashMap synchronizes on ConnectionKey to guarantee only one connection attempt; getOrCreateConnection() returns early with a future that is used to synchronize.
3. Remove the ConnectionKey exactly once from the map and propagate the connection exception.

The ConcurrentHashMap should be encapsulated in its own type to make the underlying concept clearer; a sketch of these steps follows below. Does this make sense?
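A sketch of steps 1–3 with stand-in types (ConnectionKey is reduced to a generic key, and connectAsync is a placeholder for the real non-blocking connect); the map lives inside its own type, as suggested:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the outlined change; K stands in for ConnectionKey, C for
// StatefulRedisConnection<K, V>.
class AsyncConnectionProvider<K, C> {

    private final ConcurrentHashMap<K, CompletableFuture<C>> connections = new ConcurrentHashMap<>();

    CompletableFuture<C> getOrCreateConnection(K key) {
        // Step 2: computeIfAbsent guarantees at most one connection attempt per key;
        // step 1: callers receive a future to synchronize on instead of blocking.
        CompletableFuture<C> future = connections.computeIfAbsent(key, this::connectAsync);

        // Step 3: remove the failed future exactly once after completion and let the
        // future propagate the connection exception; remove(key, future) is a no-op
        // if a newer future has already replaced this one.
        future.whenComplete((connection, throwable) -> {
            if (throwable != null) {
                connections.remove(key, future);
            }
        });
        return future;
    }

    // Placeholder; the real implementation would open a connection asynchronously.
    private CompletableFuture<C> connectAsync(K key) {
        return new CompletableFuture<>();
    }
}
```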