Introduce `nodeFilter` `Predicate` to filter `Partitions`
Bug Report
Current Behavior
When Lettuce is connected to a 2-node Redis Cluster (1 shard, 1 replica) and is configured for REPLICA_PREFERRED, and the replica stops responding to TCP (for example, after an ungraceful hardware failure), Lettuce does not recover until the TCP retry counter expires for that connection (~926 seconds).
Input Code
(Assumes the hostname `redis` resolves to all nodes in the group)
import io.lettuce.core.cluster.*;
import io.lettuce.core.cluster.api.sync.*;
import io.lettuce.core.cluster.api.*;
import io.lettuce.core.*;
import java.util.concurrent.TimeUnit;
import java.time.Duration;

public class RedisExample {
    public static void main(String[] args) {
        RedisURI redisUri = RedisURI.Builder.redis("redis").build();
        RedisClusterClient clusterClient = RedisClusterClient.create(redisUri);

        // Refresh the topology every 60 seconds and on all adaptive triggers.
        ClusterTopologyRefreshOptions topologyRefreshOptions = ClusterTopologyRefreshOptions
                .builder()
                .enablePeriodicRefresh(60, TimeUnit.SECONDS)
                .enableAllAdaptiveRefreshTriggers()
                .dynamicRefreshSources(true)
                .closeStaleConnections(true)
                .build();

        // Fail commands that take longer than 400 ms.
        TimeoutOptions timeoutOptions = TimeoutOptions
                .builder()
                .timeoutCommands()
                .fixedTimeout(Duration.ofMillis(400))
                .build();

        SocketOptions socketOptions = SocketOptions
                .builder()
                .connectTimeout(500, TimeUnit.MILLISECONDS)
                .build();

        clusterClient.setOptions(ClusterClientOptions
                .builder()
                .autoReconnect(true)
                .socketOptions(socketOptions)
                .cancelCommandsOnReconnectFailure(true)
                .timeoutOptions(timeoutOptions)
                .disconnectedBehavior(ClientOptions.DisconnectedBehavior.REJECT_COMMANDS)
                .topologyRefreshOptions(topologyRefreshOptions)
                .validateClusterNodeMembership(true)
                .suspendReconnectOnProtocolFailure(true)
                .build());

        StatefulRedisClusterConnection<String, String> connection = clusterClient.connect();
        RedisAdvancedClusterCommands<String, String> syncCommands = connection.sync();
        connection.setReadFrom(ReadFrom.REPLICA_PREFERRED);

        // Write the expected value once, then read it back every second.
        String value2 = "bar";
        syncCommands.set("foo", value2);
        while (true) {
            try {
                Thread.sleep(1000);
                String value = syncCommands.get("foo");
                if (!value.equals(value2)) {
                    System.out.println("Bad response: " + value + ":" + value2 + ":");
                } else {
                    System.out.println("Good response: " + value);
                }
            } catch (Exception e) {
                System.out.println("Error response: " + e);
            }
        }
    }
}
Expected behavior/code
When the timeout is reached and a dynamic topology refresh is triggered, connections to the node in “fail?” state should be considered stale and closed / abandoned.
Environment
- Lettuce version(s): 6.1.5.RELEASE
- Redis server v=6.2.6 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=3f28004270edf9dc
- OpenJDK Runtime Environment (build 1.8.0_312-b07)
- Amazon Linux 2, 5.10.75-79.358.amzn2.x86_64, t3a.small
Possible Solution
A workaround, undesirable as it is global-to-the-node in nature, is to shorten the TCP retry counter on the client:
echo 5 >/proc/sys/net/ipv4/tcp_retries2
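To see roughly where the ~926 seconds comes from and what lowering `tcp_retries2` buys, here is a minimal sketch of the retransmission arithmetic. It assumes the Linux defaults of a 200 ms minimum and 120 s maximum retransmission timeout and a default `tcp_retries2` of 15; the class and method names are made up for illustration, and the real figure varies with the measured round-trip time of the connection.

```java
import java.time.Duration;

/** Rough estimate of how long Linux keeps retransmitting before giving up. */
class TcpRetryTimeout {
    // Assumed kernel constants (not taken from the report):
    // minimum RTO of 200 ms and maximum RTO of 120 s.
    static final long RTO_MIN_MILLIS = 200;
    static final long RTO_MAX_MILLIS = 120_000;

    /**
     * Sums the wait after the initial transmission and after each of the
     * `retries` retransmissions. The RTO doubles every time, capped at the maximum.
     */
    static Duration hypotheticalTimeout(int retries) {
        long totalMillis = 0;
        long rto = RTO_MIN_MILLIS;
        for (int i = 0; i <= retries; i++) {
            totalMillis += rto;
            rto = Math.min(rto * 2, RTO_MAX_MILLIS);
        }
        return Duration.ofMillis(totalMillis);
    }

    public static void main(String[] args) {
        System.out.println("tcp_retries2=15: " + hypotheticalTimeout(15).toMillis() + " ms");
        System.out.println("tcp_retries2=5:  " + hypotheticalTimeout(5).toMillis() + " ms");
    }
}
```

Under these assumptions the default works out to about 924.6 seconds, close to the observed value, while `tcp_retries2=5` brings it down to about 12.6 seconds.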
At first glance it looks like adding a filter for failed/eventual_fail nodes at https://github.com/lettuce-io/lettuce-core/blob/cda3be6b9477da790365ad098c6e39c8687f5002/src/main/java/io/lettuce/core/cluster/topology/DefaultClusterTopologyRefresh.java#L292-L296 would cap the duration of the failure scenario at the periodic-topology-refresh interval.
Additional context
A tcpdump capture clearly shows that the client receives a refreshed topology, but the client does not recover until the existing TCP connection exits.
17:58:54.396570 IP 10.0.0.41.6379 > 10.0.0.218.46894: Flags [P.], seq 150:428, ack 117, win 490, options [nop,nop,TS val 3343612451 ecr 121987982], length 278: RESP "=270" "txt:e03b0b3b56ca33dc759fb6a122a903c7ac47d8f7 10.0.0.41:6379@16379 myself,master - 0 0 0 connected 0-16383" "215c649d39c0182c82aec8fc7e533cd57c052b9a 10.0.0.101:6379@16379 slave,fail? e03b0b3b56ca33dc759fb6a122a903c7ac47d8f7 1640109483742 1640109482738 0 connected"
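The `fail?` flag in the captured reply is the signal a node filter would act on. The sketch below is not Lettuce code: it uses only the JDK and a hypothetical `Node` class to show the idea of parsing the flags field of each `CLUSTER NODES` line and dropping nodes flagged `fail` or `fail?`. In Lettuce these flags correspond to `RedisClusterNode.NodeFlag.FAIL` and `RedisClusterNode.NodeFlag.EVENTUAL_FAIL`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class ClusterNodesFilter {

    /** Hypothetical, minimal view of one line of CLUSTER NODES output. */
    static final class Node {
        final String id;
        final String address;
        final Set<String> flags;

        Node(String id, String address, Set<String> flags) {
            this.id = id;
            this.address = address;
            this.flags = flags;
        }
    }

    /** Fields are space-separated: id, address, comma-separated flags, and so on. */
    static Node parseLine(String line) {
        String[] fields = line.trim().split("\\s+");
        Set<String> flags = new HashSet<>(Arrays.asList(fields[2].split(",")));
        return new Node(fields[0], fields[1], flags);
    }

    /** Accepts nodes that are flagged neither "fail" nor "fail?". */
    static final Predicate<Node> NOT_FAILING =
            node -> !(node.flags.contains("fail") || node.flags.contains("fail?"));

    /** Parses every line and keeps only the nodes that pass the filter. */
    static List<Node> usable(List<String> lines) {
        return lines.stream()
                .map(ClusterNodesFilter::parseLine)
                .filter(NOT_FAILING)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The two node lines from the captured reply, without the "txt:" prefix.
        List<String> lines = Arrays.asList(
                "e03b0b3b56ca33dc759fb6a122a903c7ac47d8f7 10.0.0.41:6379@16379 myself,master - 0 0 0 connected 0-16383",
                "215c649d39c0182c82aec8fc7e533cd57c052b9a 10.0.0.101:6379@16379 slave,fail? e03b0b3b56ca33dc759fb6a122a903c7ac47d8f7 1640109483742 1640109482738 0 connected");
        for (Node node : usable(lines)) {
            System.out.println(node.address + " " + node.flags);
        }
    }
}
```

With the two lines from the capture, only the master at 10.0.0.41 passes the filter, so a client applying such a filter to its topology view would stop routing reads to the unreachable replica.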
Failure of the replica node was simulated by dropping all Redis packets on the replica:
$ for x in INPUT OUTPUT; do for y in 6379 16379; do iptables -I $x -p tcp --dport $y -j DROP; done; done
Redis.conf contains:
port 6379
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
maxmemory 1gb
TCP keepalives do not help here, as the connection is not idle.
Issue Analytics
- Created 2 years ago
- Comments: 7 (3 by maintainers)
Top GitHub Comments
I tested reworking https://github.com/lettuce-io/lettuce-core/blob/4110f2820766c4967951639aa2b6bdd9d50466be/src/main/java/io/lettuce/core/cluster/RedisClusterClient.java#L1025-L1032 to filter out FAIL and EVENTUAL_FAIL nodes and achieved the same recovery after the periodic-topology-refresh interval.
Adding
.nodeFilter(it -> ! (it.is(RedisClusterNode.NodeFlag.FAIL) || it.is(RedisClusterNode.NodeFlag.EVENTUAL_FAIL)))
gave me the desired behavior.