Introduce `nodeFilter` `Predicate` to filter `Partitions`

Bug Report

Current Behavior

When Lettuce is connected to a two-node Redis Cluster (1 shard, 1 replica), is configured for REPLICA_PREFERRED, and the replica stops responding at the TCP level (for example, after an ungraceful hardware failure), Lettuce does not recover until the TCP retry counter expires for that connection (~926 seconds).

Input Code

(Assumes the hostname `redis` resolves to all nodes in the group)

import io.lettuce.core.cluster.*;
import io.lettuce.core.cluster.api.sync.*;
import io.lettuce.core.cluster.api.*;
import io.lettuce.core.*;
import java.util.concurrent.TimeUnit;
import java.time.Duration;
 
public class RedisExample {
    public static void main(String[] args) {
        RedisURI redisUri = RedisURI.Builder.redis("redis").build();
 
        RedisClusterClient clusterClient = RedisClusterClient.create(redisUri);
 
        ClusterTopologyRefreshOptions topologyRefreshOptions = ClusterTopologyRefreshOptions
            .builder()
            .enablePeriodicRefresh(60, TimeUnit.SECONDS)
            .enableAllAdaptiveRefreshTriggers()
            .dynamicRefreshSources(true)
            .closeStaleConnections(true)
            .build();

        TimeoutOptions timeoutOptions = TimeoutOptions
            .builder()
            .timeoutCommands()
            .fixedTimeout(Duration.ofMillis(400))
            .build();
 
        SocketOptions socketOptions = SocketOptions
          .builder()
          .connectTimeout(500, TimeUnit.MILLISECONDS)
          .build();
 
        clusterClient.setOptions(ClusterClientOptions
            .builder()
            .autoReconnect(true)
            .socketOptions(socketOptions)
            .cancelCommandsOnReconnectFailure(true)
            .timeoutOptions(timeoutOptions)
            .disconnectedBehavior(ClientOptions.DisconnectedBehavior.REJECT_COMMANDS)
            .topologyRefreshOptions(topologyRefreshOptions)
            .validateClusterNodeMembership(true)
            .suspendReconnectOnProtocolFailure(true)
            .build());
 
        StatefulRedisClusterConnection<String, String> connection = clusterClient.connect();
        RedisAdvancedClusterCommands<String, String> syncCommands = connection.sync();
        connection.setReadFrom(ReadFrom.REPLICA_PREFERRED);
 
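        // Write a known value once, then poll it every second.
        String expected = "bar";
        syncCommands.set("foo", expected);

        while (true) {
            try {
                Thread.sleep(1000);
                String value = syncCommands.get("foo");
                // expected.equals(value) also copes with a null reply
                if (!expected.equals(value)) {
                    System.out.println("Bad response: " + value + " (expected " + expected + ")");
                } else {
                    System.out.println("Good response: " + value);
                }
            } catch (Exception e) {
                // Command timeouts and rejected commands end up here
                System.out.println("Error response: " + e);
            }
        }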
    }
}

Expected behavior/code

When the timeout is reached and a dynamic topology refresh is triggered, connections to the node in “fail?” state should be considered stale and closed / abandoned.

Environment

Lettuce version(s): 6.1.5.RELEASE
Redis server v=6.2.6 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=3f28004270edf9dc
OpenJDK Runtime Environment (build 1.8.0_312-b07)
Amazon Linux 2, 5.10.75-79.358.amzn2.x86_64, t3a.small

Possible Solution

A workaround, undesirable because it applies to every TCP connection on the client host rather than only to the Redis connections, is to lower the TCP retry counter on the client:

echo 5 >/proc/sys/net/ipv4/tcp_retries2
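
To see why lowering `tcp_retries2` shortens the outage, the arithmetic can be sketched as follows. This is an estimate, not a measurement: it assumes the Linux defaults of a 200 ms minimum and 120 s maximum retransmission timeout (RTO), and that the RTO starts at the minimum and doubles on each retransmission. The real RTO depends on the measured round-trip time, so the result is only a lower bound. The class and method names are made up for illustration.

```java
class TcpRetryTimeout {
    // Assumed Linux defaults (not taken from the report): RTO bounds in milliseconds.
    static final long RTO_MIN_MILLIS = 200;
    static final long RTO_MAX_MILLIS = 120_000;

    /**
     * Sums the waits for the initial transmission plus tcpRetries2 retransmissions,
     * doubling the RTO each time and capping it at RTO_MAX_MILLIS.
     */
    static long hypotheticalTimeoutMillis(int tcpRetries2) {
        long total = 0;
        long rto = RTO_MIN_MILLIS;
        for (int i = 0; i <= tcpRetries2; i++) {
            total += rto;
            rto = Math.min(rto * 2, RTO_MAX_MILLIS);
        }
        return total;
    }

    public static void main(String[] args) {
        // Default tcp_retries2 = 15
        System.out.println(hypotheticalTimeoutMillis(15) / 1000.0 + " s"); // prints "924.6 s"
        // Workaround value tcp_retries2 = 5
        System.out.println(hypotheticalTimeoutMillis(5) / 1000.0 + " s");  // prints "12.6 s"
    }
}
```

Under these assumptions the default of 15 gives roughly 925 seconds, in line with the ~926 seconds observed, while a value of 5 brings the bound down to about 13 seconds.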

At first glance it looks like adding a filter for failed/eventual_fail nodes at https://github.com/lettuce-io/lettuce-core/blob/cda3be6b9477da790365ad098c6e39c8687f5002/src/main/java/io/lettuce/core/cluster/topology/DefaultClusterTopologyRefresh.java#L292-L296 would cap the duration of the failure scenario at the periodic-topology-refresh interval.
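
The idea of such a filter can be illustrated without Lettuce. The sketch below is a standalone model, not Lettuce's implementation: `Node`, `parse`, and `usableAddresses` are hypothetical names, and the input lines are the two `CLUSTER NODES` entries from the packet capture in this report. It drops any node whose flags include `fail` or `fail?`, so only the healthy node stays a candidate for connections.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class NodeFilterSketch {

    /** Hypothetical stand-in for a cluster node; this is not a Lettuce type. */
    static final class Node {
        final String id;
        final String address;
        final Set<String> flags;

        Node(String id, String address, Set<String> flags) {
            this.id = id;
            this.address = address;
            this.flags = flags;
        }
    }

    /** Parses one CLUSTER NODES line: "<id> <ip:port@cport> <flag,flag,...> ...". */
    static Node parse(String line) {
        String[] fields = line.trim().split("\\s+");
        Set<String> flags = new HashSet<>(Arrays.asList(fields[2].split(",")));
        return new Node(fields[0], fields[1], flags);
    }

    /** Accepts only nodes that are flagged neither "fail" nor "fail?". */
    static final Predicate<Node> NOT_FAILING =
            node -> !(node.flags.contains("fail") || node.flags.contains("fail?"));

    /** Returns the addresses of the nodes that pass the filter. */
    static List<String> usableAddresses(List<String> clusterNodesLines) {
        return clusterNodesLines.stream()
                .map(NodeFilterSketch::parse)
                .filter(NOT_FAILING)
                .map(node -> node.address)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "e03b0b3b56ca33dc759fb6a122a903c7ac47d8f7 10.0.0.41:6379@16379 myself,master - 0 0 0 connected 0-16383",
                "215c649d39c0182c82aec8fc7e533cd57c052b9a 10.0.0.101:6379@16379 slave,fail? e03b0b3b56ca33dc759fb6a122a903c7ac47d8f7 1640109483742 1640109482738 0 connected");
        System.out.println(usableAddresses(lines)); // prints [10.0.0.41:6379@16379]
    }
}
```

With the replica filtered out of the topology view, reads under REPLICA_PREFERRED would have only the master left to fall back to, which matches the recovery described in this report.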

Additional context

tcpdump clearly shows the client receiving a topology refresh, but the client does not recover until the existing TCP connection is torn down.

17:58:54.396570 IP 10.0.0.41.6379 > 10.0.0.218.46894: Flags [P.], seq 150:428, ack 117, win 490, options [nop,nop,TS val 3343612451 ecr 121987982], length 278: RESP "=270" "txt:e03b0b3b56ca33dc759fb6a122a903c7ac47d8f7 10.0.0.41:6379@16379 myself,master - 0 0 0 connected 0-16383" "215c649d39c0182c82aec8fc7e533cd57c052b9a 10.0.0.101:6379@16379 slave,fail? e03b0b3b56ca33dc759fb6a122a903c7ac47d8f7 1640109483742 1640109482738 0 connected"

Failure of the replica node was simulated by dropping all Redis packets on the replica:

$ for x in INPUT OUTPUT; do for y in 6379 16379; do iptables -I $x -p tcp --dport $y -j DROP; done; done

Redis.conf contains:

port 6379
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
maxmemory 1gb

TCP keepalives do not help here because the connection is not idle.

Top GitHub Comments

jhmartin commented, Jan 4, 2022

I tested reworking https://github.com/lettuce-io/lettuce-core/blob/4110f2820766c4967951639aa2b6bdd9d50466be/src/main/java/io/lettuce/core/cluster/RedisClusterClient.java#L1025-L1032 to filter out FAIL and EVENTUAL_FAIL nodes and achieved the same recovery after the periodic-topology-refresh interval.

jhmartin commented, Jan 7, 2022

Adding `.nodeFilter(it -> !(it.is(RedisClusterNode.NodeFlag.FAIL) || it.is(RedisClusterNode.NodeFlag.EVENTUAL_FAIL)))` gave me the desired behavior.
