
Introduce `nodeFilter` `Predicate` to filter `Partitions`


Bug Report

Current Behavior

When Lettuce is connected to a 2-node Redis Cluster (1 shard, 1 replica) and is configured for REPLICA_PREFERRED, and the replica stops responding to TCP (such as after an ungraceful hardware failure), Lettuce does not recover until the TCP retry counter expires for that connection (~926 seconds).
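The ~926-second figure matches Linux's default retransmission behavior: with an initial RTO of roughly 200 ms, exponential backoff capped at 120 s, and the default tcp_retries2 = 15, the kernel gives up after roughly 924.6 seconds. A back-of-the-envelope sketch (the 200 ms initial RTO is an assumption; the real value depends on the measured RTT):

```java
public class TcpRetryTimeout {
    public static void main(String[] args) {
        double rto = 0.2;            // assumed initial retransmission timeout (seconds)
        final double rtoMax = 120.0; // Linux caps the RTO at TCP_RTO_MAX (120 s)
        final int retries = 15;      // tcp_retries2 default

        // Sum the exponentially backed-off retransmission intervals.
        double total = 0.0;
        for (int i = 0; i <= retries; i++) {
            total += Math.min(rto, rtoMax);
            rto *= 2;
        }
        System.out.printf("%.1f s%n", total); // prints 924.6 s
    }
}
```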

Input Code

(Assumes the hostname `redis` resolves to all nodes in the cluster.)
import io.lettuce.core.*;
import io.lettuce.core.cluster.*;
import io.lettuce.core.cluster.api.*;
import io.lettuce.core.cluster.api.sync.*;
import java.time.Duration;

public class RedisExample {
    public static void main(String[] args) {
        RedisURI redisUri = RedisURI.Builder.redis("redis").build();

        RedisClusterClient clusterClient = RedisClusterClient.create(redisUri);

        ClusterTopologyRefreshOptions topologyRefreshOptions = ClusterTopologyRefreshOptions
            .builder()
            .enablePeriodicRefresh(Duration.ofSeconds(60))
            .enableAllAdaptiveRefreshTriggers()
            .dynamicRefreshSources(true)
            .closeStaleConnections(true)
            .build();

        TimeoutOptions timeoutOptions = TimeoutOptions
            .builder()
            .timeoutCommands()
            .fixedTimeout(Duration.ofMillis(400))
            .build();

        SocketOptions socketOptions = SocketOptions
            .builder()
            .connectTimeout(Duration.ofMillis(500))
            .build();

        clusterClient.setOptions(ClusterClientOptions
            .builder()
            .autoReconnect(true)
            .socketOptions(socketOptions)
            .cancelCommandsOnReconnectFailure(true)
            .timeoutOptions(timeoutOptions)
            .disconnectedBehavior(ClientOptions.DisconnectedBehavior.REJECT_COMMANDS)
            .topologyRefreshOptions(topologyRefreshOptions)
            .validateClusterNodeMembership(true)
            .suspendReconnectOnProtocolFailure(true)
            .build());

        StatefulRedisClusterConnection<String, String> connection = clusterClient.connect();
        connection.setReadFrom(ReadFrom.REPLICA_PREFERRED);
        RedisAdvancedClusterCommands<String, String> syncCommands = connection.sync();

        syncCommands.set("foo", "bar");

        while (true) {
            try {
                Thread.sleep(1000);
                String value = syncCommands.get("foo");
                if (!"bar".equals(value)) {
                    System.out.println("Bad response: " + value);
                } else {
                    System.out.println("Good response: " + value);
                }
            } catch (Exception e) {
                System.out.println("Error response: " + e);
            }
        }
    }
}

Expected behavior/code

When the timeout is reached and a dynamic topology refresh is triggered, connections to the node in “fail?” state should be considered stale and closed / abandoned.

Environment

  • Lettuce version(s): 6.1.5.RELEASE
  • Redis server v=6.2.6 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=3f28004270edf9dc
  • OpenJDK Runtime Environment (build 1.8.0_312-b07)
  • Amazon Linux 2, 5.10.75-79.358.amzn2.x86_64, t3a.small

Possible Solution

A workaround, though undesirable because it applies host-wide rather than per-connection, is to shorten the TCP retry counter on the client (the Linux default of tcp_retries2 = 15 corresponds to roughly 925 seconds of retransmissions):

echo 5 >/proc/sys/net/ipv4/tcp_retries2

At first glance it looks like adding a filter for failed/eventual_fail nodes at https://github.com/lettuce-io/lettuce-core/blob/cda3be6b9477da790365ad098c6e39c8687f5002/src/main/java/io/lettuce/core/cluster/topology/DefaultClusterTopologyRefresh.java#L292-L296 would cap the duration of the failure scenario at the periodic-topology-refresh interval.
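The proposed filter boils down to dropping nodes with failure flags before the topology view is built. A minimal sketch of the idea, using hypothetical stand-in types for Lettuce's RedisClusterNode and NodeFlag (the real implementation would operate on Partitions):

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical stand-ins for Lettuce's RedisClusterNode and NodeFlag,
// used only to illustrate Predicate-based partition filtering.
public class NodeFilterSketch {

    enum NodeFlag { MYSELF, MASTER, SLAVE, FAIL, EVENTUAL_FAIL }

    static class Node {
        final String id;
        final Set<NodeFlag> flags;

        Node(String id, Set<NodeFlag> flags) {
            this.id = id;
            this.flags = flags;
        }

        boolean is(NodeFlag flag) {
            return flags.contains(flag);
        }
    }

    // Keep only the nodes the predicate accepts; filtered-out nodes never
    // enter the topology view, so no connections to them are retained.
    static List<Node> filterPartitions(List<Node> nodes, Predicate<Node> nodeFilter) {
        return nodes.stream().filter(nodeFilter).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Node> topology = List.of(
            new Node("master-1", Set.of(NodeFlag.MYSELF, NodeFlag.MASTER)),
            new Node("replica-1", Set.of(NodeFlag.SLAVE, NodeFlag.EVENTUAL_FAIL)));

        // Drop nodes in "fail" or "fail?" state.
        Predicate<Node> healthy =
            it -> !(it.is(NodeFlag.FAIL) || it.is(NodeFlag.EVENTUAL_FAIL));

        System.out.println(filterPartitions(topology, healthy).size()); // prints 1
    }
}
```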

Additional context

A tcpdump capture clearly shows the client receiving a topology refresh, but the client does not recover until the existing TCP connection is torn down.

17:58:54.396570 IP 10.0.0.41.6379 > 10.0.0.218.46894: Flags [P.], seq 150:428, ack 117, win 490, options [nop,nop,TS val 3343612451 ecr 121987982], length 278: RESP "=270" "txt:e03b0b3b56ca33dc759fb6a122a903c7ac47d8f7 10.0.0.41:6379@16379 myself,master - 0 0 0 connected 0-16383" "215c649d39c0182c82aec8fc7e533cd57c052b9a 10.0.0.101:6379@16379 slave,fail? e03b0b3b56ca33dc759fb6a122a903c7ac47d8f7 1640109483742 1640109482738 0 connected"

Failure of the replica node was simulated by dropping all Redis packets on the replica:

$ for x in INPUT OUTPUT; do for y in 6379 16379; do iptables -I $x -p tcp --dport $y -j DROP; done; done

Redis.conf contains:

port 6379
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
maxmemory 1gb

TCP keepalives do not help here because the connection is not idle.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

1 reaction
jhmartin commented, Jan 4, 2022

I tested reworking https://github.com/lettuce-io/lettuce-core/blob/4110f2820766c4967951639aa2b6bdd9d50466be/src/main/java/io/lettuce/core/cluster/RedisClusterClient.java#L1025-L1032 to filter out FAIL and EVENTUAL_FAIL nodes and achieved the same recovery after the periodic-topology-refresh interval.

0 reactions
jhmartin commented, Jan 7, 2022

Adding .nodeFilter(it -> ! (it.is(RedisClusterNode.NodeFlag.FAIL) || it.is(RedisClusterNode.NodeFlag.EVENTUAL_FAIL))) gained me the desired behavior.
