Make adaptive topology refresh better usable for failover/master-slave promotion changes
I have a Redis cluster with 3 shards. Each shard has 2 nodes, 1 primary and 1 replica. I'm using lettuce 4.3.2.Final, and the following is the configuration I'm using to create the Redis client.
redisClusterClient = RedisClusterClient.create(RedisURI.builder()
        .withHost(hostName)
        .withPort(port)
        .withTimeout(timeout, TimeUnit.MILLISECONDS)
        .build());

redisClusterClient.setOptions(ClusterClientOptions.builder()
        .autoReconnect(true)
        .cancelCommandsOnReconnectFailure(true)
        .disconnectedBehavior(ClientOptions.DisconnectedBehavior.REJECT_COMMANDS)
        .topologyRefreshOptions(ClusterTopologyRefreshOptions.builder()
                .enableAllAdaptiveRefreshTriggers()
                .build())
        .build());

redisClusterConnection = new SlaveReadingLettuceClusterConnection(redisClusterClient, enableCompression);
Inside SlaveReadingLettuceClusterConnection:
StatefulRedisClusterConnection<byte[], byte[]> connection;
if (enableCompression) {
    connection = clusterClient.connect(
            CompressionCodec.valueCompressor(new ByteArrayCodec(), CompressionCodec.CompressionType.GZIP));
} else {
    connection = (StatefulRedisClusterConnection<byte[], byte[]>) super.doGetAsyncDedicatedConnection();
}
connection.setReadFrom(ReadFrom.SLAVE);
return connection;
So I have all adaptive refresh triggers enabled and am not specifying any periodic refresh for the topology. We recently had an issue where one of the primary nodes in a shard of the cluster had a problem, which triggered a failover. The shard had two nodes, 001 (primary) and 002 (replica). 001 failed over and 002 became primary. When 001 recovered, it became a replica. My assumption was that the adaptive refresh triggers would kick in and update the topology upon recovery. That didn't happen; this is the partitions/topology view of the Redis client that was being printed in exceptions:
Partitions [
RedisClusterNodeSnapshot [
uri=RedisURI [host=*.*.*.*, port=****],
nodeId=*****************************************,
connected=true,
slaveOf='null',
pingSentTimestamp=0,
pongReceivedTimestamp=1513721289619,
configEpoch=0,
flags=[MASTER],
slot count=5461],
RedisClusterNodeSnapshot [
uri=RedisURI [host=*.*.*.*, port=****],
nodeId=*****************************************,
connected=true,
slaveOf='*****************************************,',
pingSentTimestamp=0,
pongReceivedTimestamp=1513721290120,
configEpoch=2,
flags=[SLAVE],
slot count=0],
RedisClusterNodeSnapshot [
uri=RedisURI [host=*.*.*.*, port=****],
nodeId=*****************************************,
connected=true,
slaveOf='*****************************************,',
pingSentTimestamp=0,
pongReceivedTimestamp=1513721291124,
configEpoch=0,
flags=[SLAVE],
slot count=0],
RedisClusterNodeSnapshot [
uri=RedisURI [host=*.*.*.*, port=****],
nodeId=*****************************************,
connected=true,
slaveOf='null',
pingSentTimestamp=0,
pongReceivedTimestamp=0,
configEpoch=2,
flags=[MYSELF, MASTER],
slot count=5462],
RedisClusterNodeSnapshot [
uri=RedisURI [host=*.*.*.*, port=****],
nodeId=*****************************************,
connected=true,
slaveOf='null',
pingSentTimestamp=0,
pongReceivedTimestamp=1513721292129,
configEpoch=3,
flags=[MASTER],
slot count=5461],
RedisClusterNodeSnapshot [
uri=RedisURI [host=*.*.*.*, port=****],
nodeId=*****************************************,
connected=true,
slaveOf='null',
pingSentTimestamp=0,
pongReceivedTimestamp=1513721291051,
configEpoch=3,
flags=[MASTER],
slot count=0]]
4 out of 6 nodes above are marked as master, while there were actually only 3 masters. So in the troubled shard, both nodes were recognized as primary by the Redis client. Since we had configured the read policy as SLAVE, it was throwing the exception Cannot determine a partition to read for slot ****. Even though one node had recovered and become a replica, the topology had not been refreshed.
P.S. We are on an AWS setup: the Redis cluster was AWS ElastiCache, and our application was deployed on AWS Elastic Beanstalk (Java, Tomcat stack). The EB environment had 15 EC2 instances configured behind an elastic load balancer, and we faced the issue on only 2 of those instances.
The quick fix we applied was to update to lettuce 4.4.1 and use the read policy SLAVE_PREFERRED. But we are not sure why the adaptive refresh triggers didn't work.
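For reference, a minimal sketch of that quick fix, assuming the lettuce 4.4+ API; hostName, port, and timeout are the same placeholders as in the original configuration above:

RedisClusterClient client = RedisClusterClient.create(RedisURI.builder()
        .withHost(hostName)
        .withPort(port)
        .withTimeout(timeout, TimeUnit.MILLISECONDS)
        .build());

client.setOptions(ClusterClientOptions.builder()
        .topologyRefreshOptions(ClusterTopologyRefreshOptions.builder()
                .enableAllAdaptiveRefreshTriggers()
                .build())
        .build());

StatefulRedisClusterConnection<byte[], byte[]> connection = client.connect(new ByteArrayCodec());
// SLAVE_PREFERRED (available since lettuce 4.4) reads from replicas when possible but
// falls back to the master, so a slot with no visible replica no longer fails with
// "Cannot determine a partition to read for slot".
connection.setReadFrom(ReadFrom.SLAVE_PREFERRED);

Note that this only keeps reads working through the failover window; it does not address the underlying problem that the topology view was never refreshed.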
Top GitHub Comments
#333 isn't saying to disable periodic triggers. Let's turn this ticket into an enhancement for adaptive triggers to make them more usable for failovers – basically either delaying the refresh or scheduling subsequent runs to make sure the refresh grabs the appropriate state. This requires some conceptual design before we can implement something.
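In the meantime, the usual mitigation is to combine the adaptive triggers with a periodic refresh, so a stale topology view is eventually corrected even when no adaptive trigger fires at the right moment. A minimal sketch, assuming the lettuce 4.x builder API; the 30-second values are arbitrary examples, not recommendations:

ClusterTopologyRefreshOptions refreshOptions = ClusterTopologyRefreshOptions.builder()
        // react to MOVED/ASK redirects and persistent reconnect attempts
        .enableAllAdaptiveRefreshTriggers()
        // rate-limit how often adaptive triggers may actually run a refresh
        .adaptiveRefreshTriggersTimeout(30, TimeUnit.SECONDS)
        // additionally poll the cluster topology on a fixed schedule as a safety net
        .enablePeriodicRefresh(true)
        .refreshPeriod(30, TimeUnit.SECONDS)
        .build();

redisClusterClient.setOptions(ClusterClientOptions.builder()
        .topologyRefreshOptions(refreshOptions)
        .build());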
Scenario: A master node leaves the cluster as a consequence of a failure. The Redis node stops running, and it is up to the cluster to detect this and promote one of its replicas to master.
Issue: The lettuce client doesn't react to a master failover in a cluster (master failover == a master fails, the cluster detects it and promotes a slave to master). The client keeps trying to reconnect to the node that doesn't exist anymore, even after both the cluster and the internal partition data object have detected the new master.
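One possible workaround, until the trigger behavior is improved, is to force a topology reload from application code when this symptom is observed. A rough sketch, assuming the lettuce 4.x API: connection and redisClusterClient refer to the objects from the snippets above, key is a placeholder, and the assumption that the routing error surfaces as a RedisException should be verified against the lettuce version in use.

try {
    byte[] value = connection.sync().get(key);
} catch (RedisException e) {
    // If the client cannot route the read, its partition view is likely stale:
    // reloadPartitions() re-queries the cluster topology and rebuilds the Partitions object.
    if (e.getMessage() != null && e.getMessage().contains("Cannot determine a partition")) {
        redisClusterClient.reloadPartitions();
    }
    throw e;
}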