question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Lettuce not able to reconnect automatically to SSL+authenticated ElastiCache node

See original GitHub issue

Bug Report

I raised this initially at https://github.com/spring-projects/spring-boot/issues/19436 however I have managed to get trace logs now, and it seems more suitable to raise here.

Current Behavior

When our AWS ElastiCache primary Redis node is restarted, Lettuce’s automatic reconnection doesn’t seem to leave connections in an authenticated, usable state. We just get repeated NOAUTH Authentication Required errors on the connections which don’t appear to be recoverable.

It seems I can replicate this reliably by restarting the master node from AWS Console, but have not been able to replicate it with a local test, local Redis within Docker Compose.

Relevant snippet is below - full logs with trace information at https://gist.github.com/chadlwilson/a35bd624775c278dc4bdfe7d2347b8c5

{"ts":"2020-01-05T22:57:24.962+08:00","level":"INFO","thread":"lettuce-eventExecutorLoop-1-3","logger":"io.lettuce.core.protocol.ConnectionWatchdog","message":"Reconnecting, last destination was master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379","throwable":{}}
{"ts":"2020-01-05T22:57:24.964+08:00","level":"DEBUG","thread":"lettuce-eventExecutorLoop-1-3","logger":"io.lettuce.core.protocol.ReconnectionHandler","message":"Reconnecting to Redis at master.redis.oiwzdu.apse1.cache.amazonaws.com:6379","throwable":{}}
{"ts":"2020-01-05T22:57:24.993+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, [id: 0x42742e08] (inactive), chid=0x2] channelRegistered()","throwable":{}}
{"ts":"2020-01-05T22:57:25.029+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] channelActive()","throwable":{}}
{"ts":"2020-01-05T22:57:25.029+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] activateEndpointAndExecuteBufferedCommands 0 command(s) buffered","throwable":{}}
{"ts":"2020-01-05T22:57:25.030+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] activating endpoint","throwable":{}}
{"ts":"2020-01-05T22:57:25.030+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] write() writeAndFlush command TracedCommand [type=AUTH, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.AsyncCommand]","throwable":{}}
{"ts":"2020-01-05T22:57:25.030+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] write(ctx, TracedCommand [type=AUTH, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.AsyncCommand], promise)","throwable":{}}
{"ts":"2020-01-05T22:57:25.031+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandEncoder","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] writing command TracedCommand [type=AUTH, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.AsyncCommand]","throwable":{}}
{"ts":"2020-01-05T22:57:25.031+08:00","level":"TRACE","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandEncoder","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] Sent: *2\r\n$4\r\nAUTH\r\n$16\r\nREDACTED","throwable":{}}
{"ts":"2020-01-05T22:57:25.033+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] write() done","throwable":{}}
{"ts":"2020-01-05T22:57:25.033+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] flushCommands()","throwable":{}}
{"ts":"2020-01-05T22:57:25.033+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] flushCommands() Flushing 0 commands","throwable":{}}
{"ts":"2020-01-05T22:57:25.033+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.ConnectionWatchdog","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, last known addr=master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] channelActive()","throwable":{}}
{"ts":"2020-01-05T22:57:25.034+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] channelActive() done","throwable":{}}
{"ts":"2020-01-05T22:57:25.034+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.ConnectionWatchdog","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, last known addr=master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] userEventTriggered(ctx, SslHandshakeCompletionEvent(SUCCESS))","throwable":{}}
{"ts":"2020-01-05T22:57:25.034+08:00","level":"INFO","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.ReconnectionHandler","message":"Reconnected to master.redis.oiwzdu.apse1.cache.amazonaws.com:6379, Channel channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379","throwable":{}}
{"ts":"2020-01-05T22:57:25.035+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.ConnectionWatchdog","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, last known addr=master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] userEventTriggered(ctx, io.lettuce.core.ConnectionEvents$Activated@1e686cf1)","throwable":{}}
{"ts":"2020-01-05T22:57:25.035+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Received: 47 bytes, 1 commands in the stack","throwable":{}}
{"ts":"2020-01-05T22:57:25.036+08:00","level":"TRACE","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Buffer: -ERR Client sent AUTH, but no password is set","throwable":{}}
{"ts":"2020-01-05T22:57:25.038+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Stack contains: 1 commands","throwable":{}}
{"ts":"2020-01-05T22:57:25.038+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.RedisStateMachine","message":"Decode LatencyMeteredCommand [type=AUTH, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.TracedCommand]","throwable":{}}
{"ts":"2020-01-05T22:57:25.038+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.RedisStateMachine","message":"Decoded LatencyMeteredCommand [type=AUTH, output=StatusOutput [output=null, error='ERR Client sent AUTH, but no password is set'], commandType=io.lettuce.core.protocol.TracedCommand], empty stack: true","throwable":{}}
{"ts":"2020-01-05T22:57:30.654+08:00","level":"DEBUG","thread":"XNIO-2 task-16","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] write() writeAndFlush command TracedCommand [type=INFO, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.AsyncCommand]","throwable":{}}
{"ts":"2020-01-05T22:57:30.654+08:00","level":"DEBUG","thread":"XNIO-2 task-16","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] write() done","throwable":{}}
{"ts":"2020-01-05T22:57:30.654+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] write(ctx, TracedCommand [type=INFO, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.AsyncCommand], promise)","throwable":{}}
{"ts":"2020-01-05T22:57:30.655+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandEncoder","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] writing command TracedCommand [type=INFO, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.AsyncCommand]","throwable":{}}
{"ts":"2020-01-05T22:57:30.655+08:00","level":"TRACE","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandEncoder","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] Sent: *1\r\n$4\r\nINFO","throwable":{}}
{"ts":"2020-01-05T22:57:30.658+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Received: 34 bytes, 1 commands in the stack","throwable":{}}
{"ts":"2020-01-05T22:57:30.658+08:00","level":"TRACE","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Buffer: -NOAUTH Authentication required.","throwable":{}}
{"ts":"2020-01-05T22:57:30.658+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Stack contains: 1 commands","throwable":{}}
{"ts":"2020-01-05T22:57:30.658+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.RedisStateMachine","message":"Decode LatencyMeteredCommand [type=INFO, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.TracedCommand]","throwable":{}}
{"ts":"2020-01-05T22:57:30.658+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.RedisStateMachine","message":"Decoded LatencyMeteredCommand [type=INFO, output=StatusOutput [output=null, error='NOAUTH Authentication required.'], commandType=io.lettuce.core.protocol.TracedCommand], empty stack: true","throwable":{}}

Then repeatedly logs the below on attempts to use the connection -

i.l.c.RedisCommandExecutionException: NOAUTH Authentication required.
	at i.l.c.ExceptionFactory.createExecutionException(ExceptionFactory.java:135)
	at i.l.c.ExceptionFactory.createExecutionException(ExceptionFactory.java:108)
	at i.l.c.p.AsyncCommand.completeResult(AsyncCommand.java:120)
	at i.l.c.p.AsyncCommand.complete(AsyncCommand.java:111)
	at i.l.c.p.CommandWrapper.complete(CommandWrapper.java:59)
	at i.l.c.p.CommandWrapper.complete(CommandWrapper.java:59)
	at i.l.c.p.CommandHandler.complete(CommandHandler.java:654)
	at i.l.c.p.CommandHandler.decode(CommandHandler.java:614)
	at i.l.c.p.CommandHandler.channelRead(CommandHandler.java:565)
	at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
	at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
	at i.n.c.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
	at i.n.h.ssl.SslHandler.unwrap(SslHandler.java:1478)
	at i.n.h.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1227)
	at i.n.h.ssl.SslHandler.decode(SslHandler.java:1274)
	at i.n.h.c.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:503)
	at i.n.h.c.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:442)
	at i.n.h.c.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:281)
	at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
	at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
	at i.n.c.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
	at i.n.c.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)
	at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
	at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
	at i.n.c.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)
	at i.n.c.e.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:792)
	at i.n.c.e.EpollEventLoop.processReady(EpollEventLoop.java:502)
	at i.n.c.e.EpollEventLoop.run(EpollEventLoop.java:407)
	at i.n.u.c.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050)
	at i.n.u.i.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at i.n.u.c.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	... 1 common frames omitted
Wrapped by: o.s.d.r.RedisSystemException: Error in execution; nested exception is io.lettuce.core.RedisCommandExecutionException: NOAUTH Authentication required.
	at o.s.d.r.c.l.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:54)
	at o.s.d.r.c.l.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:52)
	at o.s.d.r.c.l.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:41)
	at o.s.d.r.PassThroughExceptionTranslationStrategy.translate(PassThroughExceptionTranslationStrategy.java:44)
	at o.s.d.r.FallbackExceptionTranslationStrategy.translate(FallbackExceptionTranslationStrategy.java:42)
	at o.s.d.r.c.l.LettuceConnection.convertLettuceAccessException(LettuceConnection.java:270)
	at o.s.d.r.c.l.LettuceServerCommands.convertLettuceAccessException(LettuceServerCommands.java:571)
	at o.s.d.r.c.l.LettuceServerCommands.info(LettuceServerCommands.java:215)
	at o.s.d.r.c.DefaultedRedisConnection.info(DefaultedRedisConnection.java:1291)
	at o.s.b.a.r.RedisHealthIndicator.doHealthCheck(RedisHealthIndicator.java:64)
	at o.s.b.a.h.AbstractHealthIndicator.health(AbstractHealthIndicator.java:82)
	at o.s.b.a.h.HealthIndicator.getHealth(HealthIndicator.java:37)
	at o.s.b.a.h.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:95)
	at o.s.b.a.h.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:43)
	at o.s.b.a.h.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:108)
	at o.s.b.a.h.HealthEndpointSupport.getAggregateHealth(HealthEndpointSupport.java:119)
	at o.s.b.a.h.HealthEndpointSupport.getContri...

Expected behavior/code

I would expect the reconnect to be handled cleanly and be able to authenticate properly.

From looking at the logs, it seems that during reconnection, Lettuce receives a response from Redis/AWS ElastiCache that indicates it should no longer send the password/auth token for future attempts.

Environment

  • Lettuce version(s): 5.2.1.RELEASE
  • Redis version: 5.0.5 (AWS ElasticCache)

Being used within a Spring Boot/Spring Data Redis project

  • Spring Boot 2.2.2.RELEASE
  • Spring Data Redis 2.2.3.RELEASE
  • Spring Framework 5.2.2.RELEASE

Settings

  • SSL enabled
  • Authentication required

Possible Solution

None known.

Issue Analytics

  • State:closed
  • Created 4 years ago
  • Comments:7 (4 by maintainers)

github_iconTop GitHub Comments

1reaction
mp911decommented, Jan 15, 2020

Yeah, makes sense. I filed a ticket for it.

1reaction
chadlwilsoncommented, Jan 15, 2020

Thanks a lot for the suggestion! I have tried that out and it seems to workaround the problem which is great. The log for this working is at https://gist.github.com/chadlwilson/bcbf2f13964dd31e0117b16f9de6f073

    @Bean
    @ConditionalOnProperty("spring.redis.ping-before-activate-connection")
    @SuppressWarnings("deprecation") // Suggested in https://github.com/lettuce-io/lettuce-core/issues/1201
    public LettuceClientConfigurationBuilderCustomizer pingBuilderCustomizer() {
        log.info("Redis Connection ping-before-activate enabled");
        return builder -> builder.clientOptions(
            // Default ClientOptions values here were propagated from LettuceClientConfigurationBuilder since
            // it doesn't really allow the ClientOptions builder itself to be injected.
            ClientOptions
                .builder()
                .timeoutOptions(TimeoutOptions.enabled())
                .pingBeforeActivateConnection(true)
                .build()
        );
    }

In the meantime, AWS Support have got back to me and said that they can replicate the issue and have raised to the ElastiCache team for further discussion/investigation.

Read more comments on GitHub >

github_iconTop Results From Across the Web

Troubleshoot connecting to an ElastiCache for Redis cluster
I can't connect to my Amazon ElastiCache for Redis cluster. How can I troubleshoot this? Short description. Connectivity issues can have ...
Read more >
Lettuce client configuration to connect to aws redis elasticache
I believe lettuce is auto discovering the cluster and trying to write to replica node. Can someone help me how should I integrate...
Read more >
Lettuce Reference Guide
Specifying multiple seed nodes helps to improve resiliency as lettuce is able to connect the cluster even if a seed node is not...
Read more >
elasticache with lettuce - Google Groups
It seems this is a right time to move to lettuce(instead of jedis), however I am not finding any example code in java...
Read more >
Configuring Spring Cache Manager with AWS ElastiCache ...
When you are working with single Redis node with no slaves, ... of the master node to the application and Lettuce client will...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found