Lettuce not able to reconnect automatically to SSL+authenticated ElastiCache node
Bug Report
I raised this initially at https://github.com/spring-projects/spring-boot/issues/19436; however, I have now managed to get trace logs, and it seems more suitable to raise it here.
Current Behavior
When our AWS ElastiCache primary Redis node is restarted, Lettuce’s automatic reconnection doesn’t seem to leave connections in an authenticated, usable state. We just get repeated NOAUTH Authentication required errors on the connections, which don’t appear to be recoverable.
I can replicate this reliably by restarting the master node from the AWS Console, but I have not been able to replicate it locally against a Redis instance running in Docker Compose.
Relevant snippet is below - full logs with trace information at https://gist.github.com/chadlwilson/a35bd624775c278dc4bdfe7d2347b8c5
{"ts":"2020-01-05T22:57:24.962+08:00","level":"INFO","thread":"lettuce-eventExecutorLoop-1-3","logger":"io.lettuce.core.protocol.ConnectionWatchdog","message":"Reconnecting, last destination was master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379","throwable":{}}
{"ts":"2020-01-05T22:57:24.964+08:00","level":"DEBUG","thread":"lettuce-eventExecutorLoop-1-3","logger":"io.lettuce.core.protocol.ReconnectionHandler","message":"Reconnecting to Redis at master.redis.oiwzdu.apse1.cache.amazonaws.com:6379","throwable":{}}
{"ts":"2020-01-05T22:57:24.993+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, [id: 0x42742e08] (inactive), chid=0x2] channelRegistered()","throwable":{}}
{"ts":"2020-01-05T22:57:25.029+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] channelActive()","throwable":{}}
{"ts":"2020-01-05T22:57:25.029+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] activateEndpointAndExecuteBufferedCommands 0 command(s) buffered","throwable":{}}
{"ts":"2020-01-05T22:57:25.030+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] activating endpoint","throwable":{}}
{"ts":"2020-01-05T22:57:25.030+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] write() writeAndFlush command TracedCommand [type=AUTH, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.AsyncCommand]","throwable":{}}
{"ts":"2020-01-05T22:57:25.030+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] write(ctx, TracedCommand [type=AUTH, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.AsyncCommand], promise)","throwable":{}}
{"ts":"2020-01-05T22:57:25.031+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandEncoder","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] writing command TracedCommand [type=AUTH, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.AsyncCommand]","throwable":{}}
{"ts":"2020-01-05T22:57:25.031+08:00","level":"TRACE","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandEncoder","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] Sent: *2\r\n$4\r\nAUTH\r\n$16\r\nREDACTED","throwable":{}}
{"ts":"2020-01-05T22:57:25.033+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] write() done","throwable":{}}
{"ts":"2020-01-05T22:57:25.033+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] flushCommands()","throwable":{}}
{"ts":"2020-01-05T22:57:25.033+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] flushCommands() Flushing 0 commands","throwable":{}}
{"ts":"2020-01-05T22:57:25.033+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.ConnectionWatchdog","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, last known addr=master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] channelActive()","throwable":{}}
{"ts":"2020-01-05T22:57:25.034+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] channelActive() done","throwable":{}}
{"ts":"2020-01-05T22:57:25.034+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.ConnectionWatchdog","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, last known addr=master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] userEventTriggered(ctx, SslHandshakeCompletionEvent(SUCCESS))","throwable":{}}
{"ts":"2020-01-05T22:57:25.034+08:00","level":"INFO","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.ReconnectionHandler","message":"Reconnected to master.redis.oiwzdu.apse1.cache.amazonaws.com:6379, Channel channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379","throwable":{}}
{"ts":"2020-01-05T22:57:25.035+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.ConnectionWatchdog","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, last known addr=master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] userEventTriggered(ctx, io.lettuce.core.ConnectionEvents$Activated@1e686cf1)","throwable":{}}
{"ts":"2020-01-05T22:57:25.035+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Received: 47 bytes, 1 commands in the stack","throwable":{}}
{"ts":"2020-01-05T22:57:25.036+08:00","level":"TRACE","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Buffer: -ERR Client sent AUTH, but no password is set","throwable":{}}
{"ts":"2020-01-05T22:57:25.038+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Stack contains: 1 commands","throwable":{}}
{"ts":"2020-01-05T22:57:25.038+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.RedisStateMachine","message":"Decode LatencyMeteredCommand [type=AUTH, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.TracedCommand]","throwable":{}}
{"ts":"2020-01-05T22:57:25.038+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.RedisStateMachine","message":"Decoded LatencyMeteredCommand [type=AUTH, output=StatusOutput [output=null, error='ERR Client sent AUTH, but no password is set'], commandType=io.lettuce.core.protocol.TracedCommand], empty stack: true","throwable":{}}
{"ts":"2020-01-05T22:57:30.654+08:00","level":"DEBUG","thread":"XNIO-2 task-16","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] write() writeAndFlush command TracedCommand [type=INFO, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.AsyncCommand]","throwable":{}}
{"ts":"2020-01-05T22:57:30.654+08:00","level":"DEBUG","thread":"XNIO-2 task-16","logger":"io.lettuce.core.protocol.DefaultEndpoint","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, epid=0x1] write() done","throwable":{}}
{"ts":"2020-01-05T22:57:30.654+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] write(ctx, TracedCommand [type=INFO, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.AsyncCommand], promise)","throwable":{}}
{"ts":"2020-01-05T22:57:30.655+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandEncoder","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] writing command TracedCommand [type=INFO, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.AsyncCommand]","throwable":{}}
{"ts":"2020-01-05T22:57:30.655+08:00","level":"TRACE","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandEncoder","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379] Sent: *1\r\n$4\r\nINFO","throwable":{}}
{"ts":"2020-01-05T22:57:30.658+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Received: 34 bytes, 1 commands in the stack","throwable":{}}
{"ts":"2020-01-05T22:57:30.658+08:00","level":"TRACE","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Buffer: -NOAUTH Authentication required.","throwable":{}}
{"ts":"2020-01-05T22:57:30.658+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Stack contains: 1 commands","throwable":{}}
{"ts":"2020-01-05T22:57:30.658+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.RedisStateMachine","message":"Decode LatencyMeteredCommand [type=INFO, output=StatusOutput [output=null, error='null'], commandType=io.lettuce.core.protocol.TracedCommand]","throwable":{}}
{"ts":"2020-01-05T22:57:30.658+08:00","level":"DEBUG","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.RedisStateMachine","message":"Decoded LatencyMeteredCommand [type=INFO, output=StatusOutput [output=null, error='NOAUTH Authentication required.'], commandType=io.lettuce.core.protocol.TracedCommand], empty stack: true","throwable":{}}
The following is then logged repeatedly on every attempt to use the connection:
i.l.c.RedisCommandExecutionException: NOAUTH Authentication required.
at i.l.c.ExceptionFactory.createExecutionException(ExceptionFactory.java:135)
at i.l.c.ExceptionFactory.createExecutionException(ExceptionFactory.java:108)
at i.l.c.p.AsyncCommand.completeResult(AsyncCommand.java:120)
at i.l.c.p.AsyncCommand.complete(AsyncCommand.java:111)
at i.l.c.p.CommandWrapper.complete(CommandWrapper.java:59)
at i.l.c.p.CommandWrapper.complete(CommandWrapper.java:59)
at i.l.c.p.CommandHandler.complete(CommandHandler.java:654)
at i.l.c.p.CommandHandler.decode(CommandHandler.java:614)
at i.l.c.p.CommandHandler.channelRead(CommandHandler.java:565)
at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
at i.n.c.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
at i.n.h.ssl.SslHandler.unwrap(SslHandler.java:1478)
at i.n.h.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1227)
at i.n.h.ssl.SslHandler.decode(SslHandler.java:1274)
at i.n.h.c.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:503)
at i.n.h.c.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:442)
at i.n.h.c.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:281)
at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
at i.n.c.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
at i.n.c.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)
at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
at i.n.c.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
at i.n.c.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)
at i.n.c.e.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:792)
at i.n.c.e.EpollEventLoop.processReady(EpollEventLoop.java:502)
at i.n.c.e.EpollEventLoop.run(EpollEventLoop.java:407)
at i.n.u.c.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050)
at i.n.u.i.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at i.n.u.c.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 common frames omitted
Wrapped by: o.s.d.r.RedisSystemException: Error in execution; nested exception is io.lettuce.core.RedisCommandExecutionException: NOAUTH Authentication required.
at o.s.d.r.c.l.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:54)
at o.s.d.r.c.l.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:52)
at o.s.d.r.c.l.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:41)
at o.s.d.r.PassThroughExceptionTranslationStrategy.translate(PassThroughExceptionTranslationStrategy.java:44)
at o.s.d.r.FallbackExceptionTranslationStrategy.translate(FallbackExceptionTranslationStrategy.java:42)
at o.s.d.r.c.l.LettuceConnection.convertLettuceAccessException(LettuceConnection.java:270)
at o.s.d.r.c.l.LettuceServerCommands.convertLettuceAccessException(LettuceServerCommands.java:571)
at o.s.d.r.c.l.LettuceServerCommands.info(LettuceServerCommands.java:215)
at o.s.d.r.c.DefaultedRedisConnection.info(DefaultedRedisConnection.java:1291)
at o.s.b.a.r.RedisHealthIndicator.doHealthCheck(RedisHealthIndicator.java:64)
at o.s.b.a.h.AbstractHealthIndicator.health(AbstractHealthIndicator.java:82)
at o.s.b.a.h.HealthIndicator.getHealth(HealthIndicator.java:37)
at o.s.b.a.h.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:95)
at o.s.b.a.h.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:43)
at o.s.b.a.h.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:108)
at o.s.b.a.h.HealthEndpointSupport.getAggregateHealth(HealthEndpointSupport.java:119)
at o.s.b.a.h.HealthEndpointSupport.getContri...
Expected behavior/code
I would expect the reconnect to be handled cleanly and be able to authenticate properly.
From looking at the logs, it seems that during reconnection, Lettuce receives a response from Redis/AWS ElastiCache that indicates it should no longer send the password/auth token for future attempts.
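To make the two replies in the trace above concrete, here is a minimal sketch that tells them apart. It is illustration only: AuthReplySketch, AuthOutcome and classify are hypothetical names, not Lettuce APIs, and the code does not reproduce how Lettuce handles reconnection. It only shows that the reply to AUTH during the reconnect ("no password is set") is different from the reply to later commands ("NOAUTH").

```java
/** Hypothetical helper (not part of Lettuce): classifies the reply lines seen in the trace. */
final class AuthReplySketch {

    enum AuthOutcome { OK, NO_PASSWORD_CONFIGURED, AUTH_REQUIRED, OTHER }

    /** Classifies a single RESP reply line, e.g. "+OK" or "-NOAUTH Authentication required.". */
    static AuthOutcome classify(String replyLine) {
        if (replyLine.startsWith("+OK")) {
            return AuthOutcome.OK;
        }
        // Reply to AUTH at 22:57:25.036 in the trace, right after the reconnect.
        if (replyLine.startsWith("-ERR Client sent AUTH, but no password is set")) {
            return AuthOutcome.NO_PASSWORD_CONFIGURED;
        }
        // Reply to INFO at 22:57:30.658 in the trace, on the same channel.
        if (replyLine.startsWith("-NOAUTH")) {
            return AuthOutcome.AUTH_REQUIRED;
        }
        return AuthOutcome.OTHER;
    }

    public static void main(String[] args) {
        System.out.println(classify("-ERR Client sent AUTH, but no password is set"));
        System.out.println(classify("-NOAUTH Authentication required."));
    }
}
```

In the trace both replies arrive on the same channel about five seconds apart, so the node's answer to AUTH appears to change shortly after the restart. That is an observation from these logs, not a confirmed root cause.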
Environment
- Lettuce version(s): 5.2.1.RELEASE
- Redis version: 5.0.5 (AWS ElastiCache)
Being used within a Spring Boot/Spring Data Redis project:
- Spring Boot: 2.2.2.RELEASE
- Spring Data Redis: 2.2.3.RELEASE
- Spring Framework: 5.2.2.RELEASE
Settings
- SSL enabled
- Authentication required
Possible Solution
None known.
Issue Analytics
- State:
- Created 4 years ago
- Comments: 7 (4 by maintainers)
Top GitHub Comments
Yeah, makes sense. I filed a ticket for it.
Thanks a lot for the suggestion! I have tried that out and it seems to work around the problem, which is great. The log for this working is at https://gist.github.com/chadlwilson/bcbf2f13964dd31e0117b16f9de6f073
In the meantime, AWS Support have got back to me and said that they can replicate the issue and have raised it with the ElastiCache team for further discussion/investigation.