Spring App unable to connect to NATS in K8S (works in Docker)
Issue reported by https://github.com/hsarena in Slack; repo with the issue: https://github.com/jibitters/notifier
The Java NATS client, running under the following setup, appears unable to finish connecting properly.
Kubernetes: 1.16.3, CNI: Flannel, NATS Operator: 0.6.0, NATS: 2.1.2, jnats: 2.6.6
Exception:
Error creating bean with name 'createConnection' defined in class path resource [ir/jibit/notifier/config/nats/NatsConfiguration.class]:
Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.nats.client.Connection]: Factory method 'createConnection' threw exception; nested exception is java.io.IOException: Unable to connect to NATS servers: nats://nats:4222, 10.244.0.135:4222, 10.244.2.115:4222, 10.244.1.131:4222.
A tcpdump trace of the network traffic suggests that the client receives the INFO protocol message but never sends/flushes the CONNECT and PING protocol commands afterwards:
12:06:40.049213 IP 10.244.2.164.49584 > 10.107.16.129.4222: Flags [S], seq 3288009728, win 28200, options [mss 1410,sackOK,TS val 4019544892 ecr 0,nop,wscale 2], length 0
12:06:40.049857 IP 10.107.16.129.4222 > 10.244.2.164.49584: Flags [S.], seq 4057289085, ack 3288009729, win 27960, options [mss 1410,sackOK,TS val 4018793430 ecr 4019544892,nop,wscale 7], length 0
12:06:40.049888 IP 10.244.2.164.49584 > 10.107.16.129.4222: Flags [.], ack 1, win 7050, options [nop,nop,TS val 4019544893 ecr 4018793430], length 0
# INFO protocol received, TCP connection itself appears to be OK
12:06:40.050768 IP 10.107.16.129.4222 > 10.244.2.164.49584: Flags [P.], seq 1:366, ack 1, win 219, options [nop,nop,TS val 4018793431 ecr 4019544893], length 365
INFO {"server_id":"NAVQVNJDRG7NJOPQ3H6OFIR4KJCMOBDWLBK4NEFMNKDWJ6ADKSRV724F","server_name":"NAVQVNJDRG7NJOPQ3H6OFIR4KJCMOBDWLBK4NEFMNKDWJ6ADKSRV724F","version":"2.1.2","proto":1,"git_commit":"679beda","go":"go1.12.13","host":"0.0.0.0","port":4222,"max_payload":1048576,"client_id":394,"connect_urls":["10.244.0.135:4222","10.244.2.115:4222","10.244.1.131:4222"]}
12:06:40.050783 IP 10.244.2.164.49584 > 10.107.16.129.4222: Flags [.], ack 366, win 7318, options [nop,nop,TS val 4019544894 ecr 4018793431], length 0
# Connection closing due to internal deadline from client
12:06:42.039894 IP 10.244.2.164.49584 > 10.107.16.129.4222: Flags [F.], seq 1, ack 366, win 7318, options [nop,nop,TS val 4019546883 ecr 4018793431], length 0
12:06:42.040843 IP 10.107.16.129.4222 > 10.244.2.164.49584: Flags [F.], seq 366, ack 2, win 219, options [nop,nop,TS val 4018795421 ecr 4019546883], length 0
12:06:42.040868 IP 10.244.2.164.49584 > 10.107.16.129.4222: Flags [.], ack 367, win 7318, options [nop,nop,TS val 4019546884 ecr 4018795421], length 0
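For comparison with the capture above, here is a minimal sketch of the exchange a healthy client is expected to complete: read INFO, send CONNECT and PING, then receive PONG. It uses plain sockets rather than jnats, and the class name, method names, stub server, and CONNECT options are all assumptions made for illustration; it is not how jnats implements the handshake internally.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class NatsHandshakeSketch {

    /** Client side: read INFO, send CONNECT + PING, return the server's reply line. */
    static String clientHandshake(String host, int port) {
        try (Socket socket = new Socket(host, port)) {
            // Assumed read timeout, comparable to the ~2 s gap before the FIN in the capture.
            socket.setSoTimeout(2000);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            OutputStream out = socket.getOutputStream();

            // Step 1: the server speaks first with an INFO line.
            String info = in.readLine();
            if (info == null || !info.startsWith("INFO ")) {
                throw new IOException("expected INFO, got: " + info);
            }

            // Step 2: these are the writes that never appear in the tcpdump trace.
            out.write("CONNECT {\"verbose\":false,\"pedantic\":false}\r\n"
                    .getBytes(StandardCharsets.US_ASCII));
            out.write("PING\r\n".getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Step 3: a healthy server answers the PING with PONG.
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical stand-in server for local experiments: one connection, INFO then PONG. */
    static String runAgainstStubServer() {
        try (ServerSocket server = new ServerSocket(0)) {
            Thread stub = new Thread(() -> {
                try (Socket s = server.accept()) {
                    BufferedReader in = new BufferedReader(
                            new InputStreamReader(s.getInputStream(), StandardCharsets.US_ASCII));
                    OutputStream out = s.getOutputStream();
                    out.write("INFO {}\r\n".getBytes(StandardCharsets.US_ASCII));
                    out.flush();
                    in.readLine(); // CONNECT {...}
                    in.readLine(); // PING
                    out.write("PONG\r\n".getBytes(StandardCharsets.US_ASCII));
                    out.flush();
                } catch (IOException ignored) {
                    // The stub is best-effort; the client side reports failures.
                }
            });
            stub.setDaemon(true);
            stub.start();
            return clientHandshake("127.0.0.1", server.getLocalPort());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reply: " + runAgainstStubServer());
    }
}
```

Pointing `clientHandshake` at the NATS service from inside the affected pod could help separate a network problem from a client-side one: if this plain-socket exchange completes, the TCP path is fine and the stall is inside the application.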
Issue Analytics
- State:
- Created 4 years ago
- Reactions:2
- Comments:5 (4 by maintainers)
Top GitHub Comments
@alimate glad to hear you found the root cause! ok will close then
@wallyqs We were using a custom ThreadPoolExecutor to mimic fixed-thread-pool-like behavior. In our K8S setup, we had one thread executing the main event loop. Apparently, the main thread gets blocked in the initial ping/pong. Therefore, other events are queued inside that LinkedBlockingQueue. Since that queue never fills up, we never get a chance to send the CONNECT command afterward. Hence the connection timeout!
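The executor configuration from the original comment is not preserved here, so the following is only a sketch of the mechanism described, under assumed pool sizes. A `ThreadPoolExecutor` adds threads beyond its core size only when its work queue rejects a task, and an unbounded `LinkedBlockingQueue` never does. With a single core thread that is blocked, every later task just waits in the queue. The class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ExecutorStarvationDemo {

    /**
     * Submits a blocking task followed by a second task and reports whether
     * the second task ran while the first was still blocked.
     */
    static boolean secondTaskRan(int corePoolSize, int maxPoolSize) {
        // Unbounded queue: it never fills up, so the pool never grows past corePoolSize.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maxPoolSize, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch secondRan = new CountDownLatch(1);
        try {
            // Stands in for the event loop blocked in the initial ping/pong.
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Stands in for the task that would send CONNECT.
            pool.execute(secondRan::countDown);
            return secondRan.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // One core thread: the second task sits in the queue and never runs.
        System.out.println("core=1, max=4: " + secondTaskRan(1, 4));
        // Two core threads: the second task gets its own thread.
        System.out.println("core=2, max=4: " + secondTaskRan(2, 4));
    }
}
```

The first call shows that `maxPoolSize` has no effect with an unbounded queue; only the core size determines how many tasks can make progress at once. This could also explain why the same image behaved differently in Docker and in K8S if the pool size was derived from the CPU count visible to the container, though that is an assumption the thread does not confirm.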