Spring App unable to connect to NATS in K8S (works in Docker)

Issue reported by https://github.com/hsarena in Slack; the repo that shows the issue is at https://github.com/jibitters/notifier

The NATS Java client, running in the environment below, appears unable to finish connecting.

  • Kubernetes: 1.16.3
  • CNI: Flannel
  • NATS Operator: 0.6.0
  • NATS: 2.1.2
  • jnats: 2.6.6

Exception:

Error creating bean with name 'createConnection' defined in class path resource [ir/jibit/notifier/config/nats/NatsConfiguration.class]: 
Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.nats.client.Connection]: Factory method 'createConnection' threw exception; nested exception is java.io.IOException: Unable to connect to NATS servers: nats://nats:4222, 10.244.0.135:4222, 10.244.2.115:4222, 10.244.1.131:4222.

A tcpdump trace of the network traffic shows that the client receives the INFO protocol message but never sends (or never flushes) the CONNECT and PING protocol commands afterwards:

12:06:40.049213 IP 10.244.2.164.49584 > 10.107.16.129.4222: Flags [S], seq 3288009728, win 28200, options [mss 1410,sackOK,TS val 4019544892 ecr 0,nop,wscale 2], length 0

12:06:40.049857 IP 10.107.16.129.4222 > 10.244.2.164.49584: Flags [S.], seq 4057289085, ack 3288009729, win 27960, options [mss 1410,sackOK,TS val 4018793430 ecr 4019544892,nop,wscale 7], length 0

12:06:40.049888 IP 10.244.2.164.49584 > 10.107.16.129.4222: Flags [.], ack 1, win 7050, options [nop,nop,TS val 4019544893 ecr 4018793430], length 0

# INFO protocol received, TCP connection itself appears to be OK
12:06:40.050768 IP 10.107.16.129.4222 > 10.244.2.164.49584: Flags [P.], seq 1:366, ack 1, win 219, options [nop,nop,TS val 4018793431 ecr 4019544893], length 365
INFO {"server_id":"NAVQVNJDRG7NJOPQ3H6OFIR4KJCMOBDWLBK4NEFMNKDWJ6ADKSRV724F","server_name":"NAVQVNJDRG7NJOPQ3H6OFIR4KJCMOBDWLBK4NEFMNKDWJ6ADKSRV724F","version":"2.1.2","proto":1,"git_commit":"679beda","go":"go1.12.13","host":"0.0.0.0","port":4222,"max_payload":1048576,"client_id":394,"connect_urls":["10.244.0.135:4222","10.244.2.115:4222","10.244.1.131:4222"]} 

12:06:40.050783 IP 10.244.2.164.49584 > 10.107.16.129.4222: Flags [.], ack 366, win 7318, options [nop,nop,TS val 4019544894 ecr 4018793431], length 0

# Connection closing due to internal deadline from client
12:06:42.039894 IP 10.244.2.164.49584 > 10.107.16.129.4222: Flags [F.], seq 1, ack 366, win 7318, options [nop,nop,TS val 4019546883 ecr 4018793431], length 0

12:06:42.040843 IP 10.107.16.129.4222 > 10.244.2.164.49584: Flags [F.], seq 366, ack 2, win 219, options [nop,nop,TS val 4018795421 ecr 4019546883], length 0

12:06:42.040868 IP 10.244.2.164.49584 > 10.107.16.129.4222: Flags [.], ack 367, win 7318, options [nop,nop,TS val 4019546884 ecr 4018795421], length 0
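
For reading the trace: in the NATS client protocol, the server opens with INFO and the client is expected to answer with CONNECT followed by PING. Here the client's FIN still carries seq 1, so it sent no payload bytes at all. The sketch below is only an illustration of the frames that are missing from the capture, and of how the `connect_urls` array in INFO lines up with the extra pod IPs listed in the exception message. The class and method names are hypothetical and this is not jnats code; the CONNECT options shown are a minimal assumed set.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NatsHandshakeSketch {
    // Matches the connect_urls array inside an INFO line (no nested arrays expected).
    private static final Pattern CONNECT_URLS =
            Pattern.compile("\"connect_urls\"\\s*:\\s*\\[(.*?)\\]");
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    /** Extracts the advertised server addresses from a raw INFO protocol line. */
    static List<String> connectUrls(String infoLine) {
        List<String> urls = new ArrayList<>();
        Matcher array = CONNECT_URLS.matcher(infoLine);
        if (array.find()) {
            Matcher item = QUOTED.matcher(array.group(1));
            while (item.find()) {
                urls.add(item.group(1));
            }
        }
        return urls;
    }

    /** Builds the bytes a client would write after INFO: CONNECT, then PING. */
    static byte[] connectAndPing() {
        String frames = "CONNECT {\"verbose\":false,\"pedantic\":false}\r\n"
                + "PING\r\n";
        return frames.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String info = "INFO {\"port\":4222,\"connect_urls\":"
                + "[\"10.244.0.135:4222\",\"10.244.2.115:4222\",\"10.244.1.131:4222\"]} ";
        System.out.println("Advertised servers: " + connectUrls(info));
        System.out.println("Bytes the client should have sent: " + connectAndPing().length);
    }
}
```

In a healthy connection, the capture would show a client-to-server packet with a non-zero length right after the ACK of INFO, followed by a PONG from the server.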

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 2
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

wallyqs commented, Dec 7, 2019 (1 reaction)

@alimate glad to hear you found the root cause! ok will close then

alimate commented, Dec 7, 2019 (0 reactions)

@wallyqs We were using a custom ThreadPoolExecutor to mimic the behavior of a fixed thread pool:

val loopExecutor = ThreadPoolExecutor(
    properties.poolSize, properties.poolSize,
    0, MILLISECONDS,
    LinkedBlockingQueue(), PrefixedThreadFactory(properties.threadPrefix)
)

In our K8S setup, we had one thread executing the main event loop. Apparently, that thread gets blocked in the initial ping/pong, so every other task is queued inside the LinkedBlockingQueue. Because that queue is unbounded, it never fills up, the executor never starts another thread, and the task that sends the CONNECT command never gets a chance to run. Hence the connection timeout!
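
The starvation described above can be reproduced without NATS at all. The following is a minimal sketch, assuming the same executor shape (core size equal to maximum size, unbounded LinkedBlockingQueue). The class name, method name, and latches are hypothetical stand-ins: the first task plays the event loop waiting on ping/pong, the second plays the task that would send CONNECT.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ExecutorStarvationDemo {

    /**
     * Runs a task that blocks until a second task releases it, and reports
     * whether the second task started within the timeout.
     */
    static boolean secondTaskRuns(int poolSize, long timeoutMillis) throws InterruptedException {
        ExecutorService executor = new ThreadPoolExecutor(
                poolSize, poolSize, 0, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch released = new CountDownLatch(1);
        CountDownLatch secondRan = new CountDownLatch(1);
        try {
            // Stand-in for the event loop blocked in the initial ping/pong.
            executor.execute(() -> {
                try {
                    released.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Stand-in for the task that would send CONNECT.
            executor.execute(() -> {
                secondRan.countDown();
                released.countDown();
            });
            return secondRan.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            // Interrupts the blocked task so the demo can exit.
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("poolSize=1: second task ran = " + secondTaskRuns(1, 500));
        System.out.println("poolSize=2: second task ran = " + secondTaskRuns(2, 500));
    }
}
```

With a pool size of 1 the second task stays in the queue until the timeout and the method returns false; with a pool size of 2 it gets its own thread and the method returns true. That suggests checking what pool size the application actually ends up with inside the pod before handing a shared executor to the client.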
