Concurrent SSL Handshakes cause IO threads to stay busy for long time

See original GitHub issue: https://github.com/netty/netty/issues/7020

Expected behavior

Existing connections should not be impacted and should continue to exchange messages when new connections are added.

Actual behavior

If there are a few thousand connections to the server and a burst of, say, another few thousand connections arrives, the new SSL handshakes keep the worker threads busy for a long time, which causes the existing connections to start timing out.
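
The effect described above can be illustrated with a small model that uses no Netty at all. This is only a sketch under stated assumptions: a single-threaded executor stands in for one IO thread, simulateHandshake is hypothetical busy work standing in for the CPU cost of a TLS handshake, and the class and method names are invented for illustration. Because the thread drains its queue in order, a heartbeat queued behind a burst of handshakes runs only after all of them have finished.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SharedIoThreadModel {

    // Hypothetical busy work standing in for the CPU cost of one TLS handshake.
    static long simulateHandshake(int rounds) {
        long acc = 1;
        for (int i = 0; i < rounds; i++) {
            acc = acc * 31 + i;
        }
        return acc;
    }

    // Queues the given number of expensive tasks and then one heartbeat on a
    // single thread, and returns the order in which the tasks actually ran.
    static List<String> runOrder(int handshakes) {
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        ExecutorService ioThread = Executors.newSingleThreadExecutor();
        for (int i = 0; i < handshakes; i++) {
            ioThread.execute(() -> {
                simulateHandshake(100_000);
                order.add("handshake");
            });
        }
        // Work for an already established connection, queued behind the burst.
        ioThread.execute(() -> order.add("heartbeat"));
        ioThread.shutdown();
        try {
            if (!ioThread.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("model did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return order;
    }

    public static void main(String[] args) {
        List<String> order = runOrder(1_000);
        System.out.println("heartbeat ran after "
                + order.indexOf("heartbeat") + " handshake tasks");
    }
}
```

With 1000 queued handshakes the heartbeat runs in position 1000. In the real server the equivalent delay is what pushes existing connections past their timeouts.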

Steps to reproduce

  1. Compile and build the reproducer code using mvn clean package assembly:single

  2. Start the TCP server using this command:

       java -jar tcp-server-1.0.0-SNAPSHOT-jar-with-dependencies.jar

     This starts a TCP server with 8 IO threads and 1 acceptor thread on port 8000. To customize these values, see the TcpServer class.

  3. Open Java Mission Control and monitor the connection count using the TcpServer.getConnectionCount managed attribute. This should be 0 to start with.

  4. On a second machine, trigger the load using the TCP client jar:

       java -jar tcp-client-1.0.0-SNAPSHOT-jar-with-dependencies.jar <server-host>

     This triggers 5000 concurrent handshakes to the server. The connections are kept open, and the client keeps sending requests to the server (1 at a time). These values can also be customized using the LoadRunner class.

  5. Monitor for a couple of minutes; the reported connection count should be 5000.

  6. On a third machine, repeat step 4 and monitor the connection count. You will see the connection count rise to 10000, and after a couple of minutes the clients start timing out (graph attached).

Minimal yet complete reproducer code (or URL to code)

https://github.com/anshuman-osc/netty.git

Netty version

4.1.6.Final, 4.1.12.Final

JVM version (e.g. java -version)

java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)

OS version (e.g. uname -a)

Tested on 2 Linux distributions:

Linux 4.4.0-83-generic #106-Ubuntu SMP Mon Jun 26 17:54:43 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Linux 3.0.101-63-default #1 SMP Tue Jun 23 16:02:31 UTC 2015 (4b89d0c) x86_64 x86_64 x86_64 GNU/Linux

[Attached images: missioncontrol-graph, io-threads-blocked-counts]

Stack Overflow thread

https://stackoverflow.com/questions/44751058/multiple-worker-event-loop-groups

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Reactions: 1
  • Comments: 27 (19 by maintainers)

Top GitHub Comments

4 reactions
normanmaurer commented, Oct 11, 2019

As of today we support offloading handshake work from the EventLoop by specifying an Executor when creating the SslHandler. Closing this.

2 reactions
normanmaurer commented, Oct 6, 2017

I still have this issue on my to-do list but just have not had the cycles yet 😦 In general we should just “correctly” support an Executor here, as SSLEngine.getDelegatedTask() already allows doing this.

On 6. Oct 2017, at 17:37, Roger notifications@github.com wrote:

In my mind… Split SslHandler into two classes. Let there be a new SslHandshakeHandler and some class that does everything but the handshake portion. Wrap these two into a codec-style class for the convenient path, and give people the option to use the handlers separately.

Then use ChannelPipeline#addLast(EventExecutorGroup, SslHandshakeHandler) if you want the handshakes to happen on a different thread than the default.

That is obviously an API breaking change. For compatibility this could be baked into SslHandler which could take an optional configuration parameter in the form of an executor.

Just throwing it out there.

View this comment on GitHub: https://github.com/netty/netty/issues/7020#issuecomment-334791085
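
Netty's SslHandler wraps the JDK's SSLEngine, which hands expensive handshake work back to the caller as Runnable tasks through getDelegatedTask(). The following is a minimal JDK-only sketch of running those tasks on a separate Executor rather than on the calling thread; it is not Netty's implementation. The class DelegatedTaskOffload and the helpers drainTo and offloadDelegatedTasks are hypothetical names. The demo in main assumes the default SSLContext, which has no server certificate, so the handshake cannot complete; it only shows the point at which a delegated task typically appears.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;

class DelegatedTaskOffload {

    // Pulls tasks from nextTask until it returns null and runs each one on
    // the executor. The future completes with the number of tasks that ran.
    static CompletableFuture<Integer> drainTo(Supplier<Runnable> nextTask, Executor executor) {
        List<CompletableFuture<Void>> running = new ArrayList<>();
        Runnable task;
        while ((task = nextTask.get()) != null) {
            running.add(CompletableFuture.runAsync(task, executor));
        }
        int count = running.size();
        return CompletableFuture.allOf(running.toArray(new CompletableFuture<?>[0]))
                .thenApply(ignored -> count);
    }

    // Runs all pending delegated tasks of the engine on the given executor.
    static CompletableFuture<Integer> offloadDelegatedTasks(SSLEngine engine, Executor executor) {
        return drainTo(engine::getDelegatedTask, executor);
    }

    public static void main(String[] args) throws Exception {
        SSLContext context = SSLContext.getDefault();
        SSLEngine client = context.createSSLEngine();
        client.setUseClientMode(true);
        SSLEngine server = context.createSSLEngine();
        server.setUseClientMode(false);

        // Let the client produce its first handshake flight and feed it to the server.
        ByteBuffer net = ByteBuffer.allocate(client.getSession().getPacketBufferSize());
        ByteBuffer app = ByteBuffer.allocate(server.getSession().getApplicationBufferSize());
        client.wrap(ByteBuffer.allocate(0), net);
        net.flip();
        SSLEngineResult result = server.unwrap(net, app);

        // Assumption: a small dedicated pool; size it for the expected handshake load.
        ExecutorService handshakePool = Executors.newFixedThreadPool(2);
        try {
            if (result.getHandshakeStatus() == SSLEngineResult.HandshakeStatus.NEED_TASK) {
                int ran = offloadDelegatedTasks(server, handshakePool).get();
                System.out.println("ran " + ran + " delegated task(s) off the calling thread");
            } else {
                System.out.println("no delegated task pending: " + result.getHandshakeStatus());
            }
        } finally {
            handshakePool.shutdown();
        }
    }
}
```

In a real event loop the caller would resume wrap() and unwrap() only once the returned future completes, ideally by scheduling a callback on the IO thread rather than blocking on get() as the demo does.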
