IO thread loops infinitely on out event and blocks application with 100% CPU usage [was renamed]

UPDATE: All assumptions in this comment turned out to be wrong. The true problem is described below in comment https://github.com/zeromq/jeromq/issues/520#issuecomment-364570301

I’m using the latest version 0.4.3 of JeroMQ and my application stops after some time because messages are not sent over a PUSH socket while one of two JeroMQ I/O threads is at 100% CPU usage. The thread remains inside epollWait, see http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/sun/nio/ch/EPollArrayWrapper.java#269

I’ve seen issue https://github.com/zeromq/jeromq/issues/506 and think it’s very likely related. I don’t know the defined behavior of epollWait or how JeroMQ is supposed to use it, but in my application the timeout argument is always zero whenever it hangs. Since I expected a positive timeout value (*), I replaced long timeout = executeTimers(); with long timeout = Math.max(10, executeTimers()); but that doesn’t solve the problem: the I/O thread still hangs forever inside epollWait. Because the thread never leaves epollWait, the selector recreation logic is never executed. Do you have any idea what’s wrong?

(*) I saw that JeroMQ implements logic similar to netty’s to work around the JDK bug; however, netty calls epollWait with a timeout of 10.
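
For readers following the timeout change described above, here is a minimal sketch of what clamping the timeout does at the java.nio level. It assumes the poller passes the value to Selector.select(long), where a timeout of zero means "block until something is ready" rather than "return immediately". The class SelectTimeoutDemo and its executeTimers() stand-in are hypothetical and are not JeroMQ code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

// Hypothetical demo class; it only illustrates java.nio timeout semantics.
final class SelectTimeoutDemo {
    // Stand-in for the poller's timer step (assumption): 0 means no timer is pending.
    static long executeTimers() {
        return 0L;
    }

    // The change tried in the post: never pass less than 10 ms to select().
    static long clampedTimeout() {
        return Math.max(10, executeTimers());
    }

    // Runs one select() on an empty selector and returns the number of ready keys.
    static int selectOnce(long timeoutMillis) {
        if (timeoutMillis <= 0) {
            // select(0) would block indefinitely on an empty selector, so refuse it here.
            throw new IllegalArgumentException("timeout must be positive in this demo");
        }
        try (Selector selector = Selector.open()) {
            return selector.select(timeoutMillis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long timeout = clampedTimeout();
        long start = System.nanoTime();
        int ready = selectOnce(timeout);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("ready=" + ready + ", timeout=" + timeout + " ms, elapsed=" + elapsedMs + " ms");
    }
}
```

With a positive timeout, a healthy selector wakes up periodically even when no channel is ready, which is the precondition for any recovery logic that runs after select() returns.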

In my application, I execute a test that uses four PULL sockets and various PUSH sockets. The test is started multiple times (because I do a bunch of simulations) in the same Java process but each time with a new context. After each test, sockets are closed and the context is closed gracefully.
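
The selector recreation logic mentioned earlier in this post can be sketched roughly as follows. This is an illustration of the general workaround for the JDK epoll spin bug (count premature wakeups, then rebuild the selector), not the actual JeroMQ or netty implementation; the class SpinGuard, its threshold, and the "less than half the timeout" heuristic are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical helper; names and thresholds are illustrative only.
final class SpinGuard {
    private final int threshold;
    private int prematureReturns;

    SpinGuard(int threshold) {
        this.threshold = threshold;
    }

    // Records the outcome of one select() call.
    // Returns true when the selector looks broken and should be rebuilt.
    boolean record(int readyKeys, long timeoutMillis, long elapsedMillis) {
        boolean premature = readyKeys == 0
                && timeoutMillis > 0
                && elapsedMillis < timeoutMillis / 2;
        if (!premature) {
            prematureReturns = 0; // a normal wakeup resets the streak
            return false;
        }
        if (++prematureReturns >= threshold) {
            prematureReturns = 0;
            return true;
        }
        return false;
    }

    // Moves every valid registration to a fresh selector and closes the old one.
    static Selector rebuild(Selector old) throws IOException {
        Selector fresh = Selector.open();
        for (SelectionKey key : old.keys()) {
            if (key.isValid()) {
                key.channel().register(fresh, key.interestOps(), key.attachment());
            }
        }
        old.close();
        return fresh;
    }
}
```

Note that a guard of this kind is only consulted after select() returns, so it cannot help if the thread really never leaves the native call, which is what the post reports.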

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 16 (10 by maintainers)

Top GitHub Comments

1 reaction
fredoboulo commented, Feb 14, 2018

Sorry, no time for me to be on this topic this week.

I will be fast now and will provide more context later, but as long as your PR satisfies the terms of C4, it will be merged.

The following is not the project’s thinking but my very own: when the lib was bumped to 4.1.7, I decided to stick as much as possible to the logic of libzmq so it would be easier to maintain. Not everything can be translated back to Java (ByteBuffer or equivalents are not used in the C++ version), but the more the code sticks to it, the better I feel. To answer your question, I personally would go for both in the given order:

  1. make the code work, with good performance
  2. be as close as possible to the C++ version

In your PR, this comparison in EncoderBase deviates from the C++ version, which I find hard to explain, while at the same time code that is present only in Java, and that decides whether or not to flip the buffer, seems to be the source of the bug you reported. If I were you, I would invest a bit of time in refining that Java-specific code. But I am not you 😃

If you can wait a week, I may be more available then.
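
To make the buffer-flipping concern in the comment above concrete, here is a small java.nio sketch of how the number of flip() calls changes what a writer sees in a ByteBuffer. It is not JeroMQ's EncoderBase; the class FlipDemo and its method are hypothetical, and the link to the reported busy loop is only a plausible reading of the comment: a buffer that reports zero readable bytes gives the socket nothing to send while it stays writable.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only illustrates ByteBuffer.flip() semantics.
final class FlipDemo {
    // Writes 5 bytes into a 16-byte buffer, flips it the given number of times,
    // and returns how many bytes a reader would see via remaining().
    static int readableAfterFlips(int flips) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.put("hello".getBytes(StandardCharsets.US_ASCII)); // position=5, limit=16
        for (int i = 0; i < flips; i++) {
            buf.flip(); // limit=position, position=0
        }
        return buf.remaining();
    }

    public static void main(String[] args) {
        System.out.println("no flip:   " + readableAfterFlips(0)); // unwritten tail of the buffer
        System.out.println("one flip:  " + readableAfterFlips(1)); // exactly the written bytes
        System.out.println("two flips: " + readableAfterFlips(2)); // nothing left to read
    }
}
```

Flipping exactly once exposes the five written bytes; skipping the flip exposes the eleven unwritten bytes instead, and flipping twice leaves nothing to read.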

0 reactions
fredoboulo commented, Sep 10, 2018

SORRY @smattheis !
