AssertionError in Mailbox.recv() causes context thread to stay running

I am running a complex unit test locally that involves 5 nodes with various PUB/SUB and REQ/REP pairs. They’re all talking via the public IP of the computer, but everything is on the same machine, in the same JVM.

Each endpoint has its own ZMQ.Context because it’s a simulation of nodes that could be put on different computers across a network.

I’m running multithreaded. Each thread has its own ZMQ.Context object, along with its own socket (in any of the modes listed above). At the end of the test, I try to shut everything down gracefully, but I get an intermittent AssertionError:

Exception in thread "MessageSubscriber: CharlieOne" java.lang.AssertionError
    at zmq.Mailbox.recv(Mailbox.java:114)
    at zmq.SocketBase.process_commands(SocketBase.java:793)
    at zmq.SocketBase.recv(SocketBase.java:714)
    at org.zeromq.ZMQ$Socket.recv(ZMQ.java:1247)
    at org.zeromq.ZMQ$Socket.recv(ZMQ.java:1235)
    at MessageSubscriber$ListenerThread.run(MessageSubscriber.java:131)

“MessageSubscriber: CharlieOne” is the test thread that contains a ZMQ subscriber. The assertion that fails is in zmq.Mailbox.recv (Mailbox.java:114), the top frame of the trace above.

Here is the code that runs in the class that owns that thread, to shut down the thread:

private ZMQ.Context context = ZMQ.context(1);
private ZMQ.Socket publisher = context.socket(ZMQ.PUB);

...

// Shut down the thread
listenerThread.run = false;
subscriber.close();
context.term();
try {
    listenerThread.join();
} catch (InterruptedException ex) {
    logger.warn("InterruptedException in thread.join()");
}

After the AssertionError occurs, context.term() blocks indefinitely, perhaps because the socket never really closed. I’ve tried catching the AssertionError, but that doesn’t help. The term() call still blocks indefinitely.

The problem doesn’t always happen. Sometimes the whole test runs, all the threads shut down, and the program exits. Most of the time, however, at least one of the threads blocks because of this assertion.

Issue Analytics

  • State: closed
  • Created 9 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

colini commented, Aug 1, 2014 (1 reaction)

It’s hard to say what’s wrong w/o seeing your code. But I’ve run into threads blocking during shutdown many times. This is the safest way I’ve found to shut down zeromq sockets and the context:

2 threads: Main and SocketThread. Main creates the ZContext, creates SocketThread and passes it the context, and starts SocketThread. ZContext is thread safe so it doesn’t matter which thread creates it.

SocketThread uses the context to create some zeromq sockets and does work.

Later, at shutdown time, Main needs to signal SocketThread to shut down. Do this however you want (set a volatile shutdown boolean, send it a ‘shutdown’ message on one of its zeromq sockets, whatever). Once Main sends the shutdown signal it must join() on SocketThread and wait. SocketThread must close every socket it created and then exit.

Now Main can safely terminate the context.
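The ordering described above can be sketched in plain Java. This is only an illustration of the sequence (signal, join, then terminate), not JeroMQ code: FakeContext and FakeSocket are hypothetical stand-ins for the context and its sockets, and the stand-in term() throws when a socket is still open, whereas the real call is reported above to block. The sleep in the loop stands in for a receive call that returns periodically so the flag gets re-checked.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class ShutdownOrderSketch {

    /** Hypothetical stand-in for a ZeroMQ context; not a JeroMQ class. */
    static final class FakeContext {
        final AtomicInteger openSockets = new AtomicInteger();
        final List<String> events = Collections.synchronizedList(new ArrayList<>());

        FakeSocket createSocket() {
            openSockets.incrementAndGet();
            return new FakeSocket(this);
        }

        // Assumption: fail loudly instead of blocking, to make a wrong order visible.
        void term() {
            if (openSockets.get() != 0) {
                throw new IllegalStateException("sockets still open");
            }
            events.add("context terminated");
        }
    }

    /** Hypothetical stand-in for a socket owned by exactly one thread. */
    static final class FakeSocket implements AutoCloseable {
        private final FakeContext owner;

        FakeSocket(FakeContext owner) { this.owner = owner; }

        @Override
        public void close() {
            owner.openSockets.decrementAndGet();
            owner.events.add("socket closed");
        }
    }

    /** Worker that creates, uses, and closes its own socket. */
    static final class SocketThread extends Thread {
        private final FakeContext context;
        private volatile boolean running = true;

        SocketThread(FakeContext context) { this.context = context; }

        void requestShutdown() { running = false; }

        @Override
        public void run() {
            // try-with-resources closes the socket on every exit path.
            try (FakeSocket subscriber = context.createSocket()) {
                while (running) {
                    try {
                        Thread.sleep(10); // placeholder for a receive with a timeout
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        }
    }

    /** Main's side: signal the worker, wait for it, then terminate the context. */
    static List<String> runOnce() throws InterruptedException {
        FakeContext context = new FakeContext();
        SocketThread worker = new SocketThread(context);
        worker.start();

        worker.requestShutdown(); // 1. signal
        worker.join();            // 2. wait until the worker has closed its sockets
        context.term();           // 3. only now terminate the context
        return new ArrayList<>(context.events);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runOnce());
    }
}
```

Swapping steps 2 and 3 in runOnce() reproduces the ordering from the question, where the context is terminated before the listener thread has been joined.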

In your case I’d try moving context.term() into a finally block on the try/catch around join(), so it is called only after join() returns. Be sure that no thread ever touches another thread’s sockets. And make sure every socket is closed and every worker thread has exited before calling context.term().

HTH

daveyarwood commented, Aug 26, 2017 (0 reactions)

It is unclear if this is still an issue in the latest version of JeroMQ. Closing for now – if anyone observes this problem on the latest version, please open a new issue.
