ZMQ.Context.term() hangs even if setLinger(0) in presence of unstable network

The following program expects a heartbeat from a PUB socket once a second (it actually waits 2 seconds, for margin); if it doesn’t receive one, it disconnects and reconnects the SUB socket. If I disconnect and reconnect the WiFi on the computer running it (with the PUB on a separate computer), I can get the closing of the ZMQ.Context to hang forever. The log also shows that the socket connection and subscription apparently succeed even though the network is disconnected (obviously the heartbeat is not received).

package it.awtech.awdoc.log_listener;

import org.zeromq.ZMQ;
import org.zeromq.ZMQ.Context;
import org.zeromq.ZMQ.Socket;

import java.util.Arrays;
import java.util.concurrent.CountDownLatch;

public class App2
{
    private static volatile boolean shutDown;
    private static final CountDownLatch shutDownLatch = new CountDownLatch(1);

    public static void main( String[] args ) {
        System.out.println("SUB started with args " + Arrays.toString(args));

        try (final Context context = ZMQ.context(1)) {
             try (final Socket subscriber = context.socket(ZMQ.SUB)) {
                 subscriber.setLinger(0);
                 subscriber.setSendTimeOut(2000);
                 subscriber.setReceiveTimeOut(2000);
                 Runtime.getRuntime().addShutdownHook(new Thread(() -> {
                     shutDown = true;
                     try {
                         shutDownLatch.await();
                     } catch (InterruptedException e) {
                         e.printStackTrace();
                     }
                 }));
                 boolean connected = false;
                 while (!shutDown) {
                     try {
                         if (!connected) {
                             try {
                                 subscriber.connect(args[0]);
                                 System.out.println("Connected to server");
                                 subscriber.subscribe("");
                                 System.out.println("Subscribed topic");
                                 connected = true;
                             } catch (Exception e) {
                                 System.out.println("Error connecting socket");
                                 e.printStackTrace();
                                 Thread.sleep(1000);
                                 continue;
                             }
                         }
                         // Ignore organization
                         String topic = subscriber.recvStr();
                         if (topic == null) {
                             System.out.println("Didn't get heartbeat, disconnecting.");
                             subscriber.disconnect(args[0]);
                             connected = false;
                         }
                     } catch (Exception e) {
                         if (shutDown) {
                             System.out.println("Exiting ZMQ thread");
                             break;
                         }
                         System.out.println("Error receiving message");
                         e.printStackTrace();
                     }
                 }
                 System.out.println("Exiting socket try with resources");
             }
            System.out.println("Exiting context try with resources");
        }
        System.out.println("Exited try with resources");
        shutDownLatch.countDown();
    }
}

This is the log of one run; lines starting with “###” are my comments marking when I disconnected or reconnected the WiFi or stopped the program.

SUB started with args [tcp://devel.awdoc.com:10082]
Connected to server
Subscribed topic
### DISCONNECT WIFI ###
Didn't get heartbeat, disconnecting.
Connected to server
Subscribed topic
Didn't get heartbeat, disconnecting.
Connected to server
Subscribed topic
Didn't get heartbeat, disconnecting.
Connected to server
Subscribed topic
Didn't get heartbeat, disconnecting.
Connected to server
Subscribed topic
Didn't get heartbeat, disconnecting.
Connected to server
Subscribed topic
Didn't get heartbeat, disconnecting.
Connected to server
Subscribed topic
### RECONNECT WIFI ###
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Didn't get heartbeat, disconnecting.
Error connecting socket
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
Error connecting socket
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Error connecting socket
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
Error connecting socket
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
Error connecting socket
Error connecting socket
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
Error connecting socket
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
Error connecting socket
Error connecting socket
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
Connected to server
Subscribed topic
### DISCONNECT WIFI ###
Didn't get heartbeat, disconnecting.
Connected to server
Subscribed topic
Didn't get heartbeat, disconnecting.
Connected to server
Subscribed topic
Didn't get heartbeat, disconnecting.
Connected to server
Subscribed topic
Didn't get heartbeat, disconnecting.
Connected to server
Subscribed topic
Didn't get heartbeat, disconnecting.
Error connecting socket
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
Error connecting socket
Error connecting socket
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
Error connecting socket
### RECONNECT WIFI ###
Error connecting socket
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
Error connecting socket
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
Error connecting socket
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
Error connecting socket
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
Error connecting socket
java.lang.IllegalArgumentException: java.net.UnknownHostException: devel.awdoc.com
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:119)
	at zmq.io.net.tcp.TcpAddress.<init>(TcpAddress.java:35)
	at zmq.io.net.Address.resolve(Address.java:96)
	at zmq.SocketBase.connect(SocketBase.java:543)
	at org.zeromq.ZMQ$Socket.connect(ZMQ.java:2531)
	at it.awtech.awdoc.log_listener.App2.main(App2.java:36)
Caused by: java.net.UnknownHostException: devel.awdoc.com
	at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
	at java.net.InetAddress.getAllByName(InetAddress.java:1192)
	at java.net.InetAddress.getAllByName(InetAddress.java:1126)
	at zmq.io.net.tcp.TcpAddress.resolve(TcpAddress.java:110)
	... 5 more
Connected to server
Subscribed topic
### STOP PROGRAM WITH SIGTERM ###
Exiting socket try with resources
Exiting context try with resources
### HANGS FOREVER HERE ###
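
In the log above, "Exiting context try with resources" is printed but "Exited try with resources" never is, so the program is stuck in the implicit close of the context. One way to keep shutdown from blocking indefinitely is to bound the time spent in close(). The following is a minimal sketch using only the JDK; BoundedClose and closeWithTimeout are hypothetical names, not part of JeroMQ, and the sketch only works around the hang rather than fixing it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class BoundedClose {
    /**
     * Runs resource.close() on a daemon thread and waits at most timeoutMillis.
     * Returns true if close() returned in time; false if it is still blocked,
     * threw an exception, or the waiting thread was interrupted.
     */
    static boolean closeWithTimeout(AutoCloseable resource, long timeoutMillis) {
        CountDownLatch done = new CountDownLatch(1);
        Thread closer = new Thread(() -> {
            try {
                resource.close();
                done.countDown();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }, "bounded-close");
        closer.setDaemon(true); // a stuck close() must not keep the JVM alive
        closer.start();
        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        AutoCloseable quick = () -> { };
        AutoCloseable stuck = () -> new CountDownLatch(1).await(); // never returns
        System.out.println("quick closed in time: " + closeWithTimeout(quick, 1000));
        System.out.println("stuck closed in time: " + closeWithTimeout(stuck, 200));
    }
}
```

In the program above, the context could be passed to such a helper instead of being managed by try-with-resources, assuming (as its use in try-with-resources suggests) that Context is an AutoCloseable. If the helper returns false, the context's resources are leaked, which is usually acceptable only when the JVM is about to exit anyway.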

Issue Analytics

  • State: open
  • Created 5 years ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

1 reaction
lultimouomo commented, Sep 18, 2018

The PUB side uses libzmq version 4.1.4.

I implemented heartbeating because in my trials I experienced a hung connection (not receiving messages and not reconnecting) when the SUB and PUB were connected through a VPN: if the VPN drops, the TCP connection stays alive, and when the VPN comes up again the zmq socket does not resume receiving messages. As I understand it, this is the nature of TCP, and some form of heartbeating is needed to fix the problem.

Is the heartbeating included in the latest snapshot compatible with libzmq? Do you have pointers to some documentation on how to enable it?
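
The application-level heartbeating described in this comment boils down to remembering when the last message arrived and treating the peer as dead once a deadline passes. A minimal sketch of that bookkeeping, using only the JDK, might look like the following; HeartbeatMonitor and its methods are hypothetical names introduced for illustration, and the 2-second timeout mirrors the margin used in the issue.

```java
import java.time.Duration;

/** Tracks when the last message arrived and reports whether the peer looks dead. */
final class HeartbeatMonitor {
    private final long timeoutNanos;
    private long lastSeenNanos;

    HeartbeatMonitor(Duration timeout, long nowNanos) {
        this.timeoutNanos = timeout.toNanos();
        this.lastSeenNanos = nowNanos;
    }

    /** Call whenever any message (heartbeat or data) is received. */
    void onMessage(long nowNanos) {
        lastSeenNanos = nowNanos;
    }

    /** True once more than the timeout has elapsed since the last message. */
    boolean isExpired(long nowNanos) {
        // Subtracting first keeps the comparison valid if nanoTime wraps around.
        return nowNanos - lastSeenNanos > timeoutNanos;
    }

    public static void main(String[] args) {
        HeartbeatMonitor monitor =
                new HeartbeatMonitor(Duration.ofSeconds(2), System.nanoTime());
        monitor.onMessage(System.nanoTime());
        long threeSecondsLater = System.nanoTime() + Duration.ofSeconds(3).toNanos();
        System.out.println("expired now: " + monitor.isExpired(System.nanoTime()));
        System.out.println("expired after 3 s of silence: " + monitor.isExpired(threeSecondsLater));
    }
}
```

Passing the timestamp in explicitly keeps the class independent of any socket library and easy to test; in a receive loop, onMessage would be called after each successful receive and isExpired checked after each receive timeout.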

0 reactions
fredoboulo commented, Sep 18, 2018

… As I remember, heartbeats were implemented in libzmq 4.2.x. If you have no access to the code of the PUB side, it might be a dead end.

Anyhow, you can find some documentation here: http://api.zeromq.org/4-2:zmq-setsockopt. Look for ZMQ_HEARTBEAT_IVL, ZMQ_HEARTBEAT_TIMEOUT, and ZMQ_HEARTBEAT_TTL.

You can also have a look at the HeartbeatsTest class. It’s a unit test for zmq, not for org.zeromq, but it might shed some light.
