Client hangs when reconnecting

Hello, I’m facing an issue where the client hangs when it tries to reconnect to the broker.

Application is:

  • ASP.NET Core 2.2
  • MQTTnet version 3.0.2

To test what I’m doing, I have a Mosquitto MQTT message broker running locally in Docker, which I can start (docker start <containerid>) and stop (docker stop <containerid>).

Here is my code:

        public async Task ConnectAsync()
        {
            _mqttClient.UseConnectedHandler(e =>
            {
                _logger.LogWarning(EventIds.ConnectionEstablished, "Connection established to broker");
            });

            _mqttClient.UseDisconnectedHandler(async e =>
            {
                _logger.LogWarning(EventIds.Disconnected, "Disconnected from broker");

                if (_isDisconnecting) return;

                await Retry();  //.ConfigureAwait(false);
            });

            await _mqttClient.ConnectAsync(_clientOptions).ConfigureAwait(false);
        }

        private async Task Retry()
        {
            _logger.LogWarning(EventIds.Disconnected, "Retrying in 5 seconds");
            await Task.Delay(TimeSpan.FromSeconds(5));

            try
            {
                await _mqttClient.ConnectAsync(_clientOptions); //.ConfigureAwait(false);
            }
            catch
            {
                _logger.LogError("####### Reconnect failed #######");
            }
        }

When I start the app with the message broker off, it works as expected. The disconnect handler is fired every time it fails to connect, resulting in a 5-second delay and then another reconnection attempt. Switching the broker back on results in the client reconnecting and carrying on as normal, sending pings.

The problem comes when the client has already established a connection and the broker is then stopped. The disconnect handler is fired as expected, but when the client tries to reconnect, the application hangs.

Here are the logs:

[17:43:34.501 VRB] MqttNetLogEvent. LogId: null, Message: Stopped sending keep alive packets., Source: MqttClient, ThreadId: 4, Timestamp: 05/28/2019 16:43:34
[17:43:34.505 VRB] MqttNetLogEvent. LogId: null, Message: Disconnecting [Timeout=00:00:10], Source: MqttClient, ThreadId: 4, Timestamp: 05/28/2019 16:43:34
[17:43:34.509 VRB] MqttNetLogEvent. LogId: null, Message: Disconnected from adapter., Source: MqttClient, ThreadId: 4, Timestamp: 05/28/2019 16:43:34
[17:43:34.512 INF] MqttNetLogEvent. LogId: null, Message: Disconnected., Source: MqttClient, ThreadId: 4, Timestamp: 05/28/2019 16:43:34
[17:43:34.515 WRN] Disconnected from broker
[17:43:34.517 WRN] Retrying in 5 seconds
[17:43:39.532 VRB] MqttNetLogEvent. LogId: null, Message: Trying to connect with server 'localhost:1883' (Timeout=00:00:10)., Source: MqttClient, ThreadId: 8, Timestamp: 05/28/2019 16:43:39
[17:43:41.766 ERR] MqttNetLogEvent. LogId: null, Message: Error while connecting with server., Source: MqttClient, ThreadId: 5, Timestamp: 05/28/2019 16:43:41
MQTTnet.Exceptions.MqttCommunicationException: No connection could be made because the target machine actively refused it [::ffff:127.0.0.1]:1883 ---> System.Net.Internals.SocketExceptionFactory+ExtendedSocketException: No connection could be made because the target machine actively refused it [::ffff:127.0.0.1]:1883
   at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)
   at System.Net.Sockets.Socket.DoMultipleAddressConnectCallback(Object result, MultipleAddressConnectAsyncResult context)
--- End of stack trace from previous location where exception was thrown ---
   at System.Net.Sockets.Socket.DoMultipleAddressConnectCallback(Object result, MultipleAddressConnectAsyncResult context)
   at System.Net.Sockets.Socket.MultipleAddressConnectCallback(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)
   at System.Net.Sockets.Socket.<>c.<ConnectAsync>b__274_0(IAsyncResult iar)
--- End of stack trace from previous location where exception was thrown ---
   at MQTTnet.Implementations.MqttTcpChannel.ConnectAsync(CancellationToken cancellationToken) in C:\Users\lbostock\Source\Repos\MQTTnet\Source\MQTTnet\Implementations\MqttTcpChannel.cs:line 67
   at MQTTnet.Adapter.MqttChannelAdapter.<ConnectAsync>b__30_0(CancellationToken t) in C:\Users\lbostock\Source\Repos\MQTTnet\Source\MQTTnet\Adapter\MqttChannelAdapter.cs:line 70
   at MQTTnet.Internal.MqttTaskTimeout.WaitAsync(Func`2 action, TimeSpan timeout, CancellationToken cancellationToken) in C:\Users\lbostock\Source\Repos\MQTTnet\Source\MQTTnet\Internal\MqttTaskTimeout.cs:line 19
   at MQTTnet.Adapter.MqttChannelAdapter.ConnectAsync(TimeSpan timeout, CancellationToken cancellationToken) in C:\Users\lbostock\Source\Repos\MQTTnet\Source\MQTTnet\Adapter\MqttChannelAdapter.cs:line 70
   --- End of inner exception stack trace ---
   at MQTTnet.Adapter.MqttChannelAdapter.WrapException(Exception exception) in C:\Users\lbostock\Source\Repos\MQTTnet\Source\MQTTnet\Adapter\MqttChannelAdapter.cs:line 315
   at MQTTnet.Adapter.MqttChannelAdapter.ConnectAsync(TimeSpan timeout, CancellationToken cancellationToken) in C:\Users\lbostock\Source\Repos\MQTTnet\Source\MQTTnet\Adapter\MqttChannelAdapter.cs:line 80
   at MQTTnet.Client.MqttClient.ConnectAsync(IMqttClientOptions options, CancellationToken cancellationToken) in C:\Users\lbostock\Source\Repos\MQTTnet\Source\MQTTnet\Client\MqttClient.cs:line 81
[17:43:41.818 VRB] MqttNetLogEvent. LogId: null, Message: Disconnecting [Timeout=00:00:10], Source: MqttClient, ThreadId: 5, Timestamp: 05/28/2019 16:43:41

I pulled the MQTTnet source code from master and can see that it hangs at the call marked below:

        private async Task DisconnectInternalAsync(Task sender, Exception exception, MqttClientAuthenticateResult authenticateResult)
        {
            var clientWasConnected = IsConnected;

            InitiateDisconnect();

            IsConnected = false;

            try
            {
                if (_adapter != null)
                {
                    _logger.Verbose("Disconnecting [Timeout={0}]", Options.CommunicationTimeout);
                    await _adapter.DisconnectAsync(Options.CommunicationTimeout, CancellationToken.None).ConfigureAwait(false);
                }

                /*HANGS HERE >>>>>*/ await WaitForTaskAsync(_packetReceiverTask, sender).ConfigureAwait(false);
                await WaitForTaskAsync(_keepAlivePacketsSenderTask, sender).ConfigureAwait(false);

                _logger.Verbose("Disconnected from adapter.");
            }
            catch (Exception adapterException)
            {
                _logger.Warning(adapterException, "Error while disconnecting from adapter.");
            }
            finally
            {
                Dispose();
                _cleanDisconnectInitiated = false;

                _logger.Info("Disconnected.");

                var disconnectedHandler = DisconnectedHandler;
                if (disconnectedHandler != null)
                {
                    await disconnectedHandler.HandleDisconnectedAsync(new MqttClientDisconnectedEventArgs(clientWasConnected, exception, authenticateResult)).ConfigureAwait(false);
                }
            }
        }
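
One possible reading of the hang, not confirmed anywhere in this thread: the disconnected handler may be invoked from inside the packet receiver task, and the failed reconnect that the handler awaits then waits for that same task, so neither can finish. The sketch below does not use MQTTnet at all. It is a standalone simulation with hypothetical names (`SelfWaitDemo`, `ReceiverCompletesAsync`, `reconnect`, `handler`) that shows a task waiting on itself through a handler, and how detaching the reconnect from the handler breaks the cycle.

```csharp
using System;
using System.Threading.Tasks;

public static class SelfWaitDemo
{
    // Returns true if the simulated receiver task finishes within the timeout.
    public static async Task<bool> ReceiverCompletesAsync(bool awaitReconnectInHandler, TimeSpan timeout)
    {
        Task receiverTask = null;
        var start = new TaskCompletionSource<bool>();

        // Stands in for a failed reconnect whose cleanup waits for the receiver task.
        Func<Task> reconnect = async () => await receiverTask;

        // Stands in for the disconnected handler.
        Func<Task> handler = () =>
        {
            if (awaitReconnectInHandler)
            {
                // The handler does not finish until the reconnect does.
                return reconnect();
            }

            // The reconnect runs on its own; the handler returns immediately.
            _ = Task.Run(reconnect);
            return Task.CompletedTask;
        };

        // Stands in for the receiver task, which invokes the handler when the connection drops.
        async Task ReceiverAsync()
        {
            await start.Task;
            await handler();
        }

        receiverTask = ReceiverAsync();
        start.SetResult(true); // release the receiver only after receiverTask is assigned

        var first = await Task.WhenAny(receiverTask, Task.Delay(timeout));
        return first == receiverTask;
    }

    public static async Task Main()
    {
        var timeout = TimeSpan.FromSeconds(1);
        Console.WriteLine(await ReceiverCompletesAsync(true, timeout));  // False: the task waits on itself
        Console.WriteLine(await ReceiverCompletesAsync(false, timeout)); // True: the handler returned first
    }
}
```

If this reading applies, the equivalent change in the handler above would be to start `Retry()` without awaiting it inside the handler. That makes the retry fire-and-forget, so any exception it throws has to be caught and logged inside `Retry()` itself, which the existing try/catch already does for the connect call.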

I’ll be the first to admit that I’m not completely comfortable with async/await, hence me trying .ConfigureAwait(false) on the calls, but I’m really at a loss.

It’s also worth noting that we have a spike written against MQTTnet 2.8.5. The code is nearly identical, except that it uses an event handler:

            _client.Disconnected += async (s, e) =>
            {
                OpenCircuitBreaker();
                
                // Disconnected event is fired when DisconnectAsync is called.  Don't want to reconnect when the service is stopping
                if (_isDisconnecting)
                {
                    return;
                }

                _logger.LogWarning(EventIds.ConnectionLost, "Connection lost. Attempting to reconnect in {ReconnectBackoffMilliseconds} milliseconds",
                    _config.ReconnectBackoffMilliseconds);

                await Task.Delay(TimeSpan.FromMilliseconds(_config.ReconnectBackoffMilliseconds));
                await ConnectInternalAsync();
            };

This works fine.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 21

Top GitHub Comments

milamber-ls commented, Jul 1, 2019 (2 reactions)

@milamber-ls better formatting would make your posts more readable; please take a look here: https://help.github.com/en/articles/creating-and-highlighting-code-blocks

You are right! Just edited the post.

paolofulgoni commented, Jun 30, 2019 (2 reactions)

@milamber-ls better formatting would make your posts more readable; please take a look here: https://help.github.com/en/articles/creating-and-highlighting-code-blocks
