client doesn't disconnect when ping task exits
Hi,
I’m chasing some kind of deadlock that causes a client (a socketio client to be precise) not to return from its waiting state. So far I’ve only seen it a few times in my production environment when the wifi network was unstable.
On my development machine I’ve been trying to replicate the problem, but haven’t succeeded so far. While trying though, I ran into the following “similar” problematic behaviour, which might point to the same root cause … maybe 😃
So, a plain socketio client connects to a flask-socketio server and after connecting, I put the server to sleep (ctrl+z). The logs of the client then look like this:
[2019-07-11 21:35:54 +0200] [engineio.client] [5898] [INFO] Received packet PONG data None
[2019-07-11 21:35:54 +0200] [engineio.client] [5898] [INFO] Sending polling GET request to https://localhost:8000/socket.io/?transport=polling&EIO=3&sid=47cff736ceb549b9974fd0926f12ed2f
Here I put the server to sleep…
[2019-07-11 21:36:19 +0200] [engineio.client] [5898] [INFO] Sending packet PING data None
[2019-07-11 21:36:44 +0200] [engineio.client] [5898] [INFO] PONG response has not been received, aborting
[2019-07-11 21:36:44 +0200] [engineio.client] [5898] [INFO] Exiting ping task
And the client stays in this state indefinitely. It seems to “think” it’s still connected even though the ping task has ended. I would expect it to conclude that it has lost its connection to the server, consider itself disconnected, and return from the wait, handing control back to the application, which can then respond by retrying the connection or whatever 😉
Also, this little test app sends out messages from a background task. Again, I would expect that failing to send such a message would make the client realise it is no longer connected. But long after the previous log excerpt, I still see this:
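To make the expected behaviour concrete, here is a minimal sketch using only the standard library. It is a toy model, not engineio's actual code: the `ToyClient` class, its attributes, and the `server_alive` flag are all hypothetical names invented for illustration. The point it shows is that when the ping task gives up, it also marks the client as disconnected and wakes up whoever is blocked in `wait()`.

```python
import threading
import time


class ToyClient:
    """Hypothetical stand-in for a client; NOT engineio's implementation."""

    def __init__(self, ping_interval=0.05, ping_timeout=0.05):
        self.ping_interval = ping_interval
        self.ping_timeout = ping_timeout
        self.state = "connected"
        self.server_alive = True           # set to False to simulate ctrl+z
        self._pong = threading.Event()     # set whenever a PONG arrives
        self._closed = threading.Event()   # set once the connection is given up

    def _send_ping(self):
        # Simulated transport: a live server answers right away,
        # a suspended one never answers.
        if self.server_alive:
            self._pong.set()

    def _ping_loop(self):
        while self.state == "connected":
            self._pong.clear()
            self._send_ping()
            if not self._pong.wait(self.ping_timeout):
                # Key step: do not just exit the ping task. Also mark the
                # client as disconnected and release anyone in wait().
                self.state = "disconnected"
                self._closed.set()
                return
            time.sleep(self.ping_interval)

    def start(self):
        threading.Thread(target=self._ping_loop, daemon=True).start()

    def wait(self, timeout=None):
        # Blocks until the client gives up; returns False if `timeout` expires.
        return self._closed.wait(timeout)


client = ToyClient()
client.start()
time.sleep(0.2)                  # server is responsive, client stays connected
state_while_alive = client.state
client.server_alive = False      # "put the server to sleep"
returned = client.wait(timeout=2.0)
```

With this design the application regains control shortly after the missed PONG and can decide for itself whether to retry the connection.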
[2019-07-11 21:50:54 +0200] [engineio.client] [5898] [INFO] Sending packet MESSAGE data 2["report","hello from demo-client"]
This is almost 15 minutes after the ping task exited, and even failing to actually send this message doesn’t make the client terminate its wait.
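One way a client could notice that a send went nowhere is to put a timeout on the underlying network operation. The sketch below is only an illustration with plain sockets, not how socketio or engineio send packets: the `send_with_timeout` helper and the "silent server" (a listening socket that never accepts or replies, roughly mimicking a suspended process) are assumptions made for the example.

```python
import socket


def send_with_timeout(address, payload, timeout=0.2):
    """Hypothetical helper: True if the peer answered within `timeout`."""
    try:
        # The timeout applies to connect, send and recv on this socket.
        with socket.create_connection(address, timeout=timeout) as sock:
            sock.sendall(payload)
            return bool(sock.recv(1024))
    except OSError:
        # Covers socket.timeout as well as refused or reset connections.
        return False


# A listening socket that never calls accept() stands in for the
# suspended server: the connection is queued, but nothing ever replies.
silent_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent_server.bind(("127.0.0.1", 0))
silent_server.listen(1)

answered = send_with_timeout(silent_server.getsockname(), b"hello from demo-client")
state = "connected" if answered else "disconnected"
silent_server.close()
```

Without the timeout, the `recv` call would block forever, which looks a lot like the endless wait described above.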
Now, somehow this will be technically correct 😉 but in a way this also seems “wrong” ?!
(Sorry for the late reply, laptop was with service center to fix an issue under extended warranty.)
I’ve reviewed your commit, and you applied the same fixes I applied to my custom class to introduce the timeouts. With that fix in place, I had 200+ clients running for over two weeks, so it sure did the trick. I’ll continue working on this project in September, so I can do more testing then. If any issue (re)occurs, I’ll follow up. But for now, I would surely consider it closed 😃
@miguelgrinberg, sorry for the late reply. I just tested with the master branch, and the problem seems to be fixed now. The disconnect event gets triggered and the client reconnects, cool.