Websocket reconnect issue with multiple clients: getting ghost clients
Defect
Make sure that these boxes are checked before submitting your issue – thank you!
- Included nats-server -DV output
- Included a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve)
Versions of nats-server and affected client libraries used:
nats-server version 2.2.0-beta.34 (Docker image synadia/nats-server:nightly-20201204); nats.ws client version 1.0.0-114, as reported in the CONNECT traces below
OS/Container environment:
Linux / Docker container
Steps or code to reproduce the issue:
Create two basic nats.ws clients with the following connection options:
{
  name: 'client-',
  servers: 'ws://localhost:9222',
  token: '3secret',
  timeout: 2000,
  noEcho: true,
  maxReconnectAttempts: 10,
  pingInterval: 3000,
  maxPingOut: 2,
}
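A minimal sketch of how each test client might use these options and record its connection status, assuming the nats.ws `connect()`, `status()` and `closed()` API. To keep the snippet self-contained, the connect function is passed in as a parameter; the `Conn`, `StatusEvent` and `ConnectFn` types are hypothetical stand-ins for the library's own types, and `runClient` is an illustrative name, not code from the attached archive.

```typescript
// Assumed minimal shapes of the nats.ws connection parts used below (stand-ins, not the library's types).
interface StatusEvent { type: string; data: unknown }
interface Conn {
  status(): AsyncIterable<StatusEvent>;
  closed(): Promise<void | Error>;
}

// Connection options copied from the report.
const options = {
  name: "client-",
  servers: "ws://localhost:9222",
  token: "3secret",
  timeout: 2000,
  noEcho: true,
  maxReconnectAttempts: 10,
  pingInterval: 3000,
  maxPingOut: 2,
};

type ConnectFn = (opts: typeof options) => Promise<Conn>;

// Connects one client, then records every status event and the final close reason.
async function runClient(connect: ConnectFn, label: string, log: string[]): Promise<void> {
  const nc = await connect(options);
  log.push(`${label}: connected`);

  // Watch status events in the background; the iterator ends when the connection closes.
  const watcher = (async () => {
    for await (const s of nc.status()) {
      log.push(`${label}: ${s.type}`);
    }
  })();

  // closed() resolves with an Error if the connection was closed because of a failure.
  const err = await nc.closed();
  await watcher;
  log.push(`${label}: closed${err instanceof Error ? ` (${err.message})` : ""}`);
}
```

With the real library, the two clients would be started by passing the `connect` function imported from nats.ws, for example `runClient(connect, "client 1", log)` and `runClient(connect, "client 2", log)`. In the failing case described below, the log would be expected to end with a close reason of Authentication Timeout instead of a reconnect event.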
Create a nats server with the following Docker image: synadia/nats-server:nightly-20201204 and the following config:
listen: 127.0.0.1:4222
debug: false
trace: true
authorization: {
  token: "3secret"
}
websocket: {
  listen: 0.0.0.0:9222
  no_tls: true
}
1. Start the server
2. Connect client 1
3. Connect client 2
4. Wait 5s
5. Kill the server
6. Wait 5s
7. Start the server
This will help reproduction: issue-1778.zip
Expected result:
The server starts and both clients try to reconnect. The server shows two “Client connection created” messages and two “CONNECT” messages, which authenticate fine. The clients get a “reconnect” status message.
[1] 2020/12/17 16:03:02.342424 [INF] Starting nats-server version 2.2.0-beta.34
[1] 2020/12/17 16:03:02.342526 [DBG] Go build version go1.14.12
[1] 2020/12/17 16:03:02.342532 [INF] Git commit [8e095759]
[1] 2020/12/17 16:03:02.342563 [INF] Using configuration file: /nats/server.conf
[1] 2020/12/17 16:03:02.342615 [DBG] Created system account: "$SYS"
[1] 2020/12/17 16:03:02.343162 [INF] Listening for websocket clients on ws://0.0.0.0:9222
[1] 2020/12/17 16:03:02.343199 [WRN] Websocket not configured with TLS. DO NOT USE IN PRODUCTION!
[1] 2020/12/17 16:03:02.343208 [DBG] Get non local IPs for "0.0.0.0"
[1] 2020/12/17 16:03:02.343425 [DBG] ip=172.17.0.2
[1] 2020/12/17 16:03:02.343543 [INF] Listening for client connections on 127.0.0.1:4222
[1] 2020/12/17 16:03:02.343579 [INF] Server id is NAZRKRPGYC56ZINY76B4R6OTGSKMI2VZB5KO4AVCYDD7AGSNH54QJAIF
[1] 2020/12/17 16:03:02.343584 [INF] Server name is NAZRKRPGYC56ZINY76B4R6OTGSKMI2VZB5KO4AVCYDD7AGSNH54QJAIF
[1] 2020/12/17 16:03:02.343588 [INF] Server is ready
[1] 2020/12/17 16:03:02.637848 [DBG] 172.17.0.1:45256 - wid:2 - Client connection created
[1] 2020/12/17 16:03:02.642905 [TRC] 172.17.0.1:45256 - wid:2 - <<- [CONNECT {"protocol":1,"version":"1.0.0-114","lang":"nats.ws","echo":false,"verbose":false,"pedantic":false,"name":"client-","auth_token":"3secret"}]
[1] 2020/12/17 16:03:02.643120 [TRC] 172.17.0.1:45256 - wid:2 - "v1.0.0-114:nats.ws:client-" - <<- [PING]
[1] 2020/12/17 16:03:02.643167 [TRC] 172.17.0.1:45256 - wid:2 - "v1.0.0-114:nats.ws:client-" - ->> [PONG]
[1] 2020/12/17 16:03:02.648342 [DBG] 172.17.0.1:45260 - wid:3 - Client connection created
[1] 2020/12/17 16:03:02.654492 [TRC] 172.17.0.1:45260 - wid:3 - <<- [CONNECT {"protocol":1,"version":"1.0.0-114","lang":"nats.ws","echo":false,"verbose":false,"pedantic":false,"name":"client-","auth_token":"3secret"}]
[1] 2020/12/17 16:03:02.654558 [TRC] 172.17.0.1:45260 - wid:3 - "v1.0.0-114:nats.ws:client-" - <<- [PING]
[1] 2020/12/17 16:03:02.654564 [TRC] 172.17.0.1:45260 - wid:3 - "v1.0.0-114:nats.ws:client-" - ->> [PONG]
[1] 2020/12/17 16:03:04.752051 [DBG] 172.17.0.1:45256 - wid:2 - "v1.0.0-114:nats.ws:client-" - Client Ping Timer
[1] 2020/12/17 16:03:04.752120 [TRC] 172.17.0.1:45256 - wid:2 - "v1.0.0-114:nats.ws:client-" - ->> [PING]
[1] 2020/12/17 16:03:04.756260 [TRC] 172.17.0.1:45256 - wid:2 - "v1.0.0-114:nats.ws:client-" - <<- [PONG]
[1] 2020/12/17 16:03:04.896866 [DBG] 172.17.0.1:45260 - wid:3 - "v1.0.0-114:nats.ws:client-" - Client Ping Timer
[1] 2020/12/17 16:03:04.896939 [TRC] 172.17.0.1:45260 - wid:3 - "v1.0.0-114:nats.ws:client-" - ->> [PING]
[1] 2020/12/17 16:03:04.902107 [TRC] 172.17.0.1:45260 - wid:3 - "v1.0.0-114:nats.ws:client-" - <<- [PONG]
[1] 2020/12/17 16:03:05.666494 [TRC] 172.17.0.1:45256 - wid:2 - "v1.0.0-114:nats.ws:client-" - <<- [PING]
[1] 2020/12/17 16:03:05.666565 [TRC] 172.17.0.1:45256 - wid:2 - "v1.0.0-114:nats.ws:client-" - ->> [PONG]
[1] 2020/12/17 16:03:06.516147 [TRC] 172.17.0.1:45260 - wid:3 - "v1.0.0-114:nats.ws:client-" - <<- [PING]
[1] 2020/12/17 16:03:06.516183 [TRC] 172.17.0.1:45260 - wid:3 - "v1.0.0-114:nats.ws:client-" - ->> [PONG]
Actual result:
Sometimes the expected result occurs. Other times, the server logs 4 “Client connection created” messages with 4 different wids, but only two CONNECT messages are received. The connections that never send a CONNECT message get an Authentication Timeout (a consequence of auth being enabled), and the clients fail with “NATS connection closed ‘Authentication Timeout’”.
[1] 2020/12/17 16:03:13.638188 [INF] Starting nats-server version 2.2.0-beta.34
[1] 2020/12/17 16:03:13.638290 [DBG] Go build version go1.14.12
[1] 2020/12/17 16:03:13.638296 [INF] Git commit [8e095759]
[1] 2020/12/17 16:03:13.638301 [INF] Using configuration file: /nats/server.conf
[1] 2020/12/17 16:03:13.638331 [DBG] Created system account: "$SYS"
[1] 2020/12/17 16:03:13.638993 [INF] Listening for websocket clients on ws://0.0.0.0:9222
[1] 2020/12/17 16:03:13.639030 [WRN] Websocket not configured with TLS. DO NOT USE IN PRODUCTION!
[1] 2020/12/17 16:03:13.639039 [DBG] Get non local IPs for "0.0.0.0"
[1] 2020/12/17 16:03:13.639426 [DBG] ip=172.17.0.2
[1] 2020/12/17 16:03:13.639545 [INF] Listening for client connections on 127.0.0.1:4222
[1] 2020/12/17 16:03:13.639577 [INF] Server id is NAOYR6QS2GMEQG4TFZF7D3SNFWEL536KSW57ZJUTSPEMZ5UGFF6UBXHM
[1] 2020/12/17 16:03:13.639582 [INF] Server name is NAOYR6QS2GMEQG4TFZF7D3SNFWEL536KSW57ZJUTSPEMZ5UGFF6UBXHM
[1] 2020/12/17 16:03:13.639586 [INF] Server is ready
[1] 2020/12/17 16:03:13.931292 [DBG] 172.17.0.1:45276 - wid:2 - Client connection created
[1] 2020/12/17 16:03:13.948180 [DBG] 172.17.0.1:45280 - wid:3 - Client connection created
[1] 2020/12/17 16:03:13.960071 [DBG] 172.17.0.1:45284 - wid:4 - Client connection created
[1] 2020/12/17 16:03:13.988728 [DBG] 172.17.0.1:45288 - wid:5 - Client connection created
[1] 2020/12/17 16:03:13.993530 [TRC] 172.17.0.1:45288 - wid:5 - <<- [CONNECT {"protocol":1,"version":"1.0.0-114","lang":"nats.ws","echo":false,"verbose":false,"pedantic":false,"name":"client-","auth_token":"3secret"}]
[1] 2020/12/17 16:03:13.993665 [TRC] 172.17.0.1:45288 - wid:5 - "v1.0.0-114:nats.ws:client-" - <<- [PING]
[1] 2020/12/17 16:03:13.993701 [TRC] 172.17.0.1:45288 - wid:5 - "v1.0.0-114:nats.ws:client-" - ->> [PONG]
[1] 2020/12/17 16:03:15.932328 [TRC] 172.17.0.1:45276 - wid:2 - ->> [-ERR Authentication Timeout]
[1] 2020/12/17 16:03:15.932381 [DBG] 172.17.0.1:45276 - wid:2 - Authentication Timeout
[1] 2020/12/17 16:03:15.932411 [DBG] 172.17.0.1:45276 - wid:2 - Client connection closed: Authentication Timeout
[1] 2020/12/17 16:03:15.948879 [TRC] 172.17.0.1:45280 - wid:3 - ->> [-ERR Authentication Timeout]
[1] 2020/12/17 16:03:15.948917 [DBG] 172.17.0.1:45280 - wid:3 - Authentication Timeout
[1] 2020/12/17 16:03:15.948946 [DBG] 172.17.0.1:45280 - wid:3 - Client connection closed: Authentication Timeout
[1] 2020/12/17 16:03:15.960466 [TRC] 172.17.0.1:45284 - wid:4 - ->> [-ERR Authentication Timeout]
[1] 2020/12/17 16:03:15.960566 [DBG] 172.17.0.1:45284 - wid:4 - Authentication Timeout
[1] 2020/12/17 16:03:15.960575 [DBG] 172.17.0.1:45284 - wid:4 - Client connection closed: Authentication Timeout
[1] 2020/12/17 16:03:15.971841 [DBG] 172.17.0.1:45294 - wid:6 - Client connection created
[1] 2020/12/17 16:03:15.976952 [TRC] 172.17.0.1:45294 - wid:6 - <<- [CONNECT {"protocol":1,"version":"1.0.0-114","lang":"nats.ws","echo":false,"verbose":false,"pedantic":false,"name":"client-","auth_token":"3secret"}]
[1] 2020/12/17 16:03:15.995103 [DBG] 172.17.0.1:45288 - wid:5 - "v1.0.0-114:nats.ws:client-" - Client Ping Timer
[1] 2020/12/17 16:03:15.995162 [TRC] 172.17.0.1:45288 - wid:5 - "v1.0.0-114:nats.ws:client-" - ->> [PING]
[1] 2020/12/17 16:03:15.998132 [TRC] 172.17.0.1:45288 - wid:5 - "v1.0.0-114:nats.ws:client-" - <<- [PONG]
[1] 2020/12/17 16:03:16.021763 [TRC] 172.17.0.1:45294 - wid:6 - "v1.0.0-114:nats.ws:client-" - <<- [PING]
[1] 2020/12/17 16:03:16.021813 [TRC] 172.17.0.1:45294 - wid:6 - "v1.0.0-114:nats.ws:client-" - ->> [PONG]
[1] 2020/12/17 16:03:17.017397 [TRC] 172.17.0.1:45288 - wid:5 - "v1.0.0-114:nats.ws:client-" - <<- [PING]
[1] 2020/12/17 16:03:17.017435 [TRC] 172.17.0.1:45288 - wid:5 - "v1.0.0-114:nats.ws:client-" - ->> [PONG]
[1] 2020/12/17 16:03:18.234340 [DBG] 172.17.0.1:45294 - wid:6 - "v1.0.0-114:nats.ws:client-" - Client Ping Timer
[1] 2020/12/17 16:03:18.234407 [TRC] 172.17.0.1:45294 - wid:6 - "v1.0.0-114:nats.ws:client-" - ->> [PING]
[1] 2020/12/17 16:03:18.236898 [TRC] 172.17.0.1:45294 - wid:6 - "v1.0.0-114:nats.ws:client-" - <<- [PONG]
If the problem doesn’t occur the first time, repeat steps 5-7 (kill the server, wait 5s, start the server). I usually hit the problem after 2-3 tries, so it may be timing-related.
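To spot the ghost connections without reading the trace by eye, the server output can be grouped by its wid tag. The following is a rough sketch, assuming the log line format shown above; `summarize`, `findGhosts` and the `ConnInfo` shape are hypothetical helper names, and "ghost" is used here as a working definition: a connection that was created but never sent CONNECT.

```typescript
// Per-connection summary extracted from nats-server debug/trace output.
interface ConnInfo {
  wid: string;
  created: boolean;
  connected: boolean;
  authTimeout: boolean;
}

// Groups log lines by their "wid:N" tag and records which events each connection saw.
function summarize(log: string): ConnInfo[] {
  const conns = new Map<string, ConnInfo>();
  for (const line of log.split("\n")) {
    const m = line.match(/ - wid:(\d+) - /);
    if (!m) continue; // server startup lines carry no wid
    const wid = m[1];
    let info = conns.get(wid);
    if (!info) {
      info = { wid, created: false, connected: false, authTimeout: false };
      conns.set(wid, info);
    }
    if (line.includes("Client connection created")) info.created = true;
    if (line.includes("<<- [CONNECT ")) info.connected = true;
    if (line.includes("Client connection closed: Authentication Timeout")) info.authTimeout = true;
  }
  return Array.from(conns.values());
}

// Returns the wids of connections that were created but never sent CONNECT.
function findGhosts(log: string): string[] {
  return summarize(log)
    .filter((c) => c.created && !c.connected)
    .map((c) => c.wid);
}

// Four lines copied from the "Actual result" log above.
const excerpt = `[1] 2020/12/17 16:03:13.931292 [DBG] 172.17.0.1:45276 - wid:2 - Client connection created
[1] 2020/12/17 16:03:13.988728 [DBG] 172.17.0.1:45288 - wid:5 - Client connection created
[1] 2020/12/17 16:03:13.993530 [TRC] 172.17.0.1:45288 - wid:5 - <<- [CONNECT {"protocol":1,"version":"1.0.0-114","lang":"nats.ws","echo":false,"verbose":false,"pedantic":false,"name":"client-","auth_token":"3secret"}]
[1] 2020/12/17 16:03:15.932411 [DBG] 172.17.0.1:45276 - wid:2 - Client connection closed: Authentication Timeout`;

console.log(findGhosts(excerpt)); // wid 2 is flagged, wid 5 is not
```

Run against the full "Actual result" log, this flags wid 2, 3 and 4, each of which is closed with Authentication Timeout about two seconds after it was created, while wid 5 and 6 complete CONNECT normally.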
Top GitHub Comments
@mullerch Thank you very much for your detailed last test case - I was able to reproduce the issue. I will publish the fix as soon as it is merged into master.
Run npm install nats.ws@latest to get the new changes.