Can't reconnect if the server was restarted
Situation: Server v1.3, with authentication, client v2.1.0. Initialization code:
Options.Builder builder = new Options.Builder()
    .server(<...>)
    .userInfo(<...>,<...>)
    .maxReconnects(-1) // -1 = keep trying to reconnect indefinitely
    .pingInterval(Duration.ofMillis(PING_INTERVAL_MS));
Options opts = builder.build();
// Nats.connect returns a Connection in the v2 client
Connection nats = Nats.connect(opts);
Old behavior (Nats v1.0.4, Jnats v1.0.0): server started -> client connects -> client pushes messages -> server restarted -> client tries to reconnect -> ok -> client pushes messages.
New behavior: server started -> client connects -> client pushes messages -> server restarted -> client tries to reconnect -> gets an error -> can’t push.
Server log:
[27118] 2018/09/12 12:33:30.236462 [DBG] 127.0.0.1:47154 - cid:10 - Client connection created
[27118] 2018/09/12 12:33:30.253822 [ERR] 127.0.0.1:47154 - cid:10 - Authorization Error - User ""
[27118] 2018/09/12 12:33:30.253869 [TRC] 127.0.0.1:47154 - cid:10 - <<- [-ERR Authorization Violation]
[27118] 2018/09/12 12:33:30.253895 [DBG] 127.0.0.1:47154 - cid:10 - Client connection closed
So it seems that the client relies on an authentication cache and doesn’t send its credentials again when it reconnects.
Issue Analytics
- Created 5 years ago
- Comments: 14 (7 by maintainers)
Top GitHub Comments
My fix provides a fast path for all the protocol messages until the reconnect process is complete, so all the subscriptions are set up before the protocol messages start using the same queue as the regular messages. This is definitely something that would be very hard to work around, but I think the changes in v2.1.1 (the branch is there if you want to look) should fix the issue, because they allow the connect, ping and subscribe messages to jump the queue.
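To make the "jump the queue" idea concrete, here is a minimal sketch using only the Java standard library. It is not the jnats implementation: the class `OutgoingQueueSketch` and all of its method names are hypothetical, and the message strings are simplified stand-ins for real protocol lines. It only illustrates the ordering described above, where protocol messages queued during a reconnect are written before any regular messages that were already buffered.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical outgoing buffer with a fast path; an illustration, not jnats code. */
final class OutgoingQueueSketch {
    private final Queue<String> fastPath = new ArrayDeque<>(); // connect, ping, subscribe
    private final Queue<String> regular = new ArrayDeque<>();  // everything else
    private boolean reconnecting = false;

    synchronized void beginReconnect() { reconnecting = true; }

    synchronized void endReconnect() { reconnecting = false; }

    /** Protocol messages use the fast path only while a reconnect is in progress. */
    synchronized void queueProtocol(String msg) {
        if (reconnecting) {
            fastPath.add(msg);
        } else {
            regular.add(msg);
        }
    }

    synchronized void queueRegular(String msg) { regular.add(msg); }

    /** Returns messages in the order a writer would send them: fast path first. */
    synchronized List<String> drain() {
        List<String> out = new ArrayList<>();
        while (!fastPath.isEmpty()) out.add(fastPath.poll());
        while (!regular.isEmpty()) out.add(regular.poll());
        return out;
    }

    /** Simulates publishes buffered during an outage, followed by a reconnect. */
    static List<String> demo() {
        OutgoingQueueSketch q = new OutgoingQueueSketch();
        q.queueRegular("PUB a");      // buffered while the server is down
        q.queueRegular("PUB b");
        q.beginReconnect();
        q.queueProtocol("CONNECT");   // must reach the server before any PUB
        q.queueProtocol("PING");
        q.queueProtocol("SUB x");
        List<String> order = q.drain();
        q.endReconnect();
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [CONNECT, PING, SUB x, PUB a, PUB b]
    }
}
```

With a single shared queue, the two buffered publishes would be written ahead of the connect message, which would be consistent with the server log above reporting an authorization error for an empty user.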
I checked whether I could keep the publisher from reconnecting by waiting on a semaphore in the event listener, but it didn’t work - reconnecting happens in another thread.
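The following standard-library sketch shows why blocking inside a listener cannot pause work that runs on a different thread. It is a simulation under stated assumptions, not jnats code: the "listener" and "reconnect" threads and the class name `ListenerThreadSketch` are hypothetical stand-ins for the client's internals.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Simulation only: a blocked listener thread does not stall a separate worker thread. */
final class ListenerThreadSketch {

    /** Returns true if the simulated reconnect finished while the listener was still blocked. */
    static boolean reconnectFinishesWhileListenerBlocks() throws InterruptedException {
        Semaphore gate = new Semaphore(0); // no permits: acquire blocks until release
        CountDownLatch listenerEntered = new CountDownLatch(1);
        CountDownLatch reconnectDone = new CountDownLatch(1);

        // Stand-in for an event listener callback that waits on a semaphore.
        Thread listener = new Thread(() -> {
            listenerEntered.countDown();
            gate.acquireUninterruptibly();
        }, "listener");

        // Stand-in for the reconnect logic, running on its own thread.
        Thread reconnect = new Thread(() -> {
            try {
                listenerEntered.await();   // the listener is now inside its callback
                reconnectDone.countDown(); // reconnect proceeds regardless
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "reconnect");

        listener.start();
        reconnect.start();

        boolean finished = reconnectDone.await(2, TimeUnit.SECONDS);
        boolean listenerStillBlocked = listener.isAlive();

        gate.release(); // let the listener exit so the threads can be joined
        listener.join();
        reconnect.join();
        return finished && listenerStillBlocked;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(reconnectFinishesWhileListenerBlocks()); // true
    }
}
```

Holding the reconnect back would require a hook that runs on the reconnecting thread itself, which a listener invoked elsewhere does not provide.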