[Discuss] Pulsar client : Connect Command add `keep_alive_interval`

Motivation

While investigating #13342, we found that both the client and the broker have a keepAliveIntervalSeconds configuration, which defaults to 30s. The channel exchanges ping/pong commands at that interval to confirm the connection is still usable. If a pong is not received within the interval, the channel is closed. On the client side, closing the channel triggers the reconnect logic. On the broker side, the producer information is cleared once the channel becomes inactive.

In #13342, the user had changed the broker-side setting to 100s. When the client decides that the connection has timed out and closes the channel, the close may never reach the broker, because traffic between client and broker passes through a firewall. The client then reconnects to the broker, and the reconnection succeeds. Because the broker's timeout is much larger, the previous channel is still open on the broker and its producer information has not been cleared. When the producer re-registers over the new connection, the broker throws the "producer already exists" exception. This is the cause of #13342, and the issue can be reproduced by tweaking the code.
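The sequence above can be illustrated with a small toy model. This is only a sketch and not Pulsar code: `ToyBroker`, `registerProducer`, and `reconnectSucceeds` are hypothetical names, and the expiry rule is simplified to "drop a connection once it has been silent for one full keep-alive interval".

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the orphaned-producer scenario; all names are hypothetical. */
final class OrphanedProducerSketch {

    /** Minimal stand-in for a broker that tracks producers per connection. */
    static final class ToyBroker {
        private final long keepAliveSeconds;
        // producer name -> id of the connection that owns it
        private final Map<String, Integer> producers = new HashMap<>();
        // connection id -> time (in seconds) of the last traffic seen on it
        private final Map<Integer, Long> lastSeen = new HashMap<>();

        ToyBroker(long keepAliveSeconds) {
            this.keepAliveSeconds = keepAliveSeconds;
        }

        /** Drops connections silent for a full interval, then their producers. */
        void expire(long now) {
            lastSeen.entrySet().removeIf(e -> now - e.getValue() >= keepAliveSeconds);
            producers.values().removeIf(conn -> !lastSeen.containsKey(conn));
        }

        /** Returns false when another live connection already owns the name. */
        boolean registerProducer(String name, int connId, long now) {
            expire(now);
            lastSeen.put(connId, now);
            Integer owner = producers.get(name);
            if (owner != null && owner != connId) {
                return false; // corresponds to "producer already exists"
            }
            producers.put(name, connId);
            return true;
        }
    }

    /**
     * The link dies silently at t=0. The client gives up after its own
     * interval and reconnects; its close never reaches the broker.
     */
    static boolean reconnectSucceeds(long clientKeepAlive, long brokerKeepAlive) {
        ToyBroker broker = new ToyBroker(brokerKeepAlive);
        broker.registerProducer("p1", 1, 0);
        long reconnectAt = clientKeepAlive;
        return broker.registerProducer("p1", 2, reconnectAt);
    }
}
```

With a 30s client interval and a 100s broker interval, `reconnectSucceeds(30, 100)` returns false in this model, because the old connection still owns the producer name. With matching 30s intervals it returns true.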

What I want to discuss is whether we can improve this by configuring the value only on the client side and passing it to the broker in the connect command. The advantage is that the broker-side configuration can be dropped in favor of the client-supplied value, and different clients can configure different values.

API Changes

Add keep_alive_interval in CommandConnect:

message CommandConnect {
     ...
     optional int32 keep_alive_interval = 11 [default = 30];
}
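
As a rough illustration of how a broker might interpret the new field, the sketch below falls back to 30 seconds when a client does not send it, mirroring the `[default = 30]` in the proposal. The class and method names are hypothetical, and `OptionalInt` stands in for the generated protobuf accessors.

```java
import java.util.OptionalInt;

/** Hypothetical helper; not part of Pulsar. */
final class KeepAliveIntervalSketch {
    // Mirrors the proposed protobuf default for keep_alive_interval.
    static final int DEFAULT_KEEP_ALIVE_SECONDS = 30;

    /** Uses the client's advertised interval, or the default if the field is absent. */
    static int effectiveIntervalSeconds(OptionalInt advertisedByClient) {
        return advertisedByClient.orElse(DEFAULT_KEEP_ALIVE_SECONDS);
    }
}
```

A client that omits the field would therefore be treated as using a 30s interval in this sketch.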

Currently, the keep-alive check lives in PulsarHandler#handleKeepAliveTimeout and starts when the channel becomes active.

For broker-side:

  • With this change, the check would start in ServerCnx#completeConnect

For client-side:

  • With this change, the check would start in ClientCnx#handleConnected
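
The idea of starting the keep-alive timer only after the connect handshake, once the interval is known, could look roughly like the following. This is a sketch under assumptions, not the Pulsar implementation: `DeferredKeepAliveSketch`, `onChannelActive`, and `onConnectCompleted` are hypothetical names, and the actual ping/pong check is left as a placeholder.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical connection handler that defers its keep-alive timer. */
final class DeferredKeepAliveSketch implements AutoCloseable {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
    private ScheduledFuture<?> keepAliveTask;
    private long intervalSeconds = -1;

    /** Channel is up, but the interval is not known yet, so no timer starts here. */
    void onChannelActive() {
        // Intentionally empty in this sketch.
    }

    /** Called once the connect handshake has supplied the interval. */
    void onConnectCompleted(long negotiatedIntervalSeconds) {
        intervalSeconds = negotiatedIntervalSeconds;
        keepAliveTask = timer.scheduleAtFixedRate(
                this::checkKeepAlive,
                negotiatedIntervalSeconds,
                negotiatedIntervalSeconds,
                TimeUnit.SECONDS);
    }

    private void checkKeepAlive() {
        // Placeholder: a real handler would send a ping or close the channel
        // if the previous ping was never answered.
    }

    boolean isKeepAliveScheduled() {
        return keepAliveTask != null && !keepAliveTask.isCancelled();
    }

    long intervalSeconds() {
        return intervalSeconds;
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

In this sketch, no timer exists between channel activation and the completed handshake, so a peer that never finishes the handshake would need a separate timeout.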

Compatibility

No compatibility issues.

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 10 (9 by maintainers)

Top GitHub Comments

2 reactions
lhotari commented, Jun 2, 2022

Why should they be separated? This value applies per channel, according to the client configuration. Is there any problem with that? Why configure it separately?

Because there isn’t any problem that needs to be solved. Making the broker’s keep-alive interval match the client’s doesn’t solve a real problem. The broker-side keep-alive interval should stay at 30 seconds, and there’s nothing to change. We could simply document that if you increase the keep-alive interval, there’s a risk that connections remain orphaned until the keep-alive check runs.

0 reactions
github-actions[bot] commented, Jul 5, 2022

The issue had no activity for 30 days and was marked with the Stale label.
