Broker can not respond to client requests

Describe the bug

One of the brokers in our Pulsar cluster suddenly hung. The broker process was still alive, but it apparently could not respond to requests from clients.

At that time, the following errors occurred on the client side:

02:35:08.736 [pulsar-client-io-5-8] WARN  o.a.p.c.i.BinaryProtoLookupService   - [persistent://xxx/global/xxx/xxx-partition-7] failed to send lookup request : org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 9139700 lookup request timedout after ms 30000
java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException$TimeoutException: 9139700 lookup request timedout after ms 30000
02:35:17.739 [pulsar-client-io-5-8] WARN  o.a.pulsar.common.api.PulsarHandler  - [[id: 0x678cbfd2, L:/xxx.xxx.xxx.xxx:36338 - R:f1c2-broker103.pulsar.xxx.yahoo.co.jp/xxx.xxx.xxx.xxx:6651]] Forcing connection to close after keep-alive timeout
02:35:17.748 [pulsar-client-io-5-8] WARN  o.a.pulsar.client.impl.ConsumerImpl  - [persistent://xxx/global/xxx/xxx][sub1] Failed to subscribe to topic on f1c2-broker103.pulsar.xxx.yahoo.co.jp/xxx.xxx.xxx.xxx:6651
02:35:33.951 [pulsar-client-io-5-8] WARN  o.a.pulsar.client.impl.ClientCnx     - Error during handshake
javax.net.ssl.SSLException: handshake timed out
        at io.netty.handler.ssl.SslHandler.handshake(...)(Unknown Source) ~[netty-all-4.1.22.Final.jar:4.1.22.Final]
02:35:33.952 [pulsar-client-io-5-8] WARN  o.a.p.client.impl.ConnectionPool     - [[id: 0x208e5fb2, L:/xxx.xxx.xxx.xxx:60874 ! R:f1c2-broker103.pulsar.xxx.yahoo.co.jp/xxx.xxx.xxx.xxx:6651]] Connection handshake failed: org.apache.pulsar.client.api.PulsarClientException: Connection already closed
02:35:33.952 [pulsar-client-io-5-8] WARN  o.a.p.client.impl.ConnectionHandler  - [persistent://xxx/global/xxx/xxx] [xxx-497-2225404] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: Connection already closed
02:35:33.952 [pulsar-client-io-5-8] WARN  o.a.p.client.impl.ConnectionHandler  - [persistent://xxx/global/xxx/xxx] [xxx-497-2225404] Could not get connection to broker: org.apache.pulsar.client.api.PulsarClientException: Connection already closed -- Will try again in 0.188 s

The broker returned to normal after a restart, and the errors no longer occur on the client side. The load on that broker was not high, so I suspect a bug in the broker code.
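
For anyone trying to confirm the same symptom (process alive, port open, but no answer), a small probe can help tell a dead broker from a hung one. The following is a minimal sketch using only the JDK, not anything from Pulsar itself: the class name `BrokerProbe`, the `Status` values, and the timeout are all hypothetical. It assumes the broker's TLS port (6651 in the logs above) and the JVM's default trust store, so a broker with a private CA will typically report `HANDSHAKE_FAILED`, which still shows that it is answering.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical helper, not part of the Pulsar client API.
final class BrokerProbe {
    enum Status { UNREACHABLE, ACCEPTS_BUT_SILENT, HANDSHAKE_FAILED, RESPONDS }

    static Status probe(String host, int port, int timeoutMs) {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        SSLSocket tls;
        try {
            tls = (SSLSocket) factory.createSocket(); // unconnected socket
        } catch (IOException e) {
            return Status.UNREACHABLE;
        }
        try {
            try {
                tls.setSoTimeout(timeoutMs); // bounds every read during the handshake
                tls.connect(new InetSocketAddress(host, port), timeoutMs);
            } catch (IOException e) {
                // Refused, unroutable, or the TCP connect itself timed out.
                return Status.UNREACHABLE;
            }
            try {
                tls.startHandshake();
                return Status.RESPONDS;
            } catch (IOException e) {
                // TCP connected but the peer never answered the ClientHello:
                // this matches the "handshake timed out" symptom in the logs.
                return causedByTimeout(e) ? Status.ACCEPTS_BUT_SILENT
                                          : Status.HANDSHAKE_FAILED;
            }
        } finally {
            try { tls.close(); } catch (IOException ignored) { }
        }
    }

    // Some JDK versions wrap the read timeout in an SSLException, so walk the cause chain.
    private static boolean causedByTimeout(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SocketTimeoutException) {
                return true;
            }
        }
        return false;
    }
}
```

Calling `BrokerProbe.probe(brokerHost, 6651, 5000)` from a machine near the broker and getting `ACCEPTS_BUT_SILENT` would be consistent with the hang described here: the kernel still accepts connections, but the broker's I/O threads never service them. A thread dump of the broker taken at that moment would be the natural next step.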

Additional context

  • Broker OS: CentOS Linux release 7.6.1810
  • Broker version: 2.2.1
  • Broker spec: Real server / 2.10GHz / 2CPU / 256GBMEM / SATA SSD 240GB x1 / 10G Base-T*2port

Issue Analytics

  • State: open
  • Created 5 years ago
  • Comments: 25 (20 by maintainers)

Top GitHub Comments

sijie commented, Nov 16, 2020 (1 reaction)

@wolfstudy @rdhabalia let’s create a fix on branch-2.6 and create a 2.6.3 release.

kwenzh commented, Mar 24, 2022 (0 reactions)

We are hitting a similar issue: the proxy stops proxying broker connections while Admin API proxying keeps working. Connecting directly to the Pulsar broker works normally, and the proxy returns to normal when I restart pulsar-proxy.

The proxy logs are filled with this type of warnings:

13:13:09.996 [pulsar-proxy-io-2-1] WARN org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0x83e12747, L:/ip:port - R:/ip:port]] Pulsar Handshake was not completed within timeout, closing connection