question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Log compaction fails due to timeout

See original GitHub issue

Describe the bug Log compaction fails due to a timeout.

To Reproduce

  1. Start pulsar standalone
  2. Create producer
  3. Produce messages at high throughput
  4. Trigger topic compaction
  5. Read compaction status

Detailed steps to reproduce:

  1. Clear all state
# rm -r data logs # !!! careful !!! this wipes all your topics etc.
  1. Start pulsar standalone
bin/pulsar standalone
  1. Create producer
PulsarClient pulsarClient = PulsarClient.builder()
    .serviceUrl("pulsar://192.168.11.22:6650")
    .build();
Producer<byte[]> producer = pulsarClient.newProducer()
    .topic("my-topic")
    .create();
  1. Produce messages at high throughput
int msgCount = 500000;
long before = System.currentTimeMillis();
for (int i = 0; i < msgCount; i++) {
    producer.newMessage()
            .key("asdf")
            .value(new byte[] {1, 2, 3, 4, 5})
            .sendAsync();
}
System.out.println("Sent " + msgCount + " messages in " + (System.currentTimeMillis() - before) + " msec");
  1. Trigger topic compaction
$ bin/pulsar-admin topics compact persistent://public/default/my-topic
Topic compaction requested for persistent://public/default/my-topic
  1. Read compaction status
$ bin/pulsar-admin topics compaction-status persistent://public/default/my-topic
Compaction is currently running

$ # WAIT FOR 10 SECONDS

$ bin/pulsar-admin topics compaction-status persistent://public/default/my-topic
Error in compaction
null

Reason: Error compacting: java.util.concurrent.TimeoutException: Timeout

Expected behavior Compaction and no error

Desktop (please complete the following information):

  • Pulsar version: 2.5.0 (tar.gz from website)
  • OS: Linux 4.19.0-4-amd64 #1 SMP Debian 4.19.28-2 (2019-03-15) x86_64 unknown unknown GNU/Linux

Logs The complete logs from the start of compaction until the error was read via compaction status

09:47:54.409 [pulsar-web-57-8] INFO  org.apache.pulsar.client.impl.ConsumerStatsRecorderImpl - Starting Pulsar consumer status recorder with config: {
  "topicNames" : [ "persistent://public/default/my-topic" ],
  "topicsPattern" : null,
  "subscriptionName" : "__compaction",
  "subscriptionType" : "Exclusive",
  "receiverQueueSize" : 1000,
  "acknowledgementsGroupTimeMicros" : 100000,
  "negativeAckRedeliveryDelayMicros" : 60000000,
  "maxTotalReceiverQueueSizeAcrossPartitions" : 50000,
  "consumerName" : null,
  "ackTimeoutMillis" : 0,
  "tickDurationMillis" : 1000,
  "priorityLevel" : 0,
  "cryptoFailureAction" : "FAIL",
  "properties" : { },
  "readCompacted" : true,
  "subscriptionInitialPosition" : "Latest",
  "patternAutoDiscoveryPeriod" : 1,
  "regexSubscriptionMode" : "PersistentOnly",
  "deadLetterPolicy" : null,
  "autoUpdatePartitions" : true,
  "replicateSubscriptionState" : false,
  "resetIncludeHead" : false,
  "keySharedPolicy" : null
}
09:47:54.410 [pulsar-web-57-8] INFO  org.apache.pulsar.client.impl.ConsumerStatsRecorderImpl - Pulsar client config: {
  "serviceUrl" : "pulsar://localhost:6650",
  "authPluginClassName" : null,
  "authParams" : null,
  "operationTimeoutMs" : 30000,
  "statsIntervalSeconds" : 60,
  "numIoThreads" : 1,
  "numListenerThreads" : 1,
  "connectionsPerBroker" : 1,
  "useTcpNoDelay" : true,
  "useTls" : false,
  "tlsTrustCertsFilePath" : null,
  "tlsAllowInsecureConnection" : false,
  "tlsHostnameVerificationEnable" : false,
  "concurrentLookupRequest" : 5000,
  "maxLookupRequest" : 50000,
  "maxNumberOfRejectedRequestPerConnection" : 50,
  "keepAliveIntervalSeconds" : 30,
  "connectionTimeoutMs" : 10000,
  "requestTimeoutMs" : 60000,
  "initialBackoffIntervalNanos" : 100000000,
  "maxBackoffIntervalNanos" : 60000000000
}
09:47:54.411 [pulsar-web-57-8] INFO  org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [31/1/2020:09:47:54 +0900] "PUT /admin/v2/persistent/public/default/my-topic/compaction HTTP/1.1" 204 0 "-" "Pulsar-Java-v2.5.0" 6
09:47:54.411 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Subscribing to topic on cnx [id: 0xeb973738, L:/127.0.0.1:60180 - R:localhost/127.0.0.1:6650]
09:47:54.412 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] Subscribing on topic persistent://public/default/my-topic / __compaction
09:47:54.412 [pulsar-io-50-7] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [public/default/persistent/my-topic-__compaction] Rewind from 1544:100 to 1544:0
09:47:54.412 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic] There are no replicated subscriptions on the topic
09:47:54.412 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic][__compaction] Created new subscription for 1
09:47:54.412 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] Created subscription on topic persistent://public/default/my-topic / __compaction
09:47:54.413 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Subscribed to topic on localhost/127.0.0.1:6650 -- consumer: 1
09:47:54.413 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Seek subscription to message id -1:-1:-1
09:47:54.415 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.Consumer - Disconnecting consumer: Consumer{subscription=CompactorSubscription{topic=persistent://public/default/my-topic, name=__compaction}, consumerId=1, consumerName=b411f, address=/127.0.0.1:60180}
09:47:54.416 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.AbstractDispatcherSingleActiveConsumer - Removing consumer Consumer{subscription=CompactorSubscription{topic=persistent://public/default/my-topic, name=__compaction}, consumerId=1, consumerName=b411f, address=/127.0.0.1:60180}
09:47:54.416 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ClientCnx - [localhost/127.0.0.1:6650] Broker notification of Closed consumer: 1
09:47:54.416 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/my-topic] [__compaction] Closed connection [id: 0xeb973738, L:/127.0.0.1:60180 - R:localhost/127.0.0.1:6650] -- Will try again in 0.1 s
09:47:54.416 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.persistent.PersistentSubscription - [persistent://public/default/my-topic][__compaction] Successfully disconnected consumers from subscription, proceeding with cursor reset
09:47:54.416 [bookkeeper-ml-workers-OrderedExecutor-2-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [public/default/persistent/my-topic] Initiate reset position to 1544:-1 on cursor __compaction
09:47:54.424 [BookKeeperClientWorker-OrderedExecutor-1-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [public/default/persistent/my-topic] reset position to 1544:-1 before current read position 1544:0 on cursor __compaction
09:47:54.425 [BookKeeperClientWorker-OrderedExecutor-1-0] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] [persistent://public/default/my-topic][__compaction] Reset subscription to message id -1:-1
09:47:54.426 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Successfully reset subscription to message id -1:-1:-1
09:47:54.516 [pulsar-timer-87-1] INFO  org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/my-topic] [__compaction] Reconnecting after timeout
09:47:54.517 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Subscribing to topic on cnx [id: 0xeb973738, L:/127.0.0.1:60180 - R:localhost/127.0.0.1:6650]
09:47:54.517 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] Subscribing on topic persistent://public/default/my-topic / __compaction
09:47:54.518 [pulsar-io-50-7] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [public/default/persistent/my-topic-__compaction] Rewind from 1544:-1 to 1544:0
09:47:54.518 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic] There are no replicated subscriptions on the topic
09:47:54.518 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic][__compaction] Created new subscription for 1
09:47:54.518 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] Created subscription on topic persistent://public/default/my-topic / __compaction
09:47:54.518 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Subscribed to topic on localhost/127.0.0.1:6650 -- consumer: 1
09:47:54.526 [pulsar-external-listener-86-1] WARN  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic] [__compaction] Could not get connection while getLastMessageId -- Will try again in 100 ms
09:47:54.527 [pulsar-external-listener-86-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Get topic last message Id
09:47:54.527 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Successfully getLastMessageId 1544:37344
09:47:54.527 [pulsar-client-io-85-1] INFO  org.apache.pulsar.compaction.TwoPhaseCompactor - Commencing phase one of compaction for persistent://public/default/my-topic, reading to 1544:37344:-1
09:47:57.416 [pulsar-web-57-12] INFO  org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [31/1/2020:09:47:57 +0900] "GET /admin/v2/persistent/public/functions/coordinate/stats HTTP/1.1" 200 887 "-" "Pulsar-Java-v2.5.0" 5
09:48:02.838 [pulsar-web-57-1] INFO  org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [31/1/2020:09:48:02 +0900] "GET /admin/v2/persistent/public/default/my-topic/compaction HTTP/1.1" 200 35 "-" "Pulsar-Java-v2.5.0" 2
09:48:04.646 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] Closing consumer: 1
09:48:04.646 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.AbstractDispatcherSingleActiveConsumer - Removing consumer Consumer{subscription=CompactorSubscription{topic=persistent://public/default/my-topic, name=__compaction}, consumerId=1, consumerName=b411f, address=/127.0.0.1:60180}
09:48:04.646 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] Closed consumer Consumer{subscription=CompactorSubscription{topic=persistent://public/default/my-topic, name=__compaction}, consumerId=1, consumerName=b411f, address=/127.0.0.1:60180}
09:48:04.646 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic] [__compaction] Closed consumer
09:48:05.941 [pulsar-web-57-3] INFO  org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [31/1/2020:09:48:05 +0900] "GET /admin/v2/persistent/public/default/my-topic/compaction HTTP/1.1" 200 79 "-" "Pulsar-Java-v2.5.0" 2
09:48:27.417 [pulsar-web-57-6] INFO  org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [31/1/2020:09:48:27 +0900] "GET /admin/v2/persistent/public/functions/coordinate/stats HTTP/1.1" 200 887 "-" "Pulsar-Java-v2.5.0" 5

Issue Analytics

  • State:closed
  • Created 4 years ago
  • Comments:15 (12 by maintainers)

github_iconTop GitHub Comments

1reaction
KannarFrcommented, Feb 5, 2020

@sijie You are right, I will write an issue.

1reaction
pienio7commented, Jan 31, 2020
Read more comments on GitHub >

github_iconTop Results From Across the Web

Kafka Log Compaction not starting - Stack Overflow
Records won't get compacted until after this period. The setting gives consumers time to get every record.
Read more >
test/format compaction doesn't handle timeout error return
mahajan, the problem here is there's a default timeout for compaction of 1200 seconds. The test/format program wasn't configuring a compaction ...
Read more >
Back up and Compaction of LQE database fails or take a long ...
Attempts to back up or compact the Lifecycle Query Engine (LQE) in IBM Jazz Reporting Service (JRS) is failing and no error is...
Read more >
Kafka 3.3 Documentation
The log helps replicate data between nodes and acts as a re-syncing mechanism for failed nodes to restore their data. The log compaction...
Read more >
Slowness in the UI loading or recommendation jobs failing ...
Druid compaction job failures for context tables in Druid log files. ... Recommendation jobs may timeout or fail due to worse query ...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found