question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

sometimes internalDeleteTopicForcefully will block forever

See original GitHub issue

https://github.com/apache/pulsar/blob/d1fb88acace2ceefc234458d2c0d93f2844c7a54/pulsar-broker/src/main/java/org/apache/pulsar/broker/admin/impl/PersistentTopicsBase.java#L312

"pulsar-web-44-4" #148 prio=5 os_prio=0 tid=0x00007fcbf7f89000 nid=0x6d97 waiting on condition [0x00007fcabc99e000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x000000073d14e618> (a java.util.concurrent.CompletableFuture$Signaller)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
	at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
	at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
	at org.apache.pulsar.broker.admin.impl.PersistentTopicsBase.internalDeleteTopicForcefully(PersistentTopicsBase.java:379)
	at org.apache.pulsar.broker.admin.impl.PersistentTopicsBase.internalDeleteTopic(PersistentTopicsBase.java:961)
	at org.apache.pulsar.broker.admin.v2.PersistentTopics.deleteTopic(PersistentTopics.java:889)
	at sun.reflect.GeneratedMethodAccessor241.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52)
	at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$$Lambda$419/1877964230.invoke(Unknown Source)
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:124)
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:167)
	at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$VoidOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:159)
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79)
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:475)
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:397)
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81)
	at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:255)
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
	at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
	at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
	at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
	at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:234)
	at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:680)
	at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:394)
	at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:346)
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:366)
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:319)
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
	at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1631)
	at org.apache.pulsar.broker.web.VpcLookupFilter.doFilter(VpcLookupFilter.java:148)
	at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
	at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
	at org.apache.pulsar.broker.web.ResponseHandlerFilter.doFilter(ResponseHandlerFilter.java:66)
	at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
	at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
	at org.apache.pulsar.broker.web.AuthenticationFilter.doFilter(AuthenticationFilter.java:82)
	at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
	at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
	at org.apache.pulsar.broker.web.PreInterceptFilter.doFilter(PreInterceptFilter.java:68)
	at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
	at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1434)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1349)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:234)
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
	at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:179)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
	at org.eclipse.jetty.server.Server.handle(Server.java:516)
	at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:400)
	at org.eclipse.jetty.server.HttpChannel$$Lambda$537/2098987772.dispatch(Unknown Source)
	at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:645)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:392)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
	at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:748)

in our scene, when we dump the thread many times , we can see this will wait forever

Issue Analytics

  • State:closed
  • Created 2 years ago
  • Comments:12 (12 by maintainers)

github_iconTop GitHub Comments

2reactions
eolivellicommented, Feb 25, 2022

good work @michaeljmarshall

We should not call ‘join’ and chain the async functions calls to persistentTopicExists.

I agree that we should fix this.

0reactions
leizhiyuancommented, Mar 1, 2022

@leizhiyuan - after looking through your thread dump a bit more, I am wondering if you have a separate issue in your delayed message implementation? I see that thread is blocked on some other thread that is completing a lookup call (that lookup call has a timeout on it, so that thread isn’t indefinitely blocked). Is there any chance that the "DM-sync-delay-index" thread being blocked is causing an issue here?

"DM-sync-delay-index" #223 daemon prio=5 os_prio=0 tid=0x00007fcbd8023800 nid=0x7cb5 waiting for monitor entry [0x00007fcaa193f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1868)
        - waiting to lock <0x0000000760a001d8> (a java.util.concurrent.ConcurrentHashMap$ReservationNode)
        at com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2037)
        at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2020)
        at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:112)
        at com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:67)
        at com.tencent.tdmq.delaymessage.inner.MetadataCache.isDelayTenantOwner(MetadataCache.java:151)
        at com.tencent.tdmq.delaymessage.inner.SyncIndexService.syncTenant(SyncIndexService.java:65)
        at com.tencent.tdmq.delaymessage.inner.SyncIndexService.syncAllTenants(SyncIndexService.java:52)
        at com.tencent.tdmq.delaymessage.inner.SyncIndexService.lambda$new$1(SyncIndexService.java:41)
        at com.tencent.tdmq.delaymessage.inner.SyncIndexService$$Lambda$832/1549413052.run(Unknown Source)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
"DM-clear-thread" #227 daemon prio=5 os_prio=0 tid=0x00007fcbd8035800 nid=0x7cb9 waiting on condition [0x00007fcaa153a000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000760a08240> (a java.util.concurrent.CompletableFuture$Signaller)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1695)
        at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
        at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1775)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
        at org.apache.pulsar.client.admin.internal.LookupImpl.lookupTopic(LookupImpl.java:56)
        at com.tencent.tdmq.delaymessage.inner.MetadataCache.lookup(MetadataCache.java:166)
        at com.tencent.tdmq.delaymessage.inner.MetadataCache$$Lambda$840/1429314972.load(Unknown Source)
        at com.github.benmanes.caffeine.cache.BoundedLocalCache$BoundedLocalLoadingCache.lambda$new$0(BoundedLocalCache.java:3366)
        at com.github.benmanes.caffeine.cache.BoundedLocalCache$BoundedLocalLoadingCache$$Lambda$839/2030879142.apply(Unknown Source)
        at com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2039)
        at com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$88/947462790.apply(Unknown Source)
        at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
        - locked <0x0000000760a001d8> (a java.util.concurrent.ConcurrentHashMap$ReservationNode)
        at com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2037)
        at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2020)
        at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:112)
        at com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:67)
        at com.tencent.tdmq.delaymessage.inner.MetadataCache.isDelayTenantOwner(MetadataCache.java:151)
        at com.tencent.tdmq.delaymessage.inner.ClearTask.check(ClearTask.java:67)
        at com.tencent.tdmq.delaymessage.inner.ClearTask$$Lambda$846/1309049241.run(Unknown Source)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Thanks for your help, but we analyzed this thread, DM-sync-delay-index will not affect the lookup process. It only uses cache to lookup if a topic exists, if cache missed ,It will lookup use PulsarService

image

when cache missed, we do this. image

Read more comments on GitHub >

github_iconTop Results From Across the Web

When/how does a topic "marked for deletion" get finally ...
After being patient for a week, I assume the answer in the linked comment is in fact correct: If "delete.topic.enable = true" wasn't...
Read more >
Topic Deletion · The Internals of Apache Kafka
Topic deletion is controlled by delete.topic.enable Kafka property that turns it on when true . Start a Kafka broker with broker ID 100...
Read more >
Manually delete Apache Kafka topics | by Sunny Srinidhi
After you issue the delete command, the topic will be “marked for deletion,' and you'll have to wait till it gets deleted. Sometimes,...
Read more >
Attempting to delete a topic which has recently been created ...
Topic deletion should succeed if previous topic creation succeeded, no matter how recently a topic was created. This is probably more important: all...
Read more >
Unable to Delete a Project or Namespace in OCP
In many cases, deleting resources that could not be deleted in the first place ultimately leads to project to be no longer stuck...
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found