Thread BLOCKED issues of DestinationCache in DefaultSubscriptionRegistry

We are using spring-messaging to implement a STOMP server (using SimpleBrokerMessageHandler). Each client subscribes to 5 channels, and everything works when there are only a few users. However, once the number of online users exceeds ~700, the WebSocket channel stops responding.

After analysis, I found that many threads were BLOCKED on DestinationCache, as follows:

"pk-ws-worker-100-thread-78" #560 prio=5 os_prio=0 tid=0x00007ff19c182000 nid=0x1252 waiting for monitor entry [0x00007ff0a8d9f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.springframework.messaging.simp.broker.DefaultSubscriptionRegistry$DestinationCache.getSubscriptions(DefaultSubscriptionRegistry.java:269)
        - waiting to lock <0x00000004c007ec20> (a org.springframework.messaging.simp.broker.DefaultSubscriptionRegistry$DestinationCache$1)
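
A thread dump like the one above can also be gathered from inside the JVM. The following is a minimal sketch, assuming the standard `java.lang.management` API is available; the class name `BlockedThreadReport` and the demo thread names are made up for illustration and are not part of Spring.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper: lists threads that are BLOCKED on an object monitor.
class BlockedThreadReport {

    /** Returns one line per thread currently in state BLOCKED. */
    static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        // (false, false): do not collect locked monitors / ownable synchronizers
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                lines.add(info.getThreadName() + " waiting for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return lines;
    }

    /** Demo: one thread holds a monitor while a second thread blocks on it. */
    static List<String> demo() {
        final Object monitor = new Object();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread owner = new Thread(() -> {
            synchronized (monitor) {
                held.countDown();
                try {
                    release.await(); // keep the monitor until the report is taken
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "demo-owner");
        Thread waiter = new Thread(() -> {
            synchronized (monitor) {
                // nothing to do; this thread only contends for the monitor
            }
        }, "demo-waiter");
        try {
            owner.start();
            held.await();
            waiter.start();
            while (waiter.getState() != Thread.State.BLOCKED) {
                Thread.sleep(1);
            }
            return blockedThreads();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let both demo threads finish
        }
    }
}
```

In a real server, `blockedThreads()` would be called from a diagnostic endpoint or a scheduled task; many entries pointing at the same lock owner suggest the kind of contention shown in the dump.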

The relevant part of the code is as follows:

public LinkedMultiValueMap<String, String> getSubscriptions(String destination, Message<?> message) {
  LinkedMultiValueMap<String, String> result = this.accessCache.get(destination);
  if (result == null) {
    synchronized (this.updateCache) {
      result = new LinkedMultiValueMap<>();
      for (SessionSubscriptionInfo info : subscriptionRegistry.getAllSubscriptions()) {
        for (String destinationPattern : info.getDestinations()) {
          if (getPathMatcher().match(destinationPattern, destination)) {
            for (Subscription sub : info.getSubscriptions(destinationPattern)) {
              result.add(info.sessionId, sub.getId());
            }
          }
        }
      }
      if (!result.isEmpty()) {
        this.updateCache.put(destination, result.deepCopy());
        this.accessCache.put(destination, result);
      }
    }
  }
  return result;
}

As you can see, the code inside the synchronized block traverses all subscriptions, which takes too long and blocks other threads.

Also, accessCache / updateCache does not help if the client has not yet subscribed successfully: an empty result is never cached, so every lookup for such a destination re-enters the synchronized block, which makes the situation worse.
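
The effect of the `!result.isEmpty()` guard can be reproduced without Spring. This is a simplified sketch, not the Spring implementation: `NonEmptyOnlyCache`, `lookup`, and `slowPathCalls` are hypothetical names, and a plain map stands in for the subscription registry.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a cache that only stores non-empty results.
class NonEmptyOnlyCache {

    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private final Map<String, List<String>> subscribers; // destination -> subscription ids
    final AtomicInteger slowPathCalls = new AtomicInteger(); // entries into the locked section

    NonEmptyOnlyCache(Map<String, List<String>> subscribers) {
        this.subscribers = subscribers;
    }

    List<String> lookup(String destination) {
        List<String> result = cache.get(destination);
        if (result == null) {
            synchronized (this) { // stands in for synchronized (this.updateCache)
                slowPathCalls.incrementAndGet();
                result = subscribers.getOrDefault(destination, Collections.<String>emptyList());
                if (!result.isEmpty()) { // same guard as in the quoted code
                    cache.put(destination, result);
                }
            }
        }
        return result;
    }
}
```

With one subscribed destination and one unsubscribed destination, repeated lookups of the first take the locked path once, while every lookup of the second takes it again.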

We tried increasing the cache limit, but it didn't help in our case.

To solve the problem, we removed the DestinationCache and reimplemented it as a map of destination -> <sessionId -> subsId> inside SessionSubscriptionRegistry (in our own codebase, of course).
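
The poster's implementation is not shown, so the following is only a sketch of what such an index could look like. It assumes exact-match destinations (no pattern support), and `DestinationIndex` with its methods is a hypothetical name. Each update publishes a fresh copy of the inner map, so readers never see a map that is being modified.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical index: destination -> (sessionId -> subscription ids).
class DestinationIndex {

    private final ConcurrentHashMap<String, Map<String, Set<String>>> index = new ConcurrentHashMap<>();

    void add(String destination, String sessionId, String subscriptionId) {
        // compute() runs atomically per destination key
        index.compute(destination, (d, old) -> {
            Map<String, Set<String>> copy = deepCopy(old);
            copy.computeIfAbsent(sessionId, s -> new LinkedHashSet<>()).add(subscriptionId);
            return copy;
        });
    }

    void remove(String destination, String sessionId, String subscriptionId) {
        index.computeIfPresent(destination, (d, old) -> {
            Map<String, Set<String>> copy = deepCopy(old);
            Set<String> ids = copy.get(sessionId);
            if (ids != null) {
                ids.remove(subscriptionId);
                if (ids.isEmpty()) {
                    copy.remove(sessionId);
                }
            }
            return copy.isEmpty() ? null : copy; // returning null drops the destination entry
        });
    }

    /** Lock-free read; the returned view must be treated as read-only. */
    Map<String, Set<String>> find(String destination) {
        Map<String, Set<String>> found = index.get(destination);
        return found == null
                ? Collections.<String, Set<String>>emptyMap()
                : Collections.unmodifiableMap(found);
    }

    private static Map<String, Set<String>> deepCopy(Map<String, Set<String>> src) {
        Map<String, Set<String>> copy = new HashMap<>();
        if (src != null) {
            src.forEach((session, ids) -> copy.put(session, new LinkedHashSet<>(ids)));
        }
        return copy;
    }
}
```

Copying on every write trades subscribe/unsubscribe cost for cheap lookups, which fits a workload where messages are sent far more often than subscriptions change; that workload is an assumption, not something the issue states.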

After these changes, the server can handle more than 5K online users with no problem.

Meanwhile, I noticed that DefaultSubscriptionRegistry and DestinationCache have been there for many years.

So I wonder: is it OK to open a PR? Or is the existing DestinationCache designed this way for some other reason?

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

1 reaction
trim09 commented, Mar 26, 2020

We observed the same. Threads were BLOCKED on synchronized blocks in DefaultSubscriptionRegistry, causing a bottleneck. We were lucky that we had no subscription pattern matching, so we were able to solve it by reimplementing DefaultSubscriptionRegistry. We used two ConcurrentMaps and their compute*() methods:

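import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.lang.NonNull;
import org.springframework.messaging.Message;
import org.springframework.messaging.simp.broker.AbstractSubscriptionRegistry;
import org.springframework.util.CollectionUtils;
import org.springframework.util.LinkedMultiValueMap;
import org.springframework.util.MultiValueMap;

public class CustomSubscriptionRegistry extends AbstractSubscriptionRegistry {

    // Assumption: the original snippet does not show where `log` or @NonNull come from;
    // an SLF4J logger and Spring's @NonNull are used here.
    private static final Logger log = LoggerFactory.getLogger(CustomSubscriptionRegistry.class);

    private static final MultiValueMap<String, String> EMPTY_MAP = CollectionUtils.unmodifiableMultiValueMap(new LinkedMultiValueMap<>());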

    // sessionId -> (subscriptionId -> destination)
    private final ConcurrentMap<String, Map<String, String>> sessions = new ConcurrentHashMap<>();

    // destination -> (session id -> List of subscriptionIds)
    private final ConcurrentMap<String, MultiValueMap<String, String>> destinationLookup = new ConcurrentHashMap<>();

    @Override
    protected void addSubscriptionInternal(@NonNull String sessionId, @NonNull String subscriptionId,
                                           @NonNull String destination, @NonNull Message<?> message) {
        sessions.compute(sessionId, (s, map) -> {
            if (map == null) {
                map = new HashMap<>();
            }
            map.put(subscriptionId, destination);
            addToDestinationLookup(sessionId, subscriptionId, destination);
            return map;
        });
    }

    @Override
    protected void removeSubscriptionInternal(@NonNull String sessionId, @NonNull String subscriptionId, @NonNull Message<?> message) {
        sessions.computeIfPresent(sessionId, (s, map) -> {
            String destination = map.remove(subscriptionId);

            if (destination != null) {
                removeFromDestinationLookup(sessionId, subscriptionId, destination);
            } else {
                log.trace("Could not remove websocket subscription. Subscription '{}' was not found for session '{}'",
                    subscriptionId, sessionId);
            }

            return emptyMapToNull(map);
        });
    }

    @Override
    public void unregisterAllSubscriptions(@NonNull String sessionId) {
        Map<String, String> map = sessions.remove(sessionId);

        if (map == null) {
            log.error("Could not unregister websocket session. Session '{}' was not found.", sessionId);
            return;
        }

        map.values().forEach(destination ->
            removeFromDestinationLookup(sessionId, destination));
    }

    @Override
    protected @NonNull MultiValueMap<String, String> findSubscriptionsInternal(@NonNull String destination, @NonNull Message<?> message) {
        return destinationLookup.getOrDefault(destination, EMPTY_MAP);
    }

    private void addToDestinationLookup(@NonNull String sessionId, @NonNull String subscriptionId, @NonNull String destination) {
        destinationLookup.compute(destination, (s, map) -> {
            if (map == null) {
                map = new LinkedMultiValueMap<>();
            }
            map.add(sessionId, subscriptionId);
            return map;
        });
    }

    private void removeFromDestinationLookup(@NonNull String sessionId, @NonNull String subscriptionId, String destination) {
        destinationLookup.computeIfPresent(destination, (dest, map) -> {
            map.computeIfPresent(sessionId, (s, subscriptions) -> {
                subscriptions.remove(subscriptionId);
                if (subscriptions.isEmpty()) {
                    return null;
                } else {
                    return subscriptions;
                }
            });

            return emptyMapToNull(map);
        });
    }

    private void removeFromDestinationLookup(@NonNull String sessionId, String destination) {
        destinationLookup.computeIfPresent(destination, (dest, map) -> {
            map.remove(sessionId);
            return emptyMapToNull(map);
        });
    }

    private <V, K, T extends Map<V, K>> T emptyMapToNull(T map) {
        return map.isEmpty() ? null : map;
    }
}
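
The registry above relies on `ConcurrentMap.compute` being applied atomically per key, which is what makes a plain `HashMap` or `LinkedMultiValueMap` value safe to mutate inside the remapping function. Below is a small standalone sketch of that property, using only the JDK; `ComputeAtomicityDemo` and the destination string are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: many threads append to a non-thread-safe list, but only inside compute().
class ComputeAtomicityDemo {

    /** Returns the final list size after all threads have finished adding. */
    static int run(int threads, int addsPerThread) {
        ConcurrentMap<String, List<Integer>> lookup = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < addsPerThread; i++) {
                    final int value = i;
                    // The remapping function runs atomically for this key,
                    // so the ArrayList is never mutated by two threads at once.
                    lookup.compute("/topic/demo", (dest, list) -> {
                        if (list == null) {
                            list = new ArrayList<>();
                        }
                        list.add(value);
                        return list;
                    });
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // All writers are done, so reading the list here is safe.
        return lookup.get("/topic/demo").size();
    }
}
```

Note that this guarantee covers only code running inside `compute`. A reader that fetches the value with `get` or `getOrDefault` and iterates it while writers are active is not protected by it.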

And registered it like this:

@Configuration
public class CustomBrokerMessageHandlerConfiguration extends DelegatingWebSocketMessageBrokerConfiguration {

    @Override
    @Bean
    public AbstractBrokerMessageHandler simpleBrokerMessageHandler() {
        SimpleBrokerMessageHandler handler = (SimpleBrokerMessageHandler) super.simpleBrokerMessageHandler();

        if (handler != null) {
            handler.setSubscriptionRegistry(new CustomSubscriptionRegistry());
        }

        return handler;
    }
}

It more than doubled the performance of a simple broker.

0 reactions
alienisty commented, May 27, 2021

Apparently you can have multiple subscriptions with the same subscription ID, and this change breaks that requirement. Specifically, UserDestinationMessageHandler reuses the same topic subscription message to create the session-specific subscription to that topic, which means that whichever thread inserts its subscription first wins.
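
The collision comes from keying a per-session map by subscription ID alone. The sketch below illustrates this with plain JDK maps; the class name and both destination strings are hypothetical and do not reflect how Spring actually names translated user destinations.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of two subscriptions sharing one subscription id.
class DuplicateSubscriptionIdDemo {

    /** subscriptionId -> destination: a second put with the same id replaces the first. */
    static Map<String, String> singleValued() {
        Map<String, String> subs = new HashMap<>();
        subs.put("sub-0", "/user/queue/updates");      // hypothetical original destination
        subs.put("sub-0", "/queue/updates-session-1"); // hypothetical translated destination
        return subs;
    }

    /** subscriptionId -> all destinations registered under that id. */
    static Map<String, List<String>> multiValued() {
        Map<String, List<String>> subs = new HashMap<>();
        subs.computeIfAbsent("sub-0", id -> new ArrayList<>()).add("/user/queue/updates");
        subs.computeIfAbsent("sub-0", id -> new ArrayList<>()).add("/queue/updates-session-1");
        return subs;
    }
}
```

With the single-valued map only one destination survives per ID, and under concurrency which one survives depends on insertion order. Keeping a list of destinations per ID is one possible way to retain both, shown here only to make the difference visible.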
