CircuitBreaker permits more calls than expected when switching from OPEN to HALF_OPEN state

Resilience4j version: 1.7.0

Java version: Java 16

CircuitBreaker permits more calls than expected when switching from OPEN to HALF_OPEN state.

It is most likely caused by CircuitBreakerStateMachine.OpenState#tryAcquirePermission, which returns true while another thread is still performing the OPEN -> HALF_OPEN transition. See this code:

    public boolean tryAcquirePermission() {
        // Thread-safe
        if (clock.instant().isAfter(retryAfterWaitDuration)) {
            toHalfOpenState();
            return true;
        }
        circuitBreakerMetrics.onCallNotPermitted();
        return false;
    }

    private void toHalfOpenState() {
        if (isOpen.compareAndSet(true, false)) {
            transitionToHalfOpenState();
        }
    }

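When another thread has already started the transition, clock.instant().isAfter(retryAfterWaitDuration) is still true, but toHalfOpenState() does nothing because isOpen is already false, and tryAcquirePermission() returns true anyway.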
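
To make the race easier to see, here is a minimal, self-contained sketch. It is a toy model, not Resilience4j code: `ToyOpenState`, `tryAcquireAsReported`, `tryAcquireDelegating` and `countPermitted` are hypothetical names invented for illustration, and the model assumes the wait duration has already elapsed for every caller. The first method mirrors the shape of the snippet above; the second shows one possible alternative, in which callers take a HALF_OPEN permit instead of being let through unconditionally. It is not necessarily how the library fixed the problem.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class OpenStateRaceSketch {

    /** Toy stand-in for the OPEN state (hypothetical, not the Resilience4j class). */
    static final class ToyOpenState {
        private final AtomicBoolean isOpen = new AtomicBoolean(true);
        private final AtomicInteger halfOpenPermits;

        ToyOpenState(int permittedCallsInHalfOpen) {
            this.halfOpenPermits = new AtomicInteger(permittedCallsInHalfOpen);
        }

        /** Shape of the reported code: every caller is let through. */
        boolean tryAcquireAsReported() {
            isOpen.compareAndSet(true, false); // only the first caller "transitions"
            return true;                       // but all callers get a permit
        }

        /** Hypothetical alternative: after the transition, take a HALF_OPEN permit. */
        boolean tryAcquireDelegating() {
            isOpen.compareAndSet(true, false);
            return tryAcquireHalfOpenPermit();
        }

        /** Decrements the permit counter only while it is positive. */
        private boolean tryAcquireHalfOpenPermit() {
            while (true) {
                int current = halfOpenPermits.get();
                if (current <= 0) {
                    return false;
                }
                if (halfOpenPermits.compareAndSet(current, current - 1)) {
                    return true;
                }
            }
        }
    }

    /** Releases all threads at once against one OPEN state and counts permitted calls. */
    static int countPermitted(boolean delegating, int threads, int permits) {
        ToyOpenState state = new ToyOpenState(permits);
        AtomicInteger permitted = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                try {
                    start.await();
                    boolean ok = delegating
                            ? state.tryAcquireDelegating()
                            : state.tryAcquireAsReported();
                    if (ok) {
                        permitted.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        start.countDown();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return permitted.get();
    }

    public static void main(String[] args) {
        System.out.println("as reported: " + countPermitted(false, 10, 1));
        System.out.println("delegating:  " + countPermitted(true, 10, 1));
    }
}
```

With 10 threads and 1 permit, the toy model lets all 10 calls through in the first variant and exactly 1 in the second, whatever the thread interleaving.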

The following program reproduces the problem:

    public static void main(String[] args) throws InterruptedException {
        // Create a custom configuration for a CircuitBreaker
        CircuitBreakerConfig circuitBreakerConfig = CircuitBreakerConfig.custom()
                .failureRateThreshold(50)
                .slowCallRateThreshold(50)
                .waitDurationInOpenState(Duration.ofMillis(10000))
                .slowCallDurationThreshold(Duration.ofSeconds(1))
                .permittedNumberOfCallsInHalfOpenState(1)
                .minimumNumberOfCalls(2)
                .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
                .slidingWindowSize(2)
//                .maxWaitDurationInHalfOpenState()
                .build();

        // Create a CircuitBreakerRegistry with a custom global configuration
        CircuitBreakerRegistry circuitBreakerRegistry =
                CircuitBreakerRegistry.of(circuitBreakerConfig);

        // Get or create a CircuitBreaker from the CircuitBreakerRegistry
        // with the global default configuration
        CircuitBreaker circuitBreakerWithDefaultConfig =
                circuitBreakerRegistry.circuitBreaker("name1");

        Runnable runnable = circuitBreakerWithDefaultConfig.decorateRunnable(() -> {
            log.info("called " + circuitBreakerWithDefaultConfig.getState());
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            //log.info("finished");
        });

        // spawn 10 threads
        for (int i = 0; i < 10; i++) {
            new Thread(() -> {
                while(true) {
                    try {
                        runnable.run();
                    } catch (Exception e) {
                        // ignored: calls are rejected while the circuit breaker is OPEN
                    }
                }
            }).start();
        }


        while (true) {
            //log.info("state {}", circuitBreakerWithDefaultConfig.getState());
            Thread.sleep(500);
        }
    }

It prints, for example:

2021-05-19 20:45:32,189 [Thread-0] INFO com.example.demo.ScheduledService - called CLOSED
2021-05-19 20:45:32,189 [Thread-5] INFO com.example.demo.ScheduledService - called CLOSED
2021-05-19 20:45:32,189 [Thread-6] INFO com.example.demo.ScheduledService - called CLOSED
2021-05-19 20:45:42,199 [Thread-5] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-2] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-1] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-9] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-7] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-8] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-0] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-3] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-6] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-4] INFO com.example.demo.ScheduledService - called HALF_OPEN
2021-05-19 20:45:44,245 [Thread-0] INFO com.example.demo.ScheduledService - called OPEN

As the log shows, the decorated call ran many times while the state was OPEN. I would expect at most one call, made in the HALF_OPEN state (because of .permittedNumberOfCallsInHalfOpenState(1)).

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 10 (6 by maintainers)

Top GitHub Comments

RobWin commented, Jun 25, 2021 (1 reaction)

Released. It might take a few more minutes to become available. This is the first release without Bintray.

asharms26 commented, Jun 24, 2021 (0 reactions)

@RobWin Any idea when this fix will be released to Maven? We seem to be encountering the same issue. I used JMeter to send a batch of calls to our backend service, which logs a timestamp. The calls are routed through a Spring Cloud Gateway (reactive) application.

JMeter shows “Service Unavailable,” but the calls are still getting through to the backend. When we look at the logs, this is what I see many times in a row:

CircuitBreaker 'DEMO-SCG-GET-ERROR' changed state from HALF_OPEN to OPEN //occurs many times in a row in the logs
