CircuitBreaker permits more calls than expected when switching from OPEN to HALF_OPEN state
Resilience4j version: 1.7.0
Java version: Java 16
CircuitBreaker permits more calls than expected when switching from the OPEN to the HALF_OPEN state.
It is most likely caused by CircuitBreakerStateMachine.OpenState#tryAcquirePermission, which returns true while the OPEN -> HALF_OPEN state switch is being performed on another thread. See this code:
public boolean tryAcquirePermission() {
    // Thread-safe
    if (clock.instant().isAfter(retryAfterWaitDuration)) {
        toHalfOpenState();
        return true;
    }
    circuitBreakerMetrics.onCallNotPermitted();
    return false;
}

private void toHalfOpenState() {
    if (isOpen.compareAndSet(true, false)) {
        transitionToHalfOpenState();
    }
}
For a concurrent caller, clock.instant().isAfter(retryAfterWaitDuration) is true and toHalfOpenState() returns without doing anything because isOpen is already false, so tryAcquirePermission() still returns true.
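The effect can be reproduced in isolation with a minimal sketch. It is a hypothetical, simplified model and not Resilience4j code: only the isOpen flag mirrors the snippet above, the class name and counters are invented for illustration, and the calls are replayed sequentially to stand in for callers that all arrive before the state object has been replaced.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the quoted permission check; not Resilience4j code.
final class OpenStateRaceSketch {
    private final AtomicBoolean isOpen = new AtomicBoolean(true);
    int transitions = 0; // how often the OPEN -> HALF_OPEN switch was started
    int permitted = 0;   // how many callers were told they may proceed

    // Models tryAcquirePermission() after the wait duration has elapsed.
    boolean tryAcquirePermission() {
        if (isOpen.compareAndSet(true, false)) {
            transitions++; // only the caller that wins the CAS starts the transition
        }
        permitted++;       // every caller still receives "true"
        return true;
    }

    // Replays `callers` calls that all hit the same OpenState instance.
    static OpenStateRaceSketch replay(int callers) {
        OpenStateRaceSketch state = new OpenStateRaceSketch();
        for (int i = 0; i < callers; i++) {
            state.tryAcquirePermission();
        }
        return state;
    }

    public static void main(String[] args) {
        OpenStateRaceSketch state = replay(10);
        // prints: transitions=1 permitted=10
        System.out.println("transitions=" + state.transitions + " permitted=" + state.permitted);
    }
}
```

The transition is started once, yet all ten callers are permitted, which matches the pattern in the report: the permission decision is not tied to winning the compare-and-set.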
Code like this:
public static void main(String[] args) throws InterruptedException {
    // Create a custom configuration for a CircuitBreaker
    CircuitBreakerConfig circuitBreakerConfig = CircuitBreakerConfig.custom()
        .failureRateThreshold(50)
        .slowCallRateThreshold(50)
        .waitDurationInOpenState(Duration.ofMillis(10000))
        .slowCallDurationThreshold(Duration.ofSeconds(1))
        .permittedNumberOfCallsInHalfOpenState(1)
        .minimumNumberOfCalls(2)
        .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
        .slidingWindowSize(2)
        // .maxWaitDurationInHalfOpenState()
        .build();
    // Create a CircuitBreakerRegistry with a custom global configuration
    CircuitBreakerRegistry circuitBreakerRegistry =
        CircuitBreakerRegistry.of(circuitBreakerConfig);
    // Get or create a CircuitBreaker from the CircuitBreakerRegistry
    // with the global default configuration
    CircuitBreaker circuitBreakerWithDefaultConfig =
        circuitBreakerRegistry.circuitBreaker("name1");
    Runnable runnable = circuitBreakerWithDefaultConfig.decorateRunnable(() -> {
        log.info("called " + circuitBreakerWithDefaultConfig.getState());
        try {
            Thread.sleep(2000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        //log.info("finished");
    });
    // spawn 10 threads
    for (int i = 0; i < 10; i++) {
        new Thread(() -> {
            while (true) {
                try {
                    runnable.run();
                } catch (Exception e) {
                }
            }
        }).start();
    }
    while (true) {
        //log.info("state {}", circuitBreakerWithDefaultConfig.getState());
        Thread.sleep(500);
    }
}
will print, for example:
2021-05-19 20:45:32,189 [Thread-0] INFO com.example.demo.ScheduledService - called CLOSED
2021-05-19 20:45:32,189 [Thread-5] INFO com.example.demo.ScheduledService - called CLOSED
2021-05-19 20:45:32,189 [Thread-6] INFO com.example.demo.ScheduledService - called CLOSED
2021-05-19 20:45:42,199 [Thread-5] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-2] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-1] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-9] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-7] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-8] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-0] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-3] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-6] INFO com.example.demo.ScheduledService - called OPEN
2021-05-19 20:45:42,199 [Thread-4] INFO com.example.demo.ScheduledService - called HALF_OPEN
2021-05-19 20:45:44,245 [Thread-0] INFO com.example.demo.ScheduledService - called OPEN
Note that it was called many times in the OPEN state. I would expect at most one call, and in the HALF_OPEN state (thanks to .permittedNumberOfCallsInHalfOpenState(1)).
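One way to get the expected behavior is to make every caller that arrives during the switch take a permit from a shared half-open budget, instead of returning true unconditionally. The following is a minimal sketch of that idea only. It is not the library's implementation or its actual fix, and the class, field, and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: permission is tied to a half-open permit counter.
final class HalfOpenPermitSketch {
    private final AtomicBoolean isOpen = new AtomicBoolean(true);
    private final AtomicInteger halfOpenPermits;

    HalfOpenPermitSketch(int permittedNumberOfCallsInHalfOpenState) {
        this.halfOpenPermits = new AtomicInteger(permittedNumberOfCallsInHalfOpenState);
    }

    // Models a permission check once the wait duration has elapsed.
    boolean tryAcquirePermission() {
        isOpen.compareAndSet(true, false); // at most one caller flips the flag
        while (true) {                     // every caller must win a permit
            int remaining = halfOpenPermits.get();
            if (remaining <= 0) {
                return false;              // budget exhausted: call not permitted
            }
            if (halfOpenPermits.compareAndSet(remaining, remaining - 1)) {
                return true;
            }
        }
    }

    // Runs `callers` threads against one instance and counts granted permissions.
    static int countPermitted(int callers, int permits) {
        HalfOpenPermitSketch state = new HalfOpenPermitSketch(permits);
        AtomicInteger permitted = new AtomicInteger();
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(() -> {
                if (state.tryAcquirePermission()) {
                    permitted.incrementAndGet();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return permitted.get();
    }

    public static void main(String[] args) {
        System.out.println("permitted=" + countPermitted(10, 1)); // prints: permitted=1
    }
}
```

With ten concurrent callers and a budget of one, exactly one call is permitted regardless of thread scheduling, because the decrement itself is the permission decision.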
Released. Might take some more minutes. First release without Bintray.
@RobWin Any idea when this fix will be released to Maven? We seem to be encountering the same issue. I used JMeter to send a bunch of calls to our backend service, which logs a timestamp. The calls are routed through a Spring Cloud Gateway (reactive) application.
JMeter is showing “Service Unavailable,” but the calls are still getting through to the backend. When we look at the logs, this is what I see many times in a row:
CircuitBreaker 'DEMO-SCG-GET-ERROR' changed state from HALF_OPEN to OPEN
//occurs many times in a row in the logs