Kafka Health Check readiness is always down until it's consumed the first time

Describe the bug

Given an application with the following extensions:

  • pom.xml:

        <dependency>
            <groupId>io.quarkus</groupId>
            <artifactId>quarkus-smallrye-reactive-messaging-kafka</artifactId>
        </dependency>

        <dependency>
            <groupId>io.quarkus</groupId>
            <artifactId>quarkus-kafka-streams</artifactId>
        </dependency>

That creates a custom topology:

@Produces
public Topology buildTopology() {
    StreamsBuilder builder = new StreamsBuilder();

    JsonbSerde<LoginAttempt> loginAttemptSerde = new JsonbSerde<>(LoginAttempt.class);
    JsonbSerde<LoginAggregation> loginAggregationSerde = new JsonbSerde<>(LoginAggregation.class);

    // Read login attempts from the "from" topic and aggregate them per key
    // in tumbling windows of windowsLoginSec seconds.
    builder.stream("from", Consumed.with(Serdes.String(), loginAttemptSerde))
            .groupByKey()
            .windowedBy(TimeWindows.of(Duration.ofSeconds(windowsLoginSec)))
            .aggregate(LoginAggregation::new,
                    (id, value, aggregation) -> aggregation.updateFrom(value),
                    Materialized.<String, LoginAggregation, WindowStore<Bytes, byte[]>> as(LOGIN_AGGREGATION_STORE)
                            .withKeySerde(Serdes.String())
                            .withValueSerde(loginAggregationSerde))
            .toStream()
            // Keep only aggregations for unauthorized or forbidden responses...
            .filter((k, v) -> (v.getCode() == UNAUTHORIZED.getStatusCode() || v.getCode() == FORBIDDEN.getStatusCode()))
            // ...whose count exceeds the threshold, and publish them to "target".
            .filter((k, v) -> v.getCount() > threshold)
            .to("target");

    return builder.build();
}

Although the application works correctly, the readiness health check reports that “target” is down:

{
    "status": "DOWN",
    "checks": [
        {
            "name": "SmallRye Reactive Messaging - readiness check",
            "status": "DOWN",
            "data": {
                "login-http-response-values": "[OK]",
                "login-denied": "[KO]"
            }
        },
        {
            "name": "Kafka Streams topics health check",
            "status": "UP",
            "data": {
                "available_topics": "login-http-response-values"
            }
        }
    ]
}

This was working fine prior to 1.12.0.Final.

Expected behavior

Health check should be UP.

Actual behavior

Health check is DOWN in 1.12.0.Final and 999-SNAPSHOT.

To Reproduce

Steps to reproduce the behavior:

  1. git clone https://github.com/Sgitario/quarkus-examples
  2. cd reproducers/kafka-streams-reactive-messaging
  3. mvn clean install -Dquarkus.version=1.12.0.Final

The build fails because the test checks whether the health check is UP.
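For readers who want to see what "checks whether the health check is UP" can amount to, here is a minimal sketch that inspects a readiness payload of the shape shown in this report. It is not the reproducer's actual test: the class name `HealthPayloadCheck` and both method names are hypothetical, and the regex-based parsing is an assumption that only suits flat payloads like these (a real test would more likely use a JSON parser or an HTTP testing library).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Quarkus or of the reproducer.
public class HealthPayloadCheck {

    // Matches every "status": "UP" or "status": "DOWN" pair in the payload.
    private static final Pattern STATUS =
            Pattern.compile("\"status\"\\s*:\\s*\"(UP|DOWN)\"");

    // Matches data entries whose value is "[KO]" and captures the entry name.
    private static final Pattern KO_ENTRY =
            Pattern.compile("\"([^\"]+)\"\\s*:\\s*\"\\[KO\\]\"");

    // Returns the first status found. In the payloads shown in this report,
    // the first "status" field is the overall status.
    public static String overallStatus(String json) {
        Matcher m = STATUS.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("no status field found");
        }
        return m.group(1);
    }

    // Returns the names of all data entries reported as "[KO]".
    public static List<String> failingEntries(String json) {
        List<String> names = new ArrayList<>();
        Matcher m = KO_ENTRY.matcher(json);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // Abridged copy of the DOWN payload from this report.
        String down = "{ \"status\": \"DOWN\", \"checks\": [ {"
                + " \"name\": \"SmallRye Reactive Messaging - readiness check\","
                + " \"status\": \"DOWN\","
                + " \"data\": { \"login-http-response-values\": \"[OK]\","
                + " \"login-denied\": \"[KO]\" } } ] }";

        System.out.println("overall: " + overallStatus(down));
        System.out.println("failing: " + failingEntries(down));
    }
}
```

Listing the `[KO]` entries alongside the overall status makes it easier to see which entry is dragging readiness down, which is the detail that matters in this report.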

If we build the reproducer with mvn clean install -Dquarkus.version=1.11.5.Final, it works:

{
    "status": "UP",
    "checks": [
        {
            "name": "SmallRye Reactive Messaging - readiness check",
            "status": "UP",
            "data": {
                "login-http-response-values": "[OK]",
                "target": "[OK]"
            }
        },
        {
            "name": "Kafka Streams topics health check",
            "status": "UP",
            "data": {
                "available_topics": "login-http-response-values"
            }
        }
    ]
}

  • Quarkus version or git rev: 1.12.0.Final and 999-SNAPSHOT

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 9 (8 by maintainers)

Top GitHub Comments

cescoffier commented, Mar 9, 2021

@Sgitario I added a note to the migration guide.

Sgitario commented, Mar 9, 2021

Yes, until we find a better way to implement the check.

Thanks for the update on docs. As this is a breaking change for users using Kafka on OpenShift/K8S, how should we state this change in the migration guide?
