Make composite healthcheck for ServiceManager

Per #5850, we should have a /health endpoint for our service state. Composite services like ServiceManager should report on their individual components and report an aggregate healthy status if and only if all components are healthy. This helps ensure traffic doesn’t pass through when things haven’t finished booting up, or if they crash after the fact.

This was discussed in https://apache-pinot.slack.com/archives/CDRCA57FC/p1598498630061400

Make a composite health endpoint similar to this:

  @GET
  @Produces(MediaType.APPLICATION_JSON)
  @Path("/instances")
  @ApiOperation(value = "Get Pinot Instances Status")
  @ApiResponses(value = {@ApiResponse(code = 200, message = "Instance Status"), @ApiResponse(code = 500, message = "Internal server error")})
  public Map<String, PinotInstanceStatus> getPinotAllInstancesStatus() {
    Map<String, PinotInstanceStatus> results = new HashMap<>();
    for (String instanceId : _pinotServiceManager.getRunningInstanceIds()) {
      results.put(instanceId, _pinotServiceManager.getInstanceStatus(instanceId));
    }
    return results;
  }

Doing so would be similar to discovering and calling each of the per-component health checks (see https://github.com/apache/incubator-pinot/pull/5846).

The main thing is to represent the composite health of the process, so you can tell whether it should be in service or not.
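As a rough illustration of the "healthy if and only if all components are healthy" rule, here is a minimal sketch in plain Java. It is not Pinot code: `CompositeHealthSketch`, `checkAll`, and `isHealthy` are hypothetical names, and each component check is modeled as a simple `BooleanSupplier`. It also assumes that a check that throws counts as unhealthy, and that a composite with no components is not yet healthy.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: none of these names come from Pinot.
class CompositeHealthSketch {

  // Runs every component check and records the result per component.
  // Assumption: a check that throws is treated as unhealthy.
  static Map<String, Boolean> checkAll(Map<String, BooleanSupplier> checks) {
    Map<String, Boolean> results = new LinkedHashMap<>();
    for (Map.Entry<String, BooleanSupplier> entry : checks.entrySet()) {
      boolean ok;
      try {
        ok = entry.getValue().getAsBoolean();
      } catch (RuntimeException e) {
        ok = false;
      }
      results.put(entry.getKey(), ok);
    }
    return results;
  }

  // Aggregate is healthy if and only if every component is healthy.
  // Assumption: no registered components means "not finished booting".
  static boolean isHealthy(Map<String, Boolean> results) {
    return !results.isEmpty() && !results.containsValue(false);
  }

  public static void main(String[] args) {
    Map<String, BooleanSupplier> checks = new LinkedHashMap<>();
    checks.put("controller", () -> true);
    checks.put("broker", () -> true);
    // Simulates a component whose bootstrap failed.
    checks.put("server", () -> {
      throw new IllegalStateException("bootstrap failed");
    });

    Map<String, Boolean> results = checkAll(checks);
    System.out.println(results + " healthy=" + isHealthy(results));
  }
}
```

Returning the per-component map alongside the aggregate keeps the individual component states visible, which is what the issue asks a composite service to report.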

For example, I noticed one part of the process fail in Docker, possibly because zip extraction took too long (I’m not sure). It still passed the health check! That’s bad, because it causes other things to fail.

cc @daniellavoie @fx19880617

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 9 (9 by maintainers)

Top GitHub Comments

1 reaction
codefromthecrypt commented, Aug 31, 2020

No, I haven’t seen a partial failure scenario using this. I will switch to it. Meanwhile, it might be worth backfilling a test that proves PinotServiceManagerHealthCheck gives 503 on partial failure, or, if one already exists, linking it and closing this out.
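The shape of such a test could look roughly like the sketch below. It does not exercise the real PinotServiceManagerHealthCheck, whose API is not shown in this thread. `statusFor` is a hypothetical stand-in for the status mapping, and the component names are made up. It assumes 200 when every component is up and 503 otherwise, including when no components are registered.

```java
import java.util.Map;

// Hypothetical sketch of a partial-failure test; not the real Pinot class.
class PartialFailureTestSketch {

  // Stand-in for the health check's mapping from component state to HTTP status.
  static int statusFor(Map<String, Boolean> componentHealth) {
    boolean allUp = !componentHealth.isEmpty() && !componentHealth.containsValue(false);
    return allUp ? 200 : 503;
  }

  // Tiny assertion helper so the sketch needs no test framework.
  static void expect(int expected, int actual) {
    if (expected != actual) {
      throw new AssertionError("expected " + expected + " but got " + actual);
    }
  }

  public static void main(String[] args) {
    // All components healthy: the process should be in service.
    expect(200, statusFor(Map.of("controller", true, "broker", true)));
    // Partial failure: one component down must take the process out of service.
    expect(503, statusFor(Map.of("controller", true, "broker", false)));
    // Nothing registered yet: treated as still booting.
    expect(503, statusFor(Map.of()));
    System.out.println("all expectations met");
  }
}
```

A real test would replace `statusFor` with a call to the actual endpoint after forcing one component to fail during bootstrap.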

0 reactions
codefromthecrypt commented, Sep 1, 2020

Ultimately it was a jar error during bootstrap, which no longer occurs (after extracting all of them). Sadly, I don’t have a copy of the message. The surprise was that the listener still worked, heh. I think this error will be less likely after the recent commit, which checks the exception and boolean status strictly.

