Health / liveness check for Worker
Is your feature request related to a problem? Please describe.
Since shifting to the standalone NestJS architecture for the worker in v1.0, we’ve lost the ability to use the HealthCheckRegistryService to monitor the health of worker processes.
There is also no easy way to do a liveness check for things like Kubernetes deployments of the worker process.
Since the worker no longer has a networking layer, we cannot just add an HTTP endpoint, which is the usual way of handling it.
Describe the solution you’d like
Some way to support both the internal HealthCheckRegistryService checks (so it appears in the Admin UI) and also support for a generic liveness check.
Describe alternatives you’ve considered
Some ideas:
- Add an HTTP layer just for the liveness check. This might imply using a Nest Microservice rather than a Standalone app. What are the perf / other implications of this?
- Investigate a non-HTTP way of doing a liveness check. Honestly no idea what this might look like; I’m asking on the Nest Discord to see if anyone has any idea about this.
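One possible shape for the non-HTTP idea is a heartbeat: the worker records a timestamp on each pass of its job loop, and a separate probe (for example an exec-style liveness probe in Kubernetes) checks that the latest timestamp is recent. The sketch below is only an illustration of that idea. Every name in it (HeartbeatStore, InMemoryHeartbeatStore, beat, isAlive) is hypothetical and not part of Vendure or NestJS, and the in-memory store stands in for whatever an external probe could actually read, such as a file.

```typescript
// Hypothetical heartbeat-based liveness check. None of these names come
// from Vendure or NestJS; they exist only to illustrate the idea.
interface HeartbeatStore {
  write(timestampMs: number): void;
  read(): number | undefined;
}

// In-memory store for illustration only. A real worker would need a medium
// that a separate probe process can read (for example a file on disk).
class InMemoryHeartbeatStore implements HeartbeatStore {
  private last: number | undefined;

  write(timestampMs: number): void {
    this.last = timestampMs;
  }

  read(): number | undefined {
    return this.last;
  }
}

// Called by the worker, e.g. once per iteration of its job-polling loop.
function beat(store: HeartbeatStore, nowMs: number): void {
  store.write(nowMs);
}

// Called by the probe: alive only if a heartbeat exists and is recent enough.
function isAlive(store: HeartbeatStore, nowMs: number, maxAgeMs: number): boolean {
  const last = store.read();
  return last !== undefined && nowMs - last <= maxAgeMs;
}
```

With this shape, a worker that hangs stops calling beat(), and the probe starts failing once maxAgeMs has elapsed, which is the behaviour a liveness check is meant to capture.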
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (5 by maintainers)
Top GitHub Comments
@artcoded-dev This change will allow the worker to listen for HTTP requests like this:
If your use-case requires some other kind of http endpoint, please provide more details.
I ran a few performance tests comparing bootstrapping the worker as a standalone app (the way we currently do it) with bootstrapping it as a full-blown Nest server including an HTTP layer. I also created a new WorkerModule, which is the same as the AppModule but omits the ApiModule import, since we would not need that for the worker.
The code looks like this (irrelevant parts elided):
Here are the results (3 runs of each):
standalone with AppModule (status quo)
server with AppModule
standalone with WorkerModule
server with WorkerModule
Conclusion
My intuition that bootstrapping a full server is much heavier than a standalone app is not correct. The main overhead is the ApiModule: omitting it by using the WorkerModule drastically reduces memory usage and especially startup time.
So it looks perfectly viable to bootstrap the worker as a full NestJS server app, which will give us an HTTP layer we can use for the health check and at the same time reduce startup time by a factor of ~4-5x (2s down to 400ms), which is also very helpful for the serverless use-case!
API proposal
When bootstrapping the worker as a Nest server, we need to provide a port & hostname. I propose adding an optional 2nd argument to the bootstrapWorker() function which allows these to be provided. If provided, a server is created. If not provided, a standalone app is created. Both server and standalone app return an object implementing INestApplicationContext, so nothing needs to change regarding the return type of the function.
Health Checks
Still to be solved is how to enable a health check indicator for the worker in the Admin UI using the existing HealthCheckRegistryService. Problems:
Perhaps we can have an endpoint on the server that can be called to register a worker? Need to think more about this one.
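Returning to the API proposal above, the branching on the optional 2nd argument could look roughly like the sketch below. It is a simplified illustration, not the actual implementation: WorkerServerOptions, WorkerContextSketch and bootstrapWorkerSketch are hypothetical names, the function is synchronous for brevity (real Nest bootstrap calls return promises), and the returned object merely stands in for something implementing INestApplicationContext.

```typescript
// Hypothetical shape of the optional 2nd argument (not the real Vendure type).
interface WorkerServerOptions {
  port: number;
  hostname?: string;
}

// Stand-in for the returned context. In the proposal both branches return an
// object implementing INestApplicationContext; here we only record which
// branch was taken.
interface WorkerContextSketch {
  mode: 'standalone' | 'server';
  listeningOn?: WorkerServerOptions;
}

// Simplified and synchronous for illustration. The first parameter stands in
// for the worker's config and is unused in this sketch.
function bootstrapWorkerSketch(
  _userConfig: Record<string, unknown>,
  serverOptions?: WorkerServerOptions,
): WorkerContextSketch {
  if (serverOptions === undefined) {
    // No 2nd argument: keep the status quo and create a standalone app.
    return { mode: 'standalone' };
  }
  // 2nd argument provided: create a server listening on the given port/hostname.
  return { mode: 'server', listeningOn: serverOptions };
}
```

Because the 2nd argument is optional, existing callers that pass only the config would keep getting a standalone app, so the change would be opt-in.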