[Proposal] Extend logging framework to support structured log output and custom log format
Is your feature request related to a problem? Please describe.
Observability is a very important capability for a distributed system. While maintaining Doris, we have repeatedly run into problems such as:
- Troubleshooting through logs is difficult. Take a load transaction for example: we need to log in to multiple machines and search by multiple signatures (Label/TransactionID/QueryID), which costs a lot of time. Some logs don’t even have any signature, so we have to guess which transaction they belong to.
- Logs are not structured, which makes analyzing them difficult and in many cases impossible.
- Our company has a centralized log collection system, which has its own custom log format. Even where some logs have weakly formatted output, it does not match our custom format. We have to write custom logging for the most important processes, which are query and stream load.
Describe the solution you’d like
We are considering a generic extension to the logging framework that supports structured logging in Doris and lets different Doris maintainers configure their own structured log output format.
// unstructured logging, output 'here is an info for a query, queryId=xxx'
LOG.info("here is an info for a query, queryId={}", queryId);
// structured logging, output custom log format, like 'here is an info for a query {"queryId":"xxx"}'
LOG.tag("queryId", queryId).info("here is an info for a query");
This allows maintainers to collect logs and transform them however they want. In our case, we will collect logs into our log center and convert them into relational records, so we can process or analyze logs in a table, or perhaps several tables. This can happen without custom logging statements. Contributors can focus on adding useful information to logs and cleaning up useless ones. Doris may need to set up a specification for tag names, such as CamelCase or underscore_style, or provide common tag methods and let maintainers customize their own tag names. This is open for discussion.
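To make the tag-based call above more concrete, here is a minimal sketch of how such an extension might be wired. Every name in it (TagFormatter, JsonTagFormatter, TaggedLogger, TaggedEntry) is hypothetical and is not an existing Doris or Log4j class; the sink is a plain string consumer standing in for the real logging backend, and the JSON rendering is just one possible output format.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.function.Consumer;

// All types below are hypothetical illustrations, not existing Doris classes.

/** Turns a message plus its tags into one output line; maintainers plug in their own. */
interface TagFormatter {
    String format(String message, Map<String, Object> tags);
}

/** Renders tags as a flat JSON object appended to the message. */
final class JsonTagFormatter implements TagFormatter {
    @Override
    public String format(String message, Map<String, Object> tags) {
        if (tags.isEmpty()) {
            return message;
        }
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> e : tags.entrySet()) {
            json.add(quote(e.getKey()) + ":" + quote(String.valueOf(e.getValue())));
        }
        return message + " " + json;
    }

    // Minimal escaping: only backslash and double quote are handled here.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}

/** Wraps an output sink; untagged calls pass through, tag(...) starts a tagged entry. */
final class TaggedLogger {
    private final TagFormatter formatter;
    private final Consumer<String> sink;

    TaggedLogger(TagFormatter formatter, Consumer<String> sink) {
        this.formatter = formatter;
        this.sink = sink;
    }

    /** Existing-style call: the message is emitted unchanged. */
    void info(String message) {
        sink.accept(message);
    }

    /** Starts a tagged entry for a single log statement. */
    TaggedEntry tag(String key, Object value) {
        return new TaggedEntry(this).tag(key, value);
    }

    void emit(String message, Map<String, Object> tags) {
        sink.accept(formatter.format(message, tags));
    }
}

/** Collects tags in insertion order, then hands them to the logger's formatter. */
final class TaggedEntry {
    private final TaggedLogger logger;
    private final Map<String, Object> tags = new LinkedHashMap<>();

    TaggedEntry(TaggedLogger logger) {
        this.logger = logger;
    }

    TaggedEntry tag(String key, Object value) {
        tags.put(key, value);
        return this;
    }

    void info(String message) {
        logger.emit(message, tags);
    }
}

final class TaggedLoggingDemo {
    public static void main(String[] args) {
        TaggedLogger log = new TaggedLogger(new JsonTagFormatter(), System.out::println);
        // prints: here is an info for a query {"queryId":"xxx"}
        log.tag("queryId", "xxx").info("here is an info for a query");
    }
}
```

Keeping the formatter behind an interface is what would let each deployment match its own log collection format: a maintainer could supply a different TagFormatter implementation without touching the logging statements themselves.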
Describe alternatives you’ve considered
We have considered importing an observability framework such as OpenTelemetry. The current situation is that OpenTelemetry is still exploring many capabilities. For example, it doesn’t support Thrift in the official distribution, the C++ implementation is in pre-alpha, and the logging integration is immature.
We can extend the logging capabilities to support flexible monitoring and analysis for maintainers of Doris clusters. At the same time, we can introduce OpenTelemetry to collect trace and metric data for telemetry, which does not conflict with the log extension. Perhaps when OpenTelemetry is capable enough for logging, we can clean up useless logs then.
No need to refactor all logs at once. Just like the demo code above, it’s an extension of the Logger interface, which is fully compatible with existing methods. Tagged logs can replace unstructured logs piece by piece while improving the logging content.
There are already similar designs in Doris. Logbuilder is used in some code, but not widely. https://github.com/apache/incubator-doris/blob/647170c4391e922aa33a81c391a6508376788d0f/fe/fe-core/src/main/java/org/apache/doris/load/loadv2/BulkLoadJob.java#L212-L218 As you can see, adding tags to a log with Logbuilder is kind of ugly, and the format of the output log is fixed and cannot be customized.