[Proposal] Extend logging framework to support structured log output and custom log format


Is your feature request related to a problem? Please describe.

Observability is an essential capability for a distributed system. While maintaining Doris, we have repeatedly run into the following problems:

  • Troubleshooting through logs is difficult. Take a load transaction as an example: we have to log in to multiple machines and search by multiple signatures (Label/TransactionID/QueryID), which costs a lot of time. Some logs don’t carry any signature at all, so we can only guess which transaction they belong to.
  • Logs are not structured, which makes analyzing them difficult and in many cases impossible.
  • Our company has a centralized log collection system with its own custom log format. Even where logs have loosely formatted output, it does not match our custom format. We have had to write custom logging for the most important processes, which are query and stream load.

Describe the solution you’d like

We are considering a generic extension to the logging framework that supports structured logging in Doris and lets different Doris maintainers configure their own structured log output format.

// unstructured logging, output 'here is an info for a query, queryId=xxx'
LOG.info("here is an info for a query, queryId={}", queryId);
// structured logging, output custom log format, like 'here is an info for a query {"queryId":"xxx"}'
LOG.tag("queryId", queryId).info("here is an info for a query");

This allows maintainers to collect and transform logs however they want. In our case, we will collect logs into our log center and convert them into relational records, so we can process or analyze logs in one table, or possibly several. This can happen without custom logging statements. Contributors can focus on adding useful information to logs and cleaning up useless ones. Doris may need to set a convention for tag names, such as CamelCase or underline_style, or provide common tag methods and let maintainers customize their own tag names. This is open for discussion.
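As a rough illustration of the idea, here is a minimal, self-contained sketch of a tagged log entry with a pluggable output format. It is not Doris code: `TagFormat`, `JsonTagFormat`, `KeyValueTagFormat`, `TaggedLog` and `TaggedLogDemo` are hypothetical names invented for this example, and a real implementation would presumably delegate to the existing logging backend instead of printing to standard output.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical extension point: turns a message and its tags into one output line.
interface TagFormat {
    String format(String message, Map<String, Object> tags);
}

// Hypothetical format that mimics the JSON-like output shown in the proposal.
final class JsonTagFormat implements TagFormat {
    @Override
    public String format(String message, Map<String, Object> tags) {
        if (tags.isEmpty()) {
            return message;
        }
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> e : tags.entrySet()) {
            json.add("\"" + escape(e.getKey()) + "\":\""
                    + escape(String.valueOf(e.getValue())) + "\"");
        }
        return message + " " + json;
    }

    // Escapes only backslashes and double quotes; control characters are
    // left untouched to keep the sketch short.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}

// Hypothetical alternative format: message followed by key=value pairs.
final class KeyValueTagFormat implements TagFormat {
    @Override
    public String format(String message, Map<String, Object> tags) {
        StringJoiner line = new StringJoiner(" ");
        line.add(message);
        for (Map.Entry<String, Object> e : tags.entrySet()) {
            line.add(e.getKey() + "=" + e.getValue());
        }
        return line.toString();
    }
}

// Hypothetical fluent entry: collects tags in insertion order, then renders them.
final class TaggedLog {
    private final TagFormat format;
    private final Map<String, Object> tags = new LinkedHashMap<>();

    TaggedLog(TagFormat format) {
        this.format = format;
    }

    TaggedLog tag(String key, Object value) {
        tags.put(key, value);
        return this;
    }

    // Returns the formatted line without emitting it.
    String render(String message) {
        return format.format(message, tags);
    }

    // Assumption: a real logger would hand the line to its backend here.
    void info(String message) {
        System.out.println("INFO " + render(message));
    }
}

final class TaggedLogDemo {
    public static void main(String[] args) {
        new TaggedLog(new JsonTagFormat())
                .tag("queryId", "xxx")
                .info("here is an info for a query");
        new TaggedLog(new KeyValueTagFormat())
                .tag("queryId", "xxx")
                .info("here is an info for a query");
    }
}
```

In this sketch the call site stays the same whichever `TagFormat` is configured, which is the property the proposal is after: contributors only attach tags, and each maintainer chooses how they are rendered.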

Describe alternatives you’ve considered

We have considered importing an observability framework such as OpenTelemetry. However, OpenTelemetry is still developing many of its capabilities. For example, it doesn’t support thrift in the official distribution, the cpp implementation is in pre-alpha, and the logging integration is immature. We can extend the logging capabilities to support flexible monitoring and analysis for maintainers of Doris clusters. At the same time, we can introduce OpenTelemetry to collect trace and metric data for telemetry, which does not conflict with the log extension. Perhaps once OpenTelemetry is capable enough for logging, we can clean up useless logs then.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 4
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

7 reactions
ccoffline commented, Aug 31, 2021

It is useful, but a very huge code refactor.

There is no need to refactor all logs at once. As in the demo code above, it’s an extension of the Logger interface that is fully compatible with existing methods. Tagged logs can replace unstructured logs piece by piece while improving the logging content.

3 reactions
ccoffline commented, Aug 31, 2021

There are already similar designs in Doris. Logbuilder is used in some code, but not widely: https://github.com/apache/incubator-doris/blob/647170c4391e922aa33a81c391a6508376788d0f/fe/fe-core/src/main/java/org/apache/doris/load/loadv2/BulkLoadJob.java#L212-L218 As you can see, adding tags to a log with Logbuilder is kind of ugly, and the output format is fixed and cannot be customized.


