
Structured Logging (for JSON)

See original GitHub issue

Hey all, this is my very first issue - so please bear with me 😃 I’ve checked open as well as closed issues around the keywords I’d expect to match, and blame’d the files in question to get an idea of which previous issues could be relevant.

Describe the feature

The logging currently in place seems mainly tailored to human beings reading the output, which is absolutely okay - but it makes it much harder to process the generated logs programmatically.

In the end I’d like to present proper error reasons around failed models with all the relevant information in one record.

Running the tests with dbt --log-format=json test, for example, generates the following in logs/dbt.log:

2020-11-24 09:09:04.745841 (MainThread): Warning in test not_null_finance__some_statuses_per_day_pan (models/marts/finance/finance__some_statuses_per_day.yml)
2020-11-24 09:09:04.750021 (MainThread):   Got 55169 results, expected 0.
2020-11-24 09:09:04.755953 (MainThread):
2020-11-24 09:09:04.758927 (MainThread):   compiled SQL at target/compiled/project/models/marts/finance/finance__some_statuses_per_day.yml/schema_test/not_null_finance__some_statuses_per_day_pan.sql

while on STDOUT we get (pretty printed for easier reading):

{
  "timestamp": "2020-11-24T09:09:04.745841Z",
  "message": "\u001b[33mWarning in test not_null_finance__some_statuses_per_day_pan (models/marts/finance/finance__some_statuses_per_day.yml)\u001b[0m",
  "channel": "dbt",
  "level": 13,
  "levelname": "WARNING",
  "thread_name": "MainThread",
  "process": 5382,
  "extra": {
    "run_started_at": "2020-11-23T14:42:09.667614+00:00",
    "invocation_id": "0ccc471d-b39e-4387-ae44-362094d1ad0a",
    "is_status_message": true,
    "run_state": "internal"
  }
}
{
  "timestamp": "2020-11-24T09:09:04.750021Z",
  "message": "  Got 55169 results, expected 0.",
  "channel": "dbt",
  "level": 14,
  "levelname": "ERROR",
  "thread_name": "MainThread",
  "process": 5382,
  "extra": {
    "run_started_at": "2020-11-23T14:42:09.667614+00:00",
    "invocation_id": "0ccc471d-b39e-4387-ae44-362094d1ad0a",
    "is_status_message": true,
    "run_state": "internal"
  }
}
{
  "timestamp": "2020-11-24T09:09:04.758927Z",
  "message": "  compiled SQL at target/compiled/project/models/marts/finance/finance__some_statuses_per_day.yml/schema_test/not_null_finance__some_statuses_per_day_pan.sql",
  "channel": "dbt",
  "level": 11,
  "levelname": "INFO",
  "thread_name": "MainThread",
  "process": 5382,
  "extra": {
    "run_started_at": "2020-11-23T14:42:09.667614+00:00",
    "invocation_id": "0ccc471d-b39e-4387-ae44-362094d1ad0a",
    "is_status_message": true,
    "run_state": "internal"
  }
}
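Note the ANSI color escapes (\u001b[33m … \u001b[0m) embedded in the first message field. A consumer would either disable colors (the --no-use-colors flag used further below) or strip them before matching on the message; a minimal sketch, where strip_ansi is a hypothetical helper and the sample message is made up, not taken from dbt:

```python
import re

# Matches color escape sequences such as "\x1b[33m" and "\x1b[0m".
ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")

def strip_ansi(text):
    """Remove ANSI color escapes from a log message."""
    return ANSI_RE.sub("", text)

msg = "\u001b[33mWarning in test not_null_my_test (models/example.yml)\u001b[0m"
print(strip_ansi(msg))  # prints: Warning in test not_null_my_test (models/example.yml)
```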

It originates from these lines:

https://github.com/fishtown-analytics/dbt/blob/daff0badc82362d805bce0a2cc8a59701ab1fefb/core/dbt/task/printer.py#L290-L302

and I cannot even correlate those lines, because they share no common property.

I was toying with the idea of refactoring the way this logging works, which would allow the log output to stay unchanged while the structured JSON output could carry far more information. For a first shot, probably just all available information in Logbook’s extra dict - nothing too fancy.

Describe alternatives you’ve considered

Nothing comes to mind initially, but I’m open to different ideas for getting the information I’m looking for.

Additional context

To my understanding this could/should be a very generic change that - depending on the implementation - would not change much for anyone using the default configuration.

Who will this benefit?

Everyone consuming logs in a format that is primarily interesting for programmatic access. Also everyone using any kind of logging service - it would allow people to access aggregated information in one record, rather than scattered over multiple lines.

Are you interested in contributing this feature?

I’ve been in touch with dbt only as a User/Admin until now, but have already started to dig into the code base and would like to help - given your assistance and feedback.

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 15 (7 by maintainers)

Top GitHub Comments

1 reaction
jtcohen6 commented on Dec 4, 2020

@steffkes you raise some fair points! After rereading above, I also see that I missed some of the specific details in the scenarios you were laying out—sorry about that.

To explain my rationale here, I’d like to draw a distinction between:

  1. Information about each resource (model/test/etc) while it’s running
  2. Summary information about the entire run after it’s completed

In the standard CLI output, this separation occurs between the Finished and Completed lines:

$ dbt test
Running with dbt=0.18.1
Found 1 model, 2 tests, 0 snapshots, 0 analyses, 138 macros, 0 operations, 1 seed file, 0 sources

19:10:50 | Concurrency: 1 threads (target='dev')
19:10:50 |
19:10:50 | 1 of 2 START test not_null_my_model_id............................... [RUN]
19:10:50 | 1 of 2 WARN 1 not_null_my_model_id................................... [WARN 1 in 0.04s]
19:10:50 | 2 of 2 START test unique_my_model_id................................. [RUN]
19:10:50 | 2 of 2 PASS unique_my_model_id....................................... [PASS in 0.03s]
19:10:50 |
19:10:50 | Finished running 2 tests in 0.25s.
------------------------ <<< this is the separation I mean
Completed with 1 warning:

Warning in test not_null_my_model_id (models/resources.yml)
  Got 1 result, expected 0.

  compiled SQL at target/compiled/testy/models/resources.yml/schema_test/not_null_my_model_id.sql

Done. PASS=1 WARN=1 ERROR=0 SKIP=0 TOTAL=2

1. Real-time statuses

Up to the word “Finished,” the logs populate in real time, and they can provide useful information about which models are running, whether tests are passing, and so on. Here’s one of the same log lines as above, now JSON-formatted:

dbt --debug --no-use-colors --log-format json test
{
    "timestamp": "2020-12-04T00:04:21.609075Z",
    "message": "19:04:21 | 1 of 2 WARN 1 not_null_my_model_id................................... [WARN 1 in 0.03s]",
    "channel": "dbt",
    "level": 13,
    "levelname": "WARNING",
    "thread_name": "Thread-1",
    "process": 97168,
    "extra": {
        "unique_id": "test.testy.not_null_my_model_id",
        "run_state": "running"
    }
}

And here’s the same line for the same test, now configured with error-level severity:

{
    "timestamp": "2020-12-04T00:00:04.723733Z",
    "message": "19:00:04 | 1 of 2 FAIL 1 not_null_my_model_id................................... [FAIL 1 in 0.04s]",
    "channel": "dbt",
    "level": 14,
    "levelname": "ERROR",
    "thread_name": "Thread-1",
    "process": 97097,
    "extra": {
        "unique_id": "test.testy.not_null_my_model_id",
        "run_state": "running"
    }
}

Those are the lines that we’d hope monitoring would catch by checking the level/levelname for WARNING and ERROR. The extra.unique_id is what you can use to identify the specific resource that’s running.
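As an illustration, a monitoring consumer could scan the JSON stream for those levels and collect the affected resources. This is a sketch against the 0.18-era record shape shown above; failing_tests is a hypothetical helper, not part of dbt:

```python
import json

def failing_tests(log_lines):
    """Collect extra.unique_id from every WARNING/ERROR record.

    Assumes one JSON object per line, shaped like the dbt 0.18 JSON
    log output shown above.
    """
    seen = []
    for line in log_lines:
        try:
            record = json.loads(line)
        except ValueError:
            continue  # skip any non-JSON line in the stream
        if record.get("levelname") in ("WARNING", "ERROR"):
            uid = record.get("extra", {}).get("unique_id")
            if uid and uid not in seen:
                seen.append(uid)
    return seen

lines = [
    '{"levelname": "WARNING", "extra": {"unique_id": "test.testy.not_null_my_model_id"}}',
    '{"levelname": "INFO", "extra": {}}',
]
print(failing_tests(lines))  # prints: ['test.testy.not_null_my_model_id']
```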

You can use that unique_id to look up more information about the test in the nodes object in manifest.json, which includes its config, parents, and compiled SQL. Based on your feedback above, it sounds like that’s the contextual info you want in the extra dict, and I am open to the feedback that we could surface more information there.
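That lookup might look like the following sketch. node_details is a hypothetical helper; the nodes layout follows the 0.18-era manifest.json, and field names may differ in other versions:

```python
import json

def node_details(manifest, unique_id):
    """Pull a few fields for one node out of a parsed manifest.json dict.

    Field names follow the dbt 0.18 artifact layout (assumption).
    """
    node = manifest["nodes"][unique_id]
    return {
        "name": node.get("name"),
        "original_file_path": node.get("original_file_path"),
        "compiled_sql": node.get("compiled_sql"),
    }

# In practice the manifest would come from the target directory, e.g.:
#   manifest = json.load(open("target/manifest.json"))
manifest = {
    "nodes": {
        "test.testy.not_null_my_model_id": {
            "name": "not_null_my_model_id",
            "original_file_path": "models/resources.yml",
            "compiled_sql": "select id from my_model where id is null",
        }
    }
}
print(node_details(manifest, "test.testy.not_null_my_model_id")["name"])
```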

2. Invocation-level summary

Aggregating, filtering, and summarizing what happened in a given run after the run is over is where the JSON artifacts really come into play. The combination of run_results.json and manifest.json will have a richer, better-organized, and more stable set of information than what’s available in the logs. Here’s a subset of the information about that one test from run_results.json:

{
    "results": [
        {
            "node": {
                "unique_id": "test.testy.not_null_my_model_id",
                ...
            },
            "error": null,
            "status": 1,
            "execution_time": 0.03606009483337402,
            "thread_id": "Thread-1",
            "timing": [
                {
                    "name": "compile",
                    "started_at": "2020-12-04T00:10:50.542076Z",
                    "completed_at": "2020-12-04T00:10:50.567851Z"
                },
                {
                    "name": "execute",
                    "started_at": "2020-12-04T00:10:50.568134Z",
                    "completed_at": "2020-12-04T00:10:50.576844Z"
                }
            ],
            "fail": null,
            "warn": true,
            "skip": false
        },

Granted, it’s still not as good as it could be; we are reorganizing that information to be more straightforward and intuitive in the next release of dbt (#2493).
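To illustrate the kind of aggregation this enables, a consumer could tally an entire invocation from the parsed file. summarize is a hypothetical helper written against the 0.18-era per-result fields shown above, which later releases restructure:

```python
def summarize(run_results):
    """Tally pass/warn/fail/skip/error from a parsed run_results.json.

    The per-result fields (error/fail/warn/skip) follow the dbt 0.18
    layout shown above (assumption).
    """
    counts = {"pass": 0, "warn": 0, "fail": 0, "skip": 0, "error": 0}
    for result in run_results["results"]:
        if result.get("error") is not None:
            counts["error"] += 1
        elif result.get("skip"):
            counts["skip"] += 1
        elif result.get("fail"):
            counts["fail"] += 1
        elif result.get("warn"):
            counts["warn"] += 1
        else:
            counts["pass"] += 1
    return counts

sample = {"results": [
    {"error": None, "fail": None, "warn": True, "skip": False},
    {"error": None, "fail": None, "warn": None, "skip": False},
]}
print(summarize(sample))  # prints: {'pass': 1, 'warn': 1, 'fail': 0, 'skip': 0, 'error': 0}
```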

That being said, you’ve got a really good point: the JSON representation of the summary lines starting with Completed is not all that helpful, since those lines are definitely optimized for human readability on stdout. These are the lines produced for me (running with dbt v0.18.1):

{
   "timestamp":"2020-12-04T00:00:04.737233Z",
   "message":"Failure in test not_null_my_model_id (models/resources.yml)",
   "channel":"dbt",
   "level":14,
   "levelname":"ERROR",
   "thread_name":"MainThread",
   "process":97097,
   "extra":{
      "run_started_at":"2020-12-04T00:00:03.887409+00:00",
      "invocation_id":"eb53b311-18a6-4a1d-bd51-4cc978f59512",
      "is_status_message":true,
      "run_state":"internal"
   }
}
{
   "timestamp":"2020-12-04T00:00:04.737620Z",
   "message":"  Got 1 result, expected 0.",
   "channel":"dbt",
   "level":14,
   "levelname":"ERROR",
   "thread_name":"MainThread",
   "process":97097,
   "extra":{
      "run_started_at":"2020-12-04T00:00:03.887409+00:00",
      "invocation_id":"eb53b311-18a6-4a1d-bd51-4cc978f59512",
      "is_status_message":true,
      "run_state":"internal"
   }
}
{
   "timestamp":"2020-12-04T00:00:04.738136Z",
   "message":"  compiled SQL at target/compiled/testy/models/resources.yml/schema_test/not_null_my_model_id.sql",
   "channel":"dbt",
   "level":11,
   "levelname":"INFO",
   "thread_name":"MainThread",
   "process":97097,
   "extra":{
      "run_started_at":"2020-12-04T00:00:03.887409+00:00",
      "invocation_id":"eb53b311-18a6-4a1d-bd51-4cc978f59512",
      "is_status_message":true,
      "run_state":"internal"
   }
}

I’d welcome some changes to better coordinate those “summary” log lines in their JSON output, something like:

{
   "timestamp":"2020-12-04T00:00:04.737233Z",
   "message":"Failure in test not_null_my_model_id (models/resources.yml)\n  Got 1 result, expected 0.\n  compiled SQL at target/compiled/testy/models/resources.yml/schema_test/not_null_my_model_id.sql",
   "channel":"dbt",
   "level":14,
   "levelname":"ERROR",
   "thread_name":"MainThread",
   "process":97097,
   "extra":{
      "run_started_at":"2020-12-04T00:00:03.887409+00:00",
      "invocation_id":"eb53b311-18a6-4a1d-bd51-4cc978f59512",
      "is_status_message":true,
      "run_state":"internal"
   }
}
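A consumer of that proposed single-record format could still recover the individual display lines by splitting on the embedded newlines. summary_lines is a hypothetical helper, assuming the combined-message shape sketched just above:

```python
import json

def summary_lines(raw_record):
    """Split a combined summary record back into its display lines.

    Assumes the human-readable lines are joined with "\\n" in the
    message field, as in the proposed format above.
    """
    record = json.loads(raw_record)
    return [line.strip() for line in record["message"].split("\n")]

raw = json.dumps({
    "message": "Failure in test not_null_my_model_id (models/resources.yml)"
               "\n  Got 1 result, expected 0."
})
print(summary_lines(raw))
```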

Let me know what you think, and if the distinction I’ve drawn above makes sense for your needs.

1 reaction
steffkes commented on Nov 26, 2020

Don’t want to get lost in the details … I’ve taken a stab at this https://github.com/fishtown-analytics/dbt/compare/dev/kiyoshi-kuromiya...steffkes:feature/2915-structured-logging :

--- a/core/dbt/task/printer.py
+++ b/core/dbt/task/printer.py
@@ -281,26 +281,37 @@ def print_run_result_error(
             color = ui.red
             info = 'Failure'
             logger_fn = logger.error
-        logger_fn(color("{} in {} {} ({})").format(
+
+        messages = [color("{} in {} {} ({})").format(
             info,
             result.node.resource_type,
             result.node.name,
-            result.node.original_file_path))
+            result.node.original_file_path
+        )]
+
+        extra = {
+            'info': info,
+            'name': result.node.name,
+            'file_path': result.node.original_file_path,
+            'resource_type': result.node.resource_type,
+        }
 
         try:
             int(result.status)
         except ValueError:
-            logger.error("  Status: {}".format(result.status))
+            logger_fn = logger.error
+            messages.append("  Status: {}".format(result.status))
         else:
             status = utils.pluralize(result.status, 'result')
-            logger.error("  Got {}, expected 0.".format(status))
+            messages.append("  Got {}, expected 0.".format(status))
 
         if result.node.build_path is not None:
-            with TextOnly():
-                logger.info("")
-            logger.info("  compiled SQL at {}".format(
+            extra["build_path"] = result.node.build_path
+            messages.append("  compiled SQL at {}".format(
                 result.node.build_path))
 
+        logger_fn("\n".join(messages), extra=extra)
+
     else:
         first = True
         for line in result.error.split("\n"):

That doesn’t change logs/dbt.log much:

2020-11-26 12:50:56.665579 (MainThread): Warning in test not_null_finance__some_statuses_per_day_pan (models/marts/finance/finance__some_statuses_per_day.yml)
  Got 55169 results, expected 0.
  compiled SQL at target/compiled/project/models/marts/finance/finance__some_statuses_per_day.yml/schema_test/not_null_finance__some_statuses_per_day_pan.sql

but would generate one log entry like this:

{
  "timestamp": "2020-11-26T12:50:56.665579Z",
  "message": "\u001b[33mWarning in test not_null_finance__some_statuses_per_day_pan (models/marts/finance/finance__some_statuses_per_day.yml)\u001b[0m\n  Got 55169 results, expected 0.\n  compiled SQL at target/compiled/project/models/marts/finance/finance__some_statuses_per_day.yml/schema_test/not_null_finance__some_statuses_per_day_pan.sql",
  "channel": "dbt",
  "level": 13,
  "levelname": "WARNING",
  "thread_name": "MainThread",
  "process": 22851,
  "extra": {
    "info": "Warning",
    "name": "not_null_finance__some_statuses_per_day_pan",
    "file_path": "models/marts/finance/finance__some_statuses_per_day.yml",
    "resource_type": "test",
    "build_path": "target/compiled/project/models/marts/finance/finance__some_statuses_per_day.yml/schema_test/not_null_finance__card_statuses_per_day_pan.sql",
    "run_started_at": "2020-11-26T12:49:34.410019+00:00",
    "invocation_id": "e83ba4af-a60b-4229-aca2-433230436a2b",
    "is_status_message": true,
    "run_state": "internal"
  }
}

which would allow you way more in terms of aggregating and filtering stuff, don’t you think @jtcohen6 ?
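For example, grouping warnings by the file that defines them becomes a simple dictionary build over the extra fields. failures_by_file is a hypothetical helper written against the extra dict produced by the patch above - a proposed format, not a released dbt API:

```python
import json

def failures_by_file(records):
    """Group warning/failure records by extra.file_path.

    Relies on the extra.file_path / extra.name fields added by the
    proposed patch above (assumption).
    """
    grouped = {}
    for raw in records:
        record = json.loads(raw)
        extra = record.get("extra", {})
        if "file_path" in extra:
            grouped.setdefault(extra["file_path"], []).append(extra.get("name"))
    return grouped

records = [json.dumps({
    "levelname": "WARNING",
    "extra": {
        "info": "Warning",
        "name": "not_null_finance__some_statuses_per_day_pan",
        "file_path": "models/marts/finance/finance__some_statuses_per_day.yml",
    },
})]
print(failures_by_file(records))
```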

It’s probably not the way it should/would be implemented in the end - more like a quick way to demonstrate the idea I had in mind, without changing too much of the existing code, to make it easier to follow.
