default logging configuration doesn't log other packages than kedro

Description

Since the 0.18.2 release, my projects no longer show logs from packages other than kedro unless I list those packages explicitly in logging.yml. I tested with several packages: custom kedro plugins such as kedro-mlflow (@Galileo-Galilei) and other packages such as paramiko. To get logs for those packages, I need to update logging.yml:

loggers:
    kedro:
        level: INFO
    <my_package>:
        level: INFO
    kedro-mlflow:
        level: INFO
    paramiko:
        level: INFO

I think this is a bug, because it is impractical to ask users to list all their packages in logging.yml. I think the default behavior should be to show all logs, with logging.yml used to remove the logs we don’t want.

Context

I need all my logs by default to verify that my projects behave correctly. If I don’t want some of them, I remove them.

Steps to Reproduce

  1. Install kedro 0.18.2.
  2. Install a kedro plugin such as kedro-mlflow, or another package such as paramiko.
  3. Run your project. You will see no logs from kedro-mlflow or paramiko.
  4. Update your logging.yml as shown above and run your project again. The logs now appear.

Expected Result

By default, we should have all logs.

Actual Result

We just have kedro logs.

Your Environment

Include as many relevant details about the environment in which you experienced the bug:

  • Kedro version used (pip show kedro or kedro -V): 0.18.2
  • Python version used (python -V): 3.9.13
  • Operating system and version: Linux

Thank you for your help 😃

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 12 (11 by maintainers)

Top GitHub Comments

Debbby57 commented, Jul 28, 2022

Thank you for your reply. I need to think about this, but for now:

  • I still don’t like option 1, because as a user I don’t want to search for and list every package whose INFO logs I want. It could be very time-consuming.
  • In the short term, I’m going to use option 2.
  • I don’t like option 3, but I’m just a user of kedro plugins, so my opinion is really optional 😉
  • Option 4 seems out of reach for packages that are not kedro plugins.
  • I agree that option 5 is not a good option.
  • I understand why you don’t want option 6.

AntonyMilneQB commented, Jul 28, 2022

Hello @Debbby57 and @Galileo-Galilei, and thanks as ever for raising the issue here! I’m responsible for this change so let me try to explain a bit.

This was indeed a change made in 0.18.1:

The root logger is now set to the Python default level of WARNING rather than INFO. Kedro’s logger is still set to emit INFO level messages.

So this change was indeed intentional, and the reason it was designated a bug fix/minor change rather than a breaking change was that (a) the previous behaviour in Kedro was arguably incorrect and (b) I didn’t anticipate this change affecting anyone (indeed you’re the first people to even notice it). In hindsight, though, this should probably have been noted under “Minor breaking changes to the API” instead. So sorry about this and the awkwardness it has caused you!

First note that the default logging level in Python is set to WARNING. Prior to 0.18.1, the logging config went like this:

loggers:
    anyconfig:
        level: WARNING
        handlers: [console]
        propagate: no

root:
    level: INFO
    handlers: [console, info_file_handler, error_file_handler]

i.e. we changed the root logging level to INFO. This results in a lot of logs being emitted by anyconfig, hence the manual setting of that level to WARNING. There have been other places where people have needed to manually alter the logging config to reduce the verbosity of logs from other packages, e.g. py4j.java_gateway on Databricks.

Why did I not like this? Because it’s heavy-handed and disruptive since it alters the behaviour of the root logger from the expected Python default of WARNING. Hence any other libraries outside kedro that perform logging will be affected unexpectedly (which means putting in place special cases to handle anyconfig etc.).
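To make the effect of the old configuration concrete, here is a minimal sketch using only the standard library. It applies the levels from the config above through logging.config.dictConfig and checks which loggers would let INFO messages through. Handlers are left out for brevity, which is a simplification and not what Kedro shipped, and the logger names are used purely as names, so none of the packages needs to be installed.

```python
import logging
import logging.config

# Levels taken from the pre-0.18.1 config quoted above.
# Assumption: handlers are omitted to keep the sketch short.
OLD_STYLE_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "loggers": {"anyconfig": {"level": "WARNING"}},
    "root": {"level": "INFO"},
}

logging.config.dictConfig(OLD_STYLE_CONFIG)


def info_enabled(name):
    """Return True if the named logger would process an INFO record."""
    return logging.getLogger(name).isEnabledFor(logging.INFO)


# Every logger without its own level inherits INFO from the root logger,
# so only the explicit anyconfig special case stays quiet.
status = {name: info_enabled(name) for name in ("kedro", "paramiko", "anyconfig")}
print(status)
```

The point of the sketch is that raising the root level is global: any library that has not been special-cased starts emitting INFO records, whether or not that is wanted.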

So in 0.18.1 this was changed to:

loggers:
  kedro:
    level: INFO

root:
  handlers: [rich]

i.e. we leave the root logger level alone and just make kedro the special case. This feels much cleaner because it’s only altering the logging configuration for the package that we’re responsible for and not interfering with any others. Hence there’s no need for the anyconfig etc. special cases any more either.
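The behaviour reported in this issue can be reproduced with the standard library alone. The sketch below is an approximation of the new config, not Kedro’s actual setup: the rich handler is replaced by a plain StreamHandler writing to an in-memory buffer (an assumption made so the example has no dependencies), and the root level is set to WARNING explicitly, which is Python’s default anyway. The logger names kedro.io and paramiko are used purely as names.

```python
import io
import logging
import logging.config

# Levels mirror the 0.18.1-style config quoted above.
# Assumption: root level is stated explicitly (WARNING is Python's default).
NEW_STYLE_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "loggers": {"kedro": {"level": "INFO"}},
    "root": {"level": "WARNING"},
}
logging.config.dictConfig(NEW_STYLE_CONFIG)

# Stand-in for the rich handler: capture output in memory for inspection.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
logging.getLogger().addHandler(handler)

# A child of "kedro" inherits INFO; other packages fall back to root's WARNING.
logging.getLogger("kedro.io").info("loading data")
logging.getLogger("paramiko").info("connected")
logging.getLogger("paramiko").warning("retrying")

captured = buffer.getvalue().splitlines()
print(captured)
```

Only the kedro.io INFO record and the paramiko WARNING record reach the handler; the paramiko INFO record is dropped because that logger inherits the root level.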

Note this actually immediately causes a new problem for us, since any INFO-level logs emitted by a project would no longer be shown. I regarded this as unexpected/undesired behaviour and a breaking change, which is why the project logging.yml config also sets {{ cookiecutter.python_package }}: level: INFO to ensure those messages are still shown as they were before 0.18.1. As above, I considered the behaviour of any packages outside kedro and the current project to be “outside” our responsibility.

Unfortunately packages outside our responsibility also include plugins such as kedro-mlflow. I hadn’t really considered this specifically before and do see why it would be a problem, sorry! So what can be done about this?

  1. you add kedro-mlflow: level: INFO to your project logging.yml file as @Debbby57 suggested above. IMO this is the “correct” solution in a pure sense, but I do appreciate it’s a bit annoying
  2. you alter the level of the root logger in your project logging.yml file to INFO. This is a bit easier since you don’t need to change the config for every additional new package but might mean you need to silence verbose packages like anyconfig
  3. use the kedro namespace for your logging. Through logging propagation this would then output at level INFO. This would mean, for example, kedro-mlflow emitting logs using the namespace kedro.mlflow. This is maybe a little bit ugly/confusing (and wouldn’t work for paramiko etc.) but should mean that users don’t need to edit their logging config
  4. packages that want to emit logs at level INFO set that themselves in their own logging setup, e.g. by calling setLevel() on their logger (logging.getLogger(__name__).setLevel(logging.INFO)). Since we leave disable_existing_loggers: False then I think this should work but might take a bit of fiddling around (in my experience the propagation/handling of Python loggers does not work quite as you’d expect. Possibly we need to set root logger level explicitly to NOTSET or something). If we can get this working then there would be no need for a user to alter their logging configuration though
  5. we add special cases inside kedro to output messages at INFO level for kedro-mlflow, paramiko, etc. This doesn’t feel quite right to me though. It seems better than the special case we used to have for anyconfig (since it doesn’t alter the Python default root logging level) but still not ideal
  6. we revert the behaviour to make the root logger level INFO like it used to be. But, as explained above, this change was made for some good reasons, and so I would be reluctant to do this unless there’s a compelling argument to do so
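Options 3 and 4 both rely on how Python resolves a logger’s effective level, which the following sketch illustrates with the standard library only. The logger names kedro.mlflow and kedro_mlflow are hypothetical and chosen for illustration; they are not confirmed to be the names kedro-mlflow actually uses. Levels are set directly with setLevel() here instead of through a logging.yml file.

```python
import logging

# Baseline matching the default config: root at WARNING, kedro at INFO.
logging.getLogger().setLevel(logging.WARNING)
logging.getLogger("kedro").setLevel(logging.INFO)

# Option 3: a plugin logs under the kedro namespace (hypothetical name)
# and inherits INFO from its parent logger "kedro".
namespaced = logging.getLogger("kedro.mlflow")

# Option 4: a plugin sets the level on its own logger (hypothetical name).
own_level = logging.getLogger("kedro_mlflow")
own_level.setLevel(logging.INFO)

# Neither option: an unrelated package falls back to the root level.
untouched = logging.getLogger("paramiko")

levels = {
    logger.name: logging.getLevelName(logger.getEffectiveLevel())
    for logger in (namespaced, own_level, untouched)
}
print(levels)
```

In both cases the plugin’s INFO records pass the level check without the user editing logging.yml, while a package such as paramiko still needs option 1 or 2. Whether the records are then displayed also depends on the handlers attached to the root logger, which this sketch does not cover.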

Let me know what you think! My feeling is that the right solution here would be 4 for plugins like kedro-mlflow, and 1 or 2 for other packages like paramiko. I am still willing to be convinced that option 5 or 6 is the right solution here, but currently they don’t feel right to me. Maybe there’s another solution I didn’t think of, so if you have any ideas then please do say. I hope this helps to explain the change and the balance of factors that need to be considered here anyway.
