default logging configuration doesn't log other packages than kedro
Description
Since the 0.18.2 release, I have lost the logs for packages other than kedro in my projects unless I specify them in logging.yml. I tested with several packages, including custom kedro plugins such as kedro-mlflow (@Galileo-Galilei) and other packages such as paramiko. If I want logs for those packages, I need to update logging.yml:
loggers:
    kedro:
        level: INFO
    <my_package>:
        level: INFO
    kedro-mlflow:
        level: INFO
    paramiko:
        level: INFO
I think it’s a bug, because it’s difficult to ask users to add all of their packages to logging.yml. I think the default behavior we want is to show all logs by default and to use logging.yml to remove the logs we don’t want.
Context
I need all logs enabled by default so that I can check that my projects behave correctly. If I don’t want some of them, I remove them.
Steps to Reproduce
- Install kedro 0.18.2
- Install a kedro plugin such as kedro-mlflow, or another package such as paramiko
- Run your project. You will see that you don’t get the kedro-mlflow or paramiko logs
- Update your logging.yml as shown above and run your project again. You will now get the logs
Expected Result
By default, we should have all logs.
Actual Result
We just have kedro logs.
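The reported behaviour can be reproduced outside Kedro with a minimal sketch. The configuration below is a hypothetical stand-in, not Kedro’s actual default logging config: it assumes the root logger is at WARNING and only the kedro logger is raised to INFO. The logger names kedro.runner and paramiko.transport are used purely for illustration.

```python
import logging
import logging.config

# Hypothetical minimal config (an assumption, not Kedro's real file):
# only the "kedro" logger is raised to INFO; root stays at WARNING.
CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {"console": {"class": "logging.StreamHandler"}},
    "loggers": {"kedro": {"level": "INFO"}},
    "root": {"level": "WARNING", "handlers": ["console"]},
}

logging.config.dictConfig(CONFIG)

# Illustrative logger names; any child of "kedro" and any unrelated
# top-level package would behave the same way.
kedro_log = logging.getLogger("kedro.runner")
other_log = logging.getLogger("paramiko.transport")

# A child of "kedro" inherits INFO; the other package falls back to root.
kedro_info_enabled = kedro_log.isEnabledFor(logging.INFO)
other_info_enabled = other_log.isEnabledFor(logging.INFO)

kedro_log.info("emitted: effective level is INFO")
other_log.info("dropped: effective level is WARNING")
other_log.warning("emitted: WARNING passes the root level")
```

Under these assumptions, only messages from loggers inside the kedro namespace are shown at INFO level, which matches the symptom described in this report.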
Your Environment
Include as many relevant details about the environment in which you experienced the bug:
- Kedro version used (pip show kedro or kedro -V): 0.18.2
- Python version used (python -V): 3.9.13
- Operating system and version: linux
Thank you for your help 😃
Issue Analytics
- Created a year ago
- Comments: 12 (11 by maintainers)
Top GitHub Comments
thank you for your reply. I need to think about this but for now :
Hello @Debbby57 and @Galileo-Galilei, and thanks as ever for raising the issue here! I’m responsible for this change so let me try to explain a bit.
This was indeed a change made in 0.18.1:
So this change was indeed intentional, and the reason this was designated a bug fix/minor change rather than a breaking change was because (a) the previous behaviour on Kedro was arguably incorrect and (b) I didn’t anticipate this change affecting anyone (indeed you’re the first people to even notice it). In hindsight probably this should have been noted under “Minor breaking changes to the API” instead though. So sorry about this and the awkwardness it has caused you!
First note that the default logging level in Python is WARNING. Prior to 0.18.1, the logging config went like this:
i.e. we changed the root logging level to INFO. This results in a lot of logs being emitted by anyconfig, hence the manual setting of that level to WARNING. There have been other places where people have needed to manually alter the logging config to reduce the verbosity of logs from other packages, e.g. py4j.java_gateway on Databricks.
Why did I not like this? Because it’s heavy-handed and disruptive, since it alters the behaviour of the root logger from the expected Python default of WARNING. Hence any other libraries outside kedro that perform logging will be affected unexpectedly (which means putting in place special cases to handle anyconfig etc.).
So in 0.18.1 this was changed to:
i.e. we leave the root logger level alone and just make kedro the special case. This feels much cleaner because it’s only altering the logging configuration for the package that we’re responsible for and not interfering with any others. Hence there’s no need for the anyconfig etc. special cases any more either.
Note this actually immediately causes a new problem for us, since any INFO-level logs emitted by a project would no longer be shown. I regarded this as unexpected/undesired behaviour and a breaking change, which is why the project logging.yml config also sets {{ cookiecutter.python_package }}: level: INFO to ensure those messages are still shown as they were before 0.18.1. As above, I considered the behaviour of any packages outside kedro and the current project to be “outside” our responsibility.
Unfortunately packages outside our responsibility also include plugins such as kedro-mlflow. I hadn’t really considered this specifically before and do see why it would be a problem, sorry! So what can be done about this?
1. Add kedro-mlflow: level: INFO to your project logging.yml file as @Debbby57 suggested above. IMO this is the “correct” solution in a pure sense, but I do appreciate it’s a bit annoying.
2. Set the root logger in your project logging.yml file to INFO. This is a bit easier since you don’t need to change the config for every additional new package, but might mean you need to silence verbose packages like anyconfig.
3. Plugins use the kedro namespace for their logging. Through logging propagation this would then output at level INFO. This would mean, for example, kedro-mlflow emitting logs using the namespace kedro.mlflow. This is maybe a little bit ugly/confusing (and wouldn’t work for paramiko etc.) but should mean that users don’t need to edit their logging config.
4. Plugins that want to log at level INFO set that themselves, e.g. by calling setLevel() on their own logger. Since we leave disable_existing_loggers: False, I think this should work but might take a bit of fiddling round (in my experience the propagation/handling of Python loggers does not work quite as you’d expect; possibly we need to set the root logger level explicitly to NOTSET or something). If we can get this working then there would be no need for a user to alter their logging configuration though.
5. Kedro’s own logging config sets the INFO level for kedro-mlflow, paramiko, etc. This doesn’t feel quite right to me though. It seems better than the special case we used to have for anyconfig (since it doesn’t alter the Python default root logging level) but still not ideal.
6. Kedro sets the root logger level back to INFO like it used to be. But, as explained above, this change was made for some good reasons, and so I would be reluctant to do this unless there’s a compelling argument to do so.
Let me know what you think! My feeling is that the right solution here would be 4 for plugins like kedro-mlflow, and 1 or 2 for other packages like paramiko. I am still willing to be convinced that option 5 or 6 is the right solution here, but currently they don’t feel right to me. Maybe there’s another solution I didn’t think of, so if you have any ideas then please do say. I hope this helps to explain the change and the balance of factors that need to be considered here anyway.
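As a rough illustration of how options 2, 3 and 4 play out with Python’s standard logging module, here is a small sketch. It does not use Kedro’s real configuration: base_config and info_enabled are hypothetical helpers, and the logger names kedro_mlflow and my_plugin are assumptions chosen for the example.

```python
import logging
import logging.config


def base_config(root_level, loggers):
    """Build a small dictConfig dict (hypothetical, not Kedro's real config)."""
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "handlers": {"console": {"class": "logging.StreamHandler"}},
        "loggers": {name: {"level": level} for name, level in loggers.items()},
        "root": {"level": root_level, "handlers": ["console"]},
    }


def info_enabled(config, names):
    """Apply `config`, then report which of `names` would emit INFO records."""
    logging.config.dictConfig(config)
    return {
        name: logging.getLogger(name).isEnabledFor(logging.INFO)
        for name in names
    }


# Option 2: root at INFO, with a verbose package silenced explicitly.
option_2 = info_enabled(
    base_config("INFO", {"anyconfig": "WARNING"}),
    ["paramiko", "anyconfig"],
)

# Option 3: root stays at WARNING; a plugin logging under "kedro.mlflow"
# inherits INFO from "kedro", while a top-level "kedro_mlflow" logger
# (assumed name) falls back to the root level.
option_3 = info_enabled(
    base_config("WARNING", {"kedro": "INFO"}),
    ["kedro.mlflow", "kedro_mlflow"],
)

# Option 4: a plugin (hypothetical name "my_plugin") sets its own level in
# code before the project config is applied; the config never mentions it.
logging.getLogger("my_plugin").setLevel(logging.INFO)
option_4 = info_enabled(
    base_config("WARNING", {"kedro": "INFO"}),
    ["my_plugin"],
)

print(option_2)
print(option_3)
print(option_4)
```

In this sketch the level set by the plugin in option 4 survives dictConfig because the plugin’s logger is not named in the config and disable_existing_loggers is False. Whether the same holds inside a real Kedro project depends on when the plugin is imported and on what the project’s logging.yml contains, so it should be treated as a starting point for experimentation rather than a confirmed fix.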