Logging to S3 stops working when MLFlow import added to file
Apache Airflow version: 1.10.12
Kubernetes version (if you are using kubernetes) (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:56:40Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.12", GitCommit:"a8b52209ee172232b6db7a6e0ce2adc77458829f", GitTreeState:"clean", BuildDate:"2019-10-15T12:04:30Z", GoVersion:"go1.11.13", Compiler:"gc", Platform:"linux/amd64"}
Environment:
- Cloud provider or hardware configuration:
- OS (e.g. from /etc/os-release):
- Kernel (e.g. uname -a):
- Install tools:
- Others:
What happened: I want to save logs to S3 storage, and I have added the proper configuration. Logging works fine unless I add an import of the MLFlow library in any of the files.
I don’t even have to use this tool, just
from mlflow.tracking import MlflowClient
is enough to break the logging to S3.
What you expected to happen: There is probably some mismatch in the S3 credentials, but I don't see any specific error message in the logs.
How to reproduce it:
- Set up logging to S3 with the proper configuration entries or env vars:
AIRFLOW__CORE__REMOTE_LOGGING: "True"
AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER: "s3://bucket/airflow/logs"
AIRFLOW__CORE__REMOTE_LOG_CONN_ID: "s3_connection_id"
AIRFLOW__CORE__ENCRYPT_S3_LOGS: "False"
- Logging to S3 should work fine in a sample DAG.
- Add from mlflow.tracking import MlflowClient to the DAG file; logging to S3 now stops working (a minimal sketch DAG follows below).
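For illustration, a minimal DAG along these lines is enough to reproduce it (the DAG and task names are hypothetical; the imports assume Airflow 2.x paths, whereas on 1.10.x PythonOperator is imported from airflow.operators.python_operator):

```python
# Hypothetical minimal DAG: the unused top-level MLflow import alone is
# what breaks the upload of task logs to S3.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# The offending import -- the client is never actually used.
from mlflow.tracking import MlflowClient  # noqa: F401


def say_hello():
    print("hello")  # with the import above, this never shows up in the S3 log


with DAG(
    dag_id="s3_logging_repro",
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    PythonOperator(task_id="hello", python_callable=say_hello)
```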
Anything else we need to know: This problem occurs every time MLFlow is imported in any file processed by Airflow.

I opened an issue with the mlflow project, as it is likely an issue with their logging config. I did, however, find a workaround, which is to import mlflow inside a function so that the import doesn't happen until the task's run time (sketch below).
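For reference, a minimal sketch of that workaround (the function name is hypothetical): the import moves inside the Python callable, so mlflow is only loaded when the task actually executes rather than when Airflow parses the DAG file.

```python
def train_and_log(**context):
    # Deferred import: mlflow (and whatever logging setup runs on import)
    # is only loaded at task run time, not during DAG file parsing.
    from mlflow.tracking import MlflowClient

    client = MlflowClient()
    # ... use the client here ...


# Wired up as usual, e.g.
# PythonOperator(task_id="train_and_log", python_callable=train_and_log)
# inside the DAG definition.
```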
Just ran into this issue as our team is starting to use MLFlow:
- airflow: 2.1.3 (kubernetes executor)
- mlflow: 1.20.0

All of our DAGs are able to send logs up to S3, but any DAGs that import MLFlow silently fail to upload the logs to S3. Tasks in the DAGs behave normally and can even sync other data to S3 just fine, but the logging code does not appear to be running.
It feels like the MLFlow code is overriding the task log handler that we use to write the logs to S3. The MLFlow init file does load a logging config (init file + logging config). Could that be related? I'll be filing an issue with MLFlow's project.
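For context, a standalone Python sketch of the kind of interaction being suspected here (illustrative only, not MLFlow's actual code): a library that calls logging.config.dictConfig at import time and leaves disable_existing_loggers at its default of True will silently disable any logger that was configured before it.

```python
import logging
import logging.config

# Stand-in for a logger that Airflow has already configured; pretend the
# handler below is the S3 task-log handler.
task_log = logging.getLogger("airflow.task")
task_log.addHandler(logging.StreamHandler())

# A library applying its own logging config at import time. With the
# dictConfig default disable_existing_loggers=True, every previously
# configured non-root logger not mentioned in the config gets disabled.
logging.config.dictConfig({
    "version": 1,
    "loggers": {"somelib": {"level": "INFO"}},
})

print(task_log.disabled)  # True -- records from this logger are now dropped
```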