[BUG] No basic auth in MlflowArtifactsRepository
Willingness to contribute
The MLflow Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the MLflow code base?
- Yes. I can contribute a fix for this bug independently.
- Yes. I would be willing to contribute a fix for this bug with guidance from the MLflow community.
- No. I cannot contribute a bug fix at this time.
System information
- Have I written custom code (as opposed to using a stock example script provided in MLflow): no
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux
- MLflow installed from (source or binary): binary
- MLflow version (run mlflow --version): 1.23.0
- Python version: 3.9
- npm version, if running the dev UI:
- Exact command to reproduce:
Describe the problem
We are trying to use the MLflow server as a proxy to push artifacts to S3 using the --serve-artifacts flag. Our MLflow server is behind a reverse proxy that requires basic auth. We use the default artifacts URI mlflow-artifacts:/, without specifying a host, which means clients assume the host is the tracking server URI. To support basic auth for the tracking server, clients can set MLFLOW_TRACKING_USERNAME and MLFLOW_TRACKING_PASSWORD; however, these variables are not included in calls to the artifact proxy (HttpArtifactRepository constructs a Session without any authentication). There are also no dedicated variables for the artifacts proxy. Providing the username and password directly in the URL is not possible either, since these are stripped from the tracking URI when constructing the artifacts proxy URI.
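To illustrate the last point, here is a minimal sketch of how rebuilding a URI from its scheme, host, and port drops any embedded credentials. This is not MLflow's actual code; the helper name strip_credentials is hypothetical and only demonstrates the effect described above using the standard library.

```python
from urllib.parse import urlparse, urlunparse


def strip_credentials(uri: str) -> str:
    """Rebuild a URI from scheme, host, port, and path, dropping any user:password part.

    Hypothetical helper for illustration only; not part of MLflow.
    """
    parsed = urlparse(uri)
    # parsed.hostname excludes the "user:password@" prefix that parsed.netloc contains.
    netloc = parsed.hostname or ""
    if parsed.port is not None:
        netloc = f"{netloc}:{parsed.port}"
    return urlunparse((parsed.scheme, netloc, parsed.path, "", "", ""))


tracking_uri = "https://username:password@my.mlflow.server:443/"
artifacts_base = strip_credentials(tracking_uri)
print(artifacts_base)
```

Once the URI has been rebuilt this way, the credentials are no longer available to whatever issues the artifact requests, so they would have to be supplied through some other channel.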
When the tracking server is used as the artifacts proxy we would expect calls to the artifacts proxy to include the same authentication headers as specified for the tracking server.
A pragmatic fix would be something like https://github.com/mlflow/mlflow/compare/master...TimNooren:mlflow_artifact_repo_basic_auth, but maybe the implementation could rely more on what is already provided in mlflow.utils.rest_utils (similar to mlflow.store.tracking.rest_store.RestStore). Some guidance here would be great :)
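The general idea behind such a fix can be sketched with the standard library alone: read the tracking credentials from the environment and attach them as an HTTP Basic Authorization header to requests sent to the artifact proxy. This is only a sketch under assumptions, not the code from the linked branch or from MLflow; the function names basic_auth_header and build_request are hypothetical.

```python
import base64
import os
import urllib.request


def basic_auth_header(environ=os.environ) -> dict:
    """Return an HTTP Basic Authorization header built from the tracking credentials.

    Hypothetical helper: returns an empty dict if either variable is unset.
    """
    username = environ.get("MLFLOW_TRACKING_USERNAME")
    password = environ.get("MLFLOW_TRACKING_PASSWORD")
    if not username or not password:
        return {}
    # Basic auth is "Basic " + base64("username:password").
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def build_request(url: str, environ=os.environ) -> urllib.request.Request:
    """Build (but do not send) a request that carries the auth header, if any."""
    return urllib.request.Request(url, headers=basic_auth_header(environ))
```

With a requests.Session, the equivalent would be to set session.auth = (username, password) before issuing calls. Whether the real fix should live in HttpArtifactRepository or reuse the helpers in mlflow.utils.rest_utils is the open question raised above.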
Code to reproduce issue
import os
from mlflow.store.artifact.mlflow_artifacts_repo import MlflowArtifactsRepository
from mlflow.store.tracking import DEFAULT_ARTIFACTS_URI
os.environ["MLFLOW_TRACKING_URI"] = "https://my.mlflow.server:443/" # Using basic auth
os.environ["MLFLOW_TRACKING_USERNAME"] = "username"
os.environ["MLFLOW_TRACKING_PASSWORD"] = "password"
MlflowArtifactsRepository(DEFAULT_ARTIFACTS_URI).list_artifacts()  # Unauthorized
Other info / logs
What component(s), interfaces, languages, and integrations does this bug affect?
Components
- area/artifacts: Artifact stores and artifact logging
- area/build: Build and test infrastructure for MLflow
- area/docs: MLflow documentation pages
- area/examples: Example code
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/projects: MLproject format, project running backends
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/server-infra: MLflow Tracking server backend
- area/tracking: Tracking Service, tracking client APIs, autologging
Interface
- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/docker: Docker use across MLflow’s components, such as MLflow Projects and MLflow Models
- area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- area/windows: Windows support
Language
- language/r: R APIs and clients
- language/java: Java APIs and clients
- language/new: Proposals for new client languages
Integrations
- integrations/azure: Azure and Azure ML integrations
- integrations/sagemaker: SageMaker integrations
- integrations/databricks: Databricks integrations
@cgebe you’re right, this PR does not address your issue :) But I believe this was fixed in https://github.com/mlflow/mlflow/pull/5385.
Thank you, @TimNooren !