
[BUG] mlflow gc command raises an exception when artifacts are served as local files


Willingness to contribute

Yes. I can contribute a fix for this bug independently.

System information

  • Have I written custom code (as opposed to using a stock example script provided in MLflow): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 20.04
  • MLflow installed from (source or binary): binary
  • MLflow version (run mlflow --version): 1.25.1
  • Python version: 3.9
  • npm version, if running the dev UI: None

Describe the problem

The mlflow gc command raises an exception when artifacts are served from the local filesystem.

My proposed fix is to add the following to mlflow/mlflow/store/artifact/mlflow_artifacts_repo.py at line 61:

        # If the tracking URI uses the "file" scheme, resolve artifacts under
        # "./mlartifacts", the sibling of the "mlruns" directory.
        if track_parse.scheme == "file":
            return os.path.join(os.path.dirname(track_parse.path), "mlartifacts", uri_parse.path[1:])
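To see what that branch computes, here is a self-contained sketch of the resolution logic (the helper name and the example paths are hypothetical; only the join logic mirrors the snippet above, and the real MLflow internals may differ):

```python
import os
from urllib.parse import urlparse

def resolve_local_artifact_path(artifact_uri, tracking_uri):
    # Parse the proxied artifact URI (e.g. "mlflow-artifacts:/1/abc/artifacts")
    # and the tracking URI (e.g. "file:///home/user/mlflow-root/mlruns").
    uri_parse = urlparse(artifact_uri)
    track_parse = urlparse(tracking_uri)
    if track_parse.scheme == "file":
        # Proxied artifacts live under "mlartifacts", a sibling of "mlruns".
        return os.path.join(
            os.path.dirname(track_parse.path), "mlartifacts", uri_parse.path[1:]
        )
    raise ValueError(f"Unsupported tracking scheme: {track_parse.scheme!r}")
```

For example, with the tracking URI "file:///home/user/mlflow-root/mlruns", the artifact URI "mlflow-artifacts:/1/abc/artifacts" would resolve to "/home/user/mlflow-root/mlartifacts/1/abc/artifacts".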

Tracking information

No response

Code to reproduce issue

cd ./mlflow-root
mlflow server -h 0.0.0.0 -p 18888 --serve-artifacts

Then, after deleting runs under mlflow-root, run:

mlflow gc

Other info / logs

Traceback (most recent call last):
  File "/opt/anaconda3/envs/mlflow/bin/mlflow", line 8, in <module>
    sys.exit(cli())
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/mlflow/cli.py", line 489, in gc
    artifact_repo = get_artifact_repository(run.info.artifact_uri)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/mlflow/store/artifact/artifact_repository_registry.py", line 107, in get_artifact_repository
    return _artifact_repository_registry.get_artifact_repository(artifact_uri)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/mlflow/store/artifact/artifact_repository_registry.py", line 73, in get_artifact_repository
    return repository(artifact_uri)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 46, in __init__
    super().__init__(self.resolve_uri(artifact_uri, get_tracking_uri()))
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 61, in resolve_uri
    _validate_uri_scheme(track_parse.scheme)
  File "/opt/anaconda3/envs/mlflow/lib/python3.9/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 35, in _validate_uri_scheme
    raise MlflowException(
mlflow.exceptions.MlflowException: The configured tracking uri scheme: 'file' is invalid for use with the proxy mlflow-artifact scheme. The allowed tracking schemes are: {'https', 'http'}
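The check that raises this exception can be approximated with a stdlib-only sketch (the real _validate_uri_scheme lives in mlflow_artifacts_repo.py; the function below is a simplified stand-in, not MLflow's actual code):

```python
from urllib.parse import urlparse

# Schemes the proxied mlflow-artifacts repository accepts for the tracking URI.
ALLOWED_TRACKING_SCHEMES = {"http", "https"}

def validate_tracking_scheme(tracking_uri):
    scheme = urlparse(tracking_uri).scheme
    if scheme not in ALLOWED_TRACKING_SCHEMES:
        raise ValueError(
            f"The configured tracking uri scheme: {scheme!r} is invalid for use "
            f"with the proxy mlflow-artifact scheme."
        )

# A local `mlflow gc` run defaults to a file-based tracking URI, so
# validate_tracking_scheme("file:///home/user/mlflow-root/mlruns") raises,
# which matches the traceback above.
```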

What component(s) does this bug affect?

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow’s components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

What language(s) does this bug affect?

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations


Top GitHub Comments

3 reactions · vanIvan commented, Jul 12, 2022

Had the same error. I was able to fix it by calling mlflow.set_tracking_uri(tracking_uri) before any client-side call to the mlflow API (the error is caused by MLflow calling get_tracking_uri() and getting the default tracking URI, which has the file scheme):

import os

# Read the server address from the environment and set it explicitly
# before constructing the logger.
tracking_uri = os.environ["MLFLOW_TRACKING_URL"]
mlflow.set_tracking_uri(tracking_uri)
logger = MLFlowLogger(
    run_name=run_name,
    experiment_name=experiment_name,
    tracking_uri=tracking_uri,
)
0 reactions · DraXus commented, Aug 24, 2022

Thanks @okoben, setting MLFLOW_TRACKING_URI also fixed my issue.
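Both comments point at the same fix: make the client use the server's HTTP tracking URI instead of the default file-based one. A stdlib-only sketch of how that environment-based resolution behaves (a simplified stand-in for MLflow's own logic; MLflow honors the MLFLOW_TRACKING_URI environment variable):

```python
import os

def get_tracking_uri(default="file:./mlruns"):
    # If MLFLOW_TRACKING_URI is unset, the client falls back to a file-based
    # store, whose "file" scheme triggers the validation error in the
    # traceback above. Exporting the server address avoids that.
    return os.environ.get("MLFLOW_TRACKING_URI", default)
```

So exporting MLFLOW_TRACKING_URI=http://host:18888 (or calling mlflow.set_tracking_uri) before running mlflow gc is the workaround both commenters describe.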

