Migrate environment variables used in MLflow (e.g. `MLFLOW_S3_ENDPOINT_URL`) to the `mlflow.environment_variables` module

In https://github.com/mlflow/mlflow/pull/5745, we added the `mlflow.environment_variables` module to define the environment variables used in MLflow in one place and to document them. We should migrate the environment variables in the table below to this module, as we did in https://github.com/mlflow/mlflow/pull/6375.
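As a rough illustration of the kind of change each migration involves, the sketch below contrasts a scattered `os.environ` lookup with a centralized definition. The `EnvironmentVariable` class and its `get` method are hypothetical stand-ins written for this example, not the actual contents of `mlflow.environment_variables`; see the linked PRs for the real implementation.

```python
import os


class EnvironmentVariable:
    """Hypothetical helper: one object per environment variable."""

    def __init__(self, name, type_, default):
        self.name = name
        self.type = type_
        self.default = default

    def get(self):
        # Return the typed value if the variable is set, else the default.
        raw = os.environ.get(self.name)
        return self.default if raw is None else self.type(raw)


# Before: each module defines and reads the variable name on its own.
_TRACKING_URI_ENV_VAR = "MLFLOW_TRACKING_URI"


def get_tracking_uri_before():
    return os.environ.get(_TRACKING_URI_ENV_VAR)


# After: the definition lives in one central place and callers import it.
MLFLOW_TRACKING_URI = EnvironmentVariable("MLFLOW_TRACKING_URI", str, None)


def get_tracking_uri_after():
    return MLFLOW_TRACKING_URI.get()
```

Both functions read the same variable; the difference is that the "after" version keeps the name, type, and default in a single definition that can also be documented in one place.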

| Name | Location | Assignee | PR |
| --- | --- | --- | --- |
| `MLFLOW_TRACKING_DIR` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/store/tracking/file_store.py#L79 | | |
| `MLFLOW_ENABLE_DBFS_FUSE_ARTIFACT_REPO` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/store/artifact/dbfs_artifact_repo.py#L29 | | |
| `MLFLOW_DISABLE_ENV_CREATION` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/models/docker_utils.py#L79 | | |
| `MLFLOW_DEPLOYMENT_FLAVOR_NAME` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/models/container/__init__.py#L32 | | |
| `MLFLOW_PIPELINES_PROFILE` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/pipelines/utils/__init__.py#L13 | | |
| `MLFLOW_PIPELINES_EXECUTION_DIRECTORY` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/pipelines/utils/execution.py#L12 | | |
| `MLFLOW_SAGEMAKER_DEPLOY_IMG_URL` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/sagemaker/__init__.py#L36 | | |
| `MLFLOW_EXPERIMENT_NAME` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/tracking/fluent.py#L57 | | |
| `MLFLOW_EXPERIMENT_ID` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/tracking/fluent.py#L56 | | |
| `MLFLOW_RUN_ID` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/tracking/fluent.py#L58 | | |
| `MLFLOW_RUN_CONTEXT` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/tracking/context/system_environment_context.py#L7 | | |
| `MLFLOW_TRACKING_USERNAME` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/tracking/_tracking_service/utils.py#L21 | | |
| `MLFLOW_TRACKING_URI` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/tracking/_tracking_service/utils.py#L17 | | |
| `MLFLOW_TRACKING_SERVER_CERT_PATH` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/tracking/_tracking_service/utils.py#L28 | | |
| `MLFLOW_TRACKING_INSECURE_TLS` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/tracking/_tracking_service/utils.py#L27 | | |
| `MLFLOW_TRACKING_TOKEN` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/tracking/_tracking_service/utils.py#L23 | | |
| `MLFLOW_TRACKING_CLIENT_CERT_PATH` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/tracking/_tracking_service/utils.py#L32 | | |
| `MLFLOW_TRACKING_PASSWORD` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/tracking/_tracking_service/utils.py#L22 | | |
| `MLFLOW_CONDA_HOME` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/utils/conda.py#L13 | | |
| `MLFLOW_CONDA_CREATE_ENV_CMD` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/utils/conda.py#L21 | | |
| `MLFLOW_ENV_ROOT` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/utils/virtualenv.py#L23 | | |
| `MLFLOW_AUTOLOGGING_TESTING` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/utils/autologging_utils/safety.py#L25 | | |
| `MLFLOW_AUTOLOGGING_TESTING` | https://github.com/mlflow/mlflow/blob/1b89add428da0a4453e7523a20d13f04f6291f37/mlflow/utils/autologging_utils/__init__.py#L45 | | |

Code to generate this table
import ast
from pathlib import Path
from typing import Iterable, Set, Dict, Tuple
import re
from collections import defaultdict
import pandas as pd


def iter_python_scripts(root: str) -> Iterable[Path]:
    for p in Path(root).rglob("*"):
        if p.name.endswith(".py"):
            yield p


def read_file(path: Path) -> str:
    return path.read_text()


class Visitor(ast.NodeVisitor):
    def __init__(self) -> None:
        super().__init__()
        # Each entry is a (variable name, line number) pair.
        self.nodes: Set[Tuple[str, int]] = set()

    def visit_Assign(self, node: ast.Assign):
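        value = node.value
        # ast.Str is deprecated since Python 3.8; string literals are ast.Constant nodes.
        if (
            isinstance(value, ast.Constant)
            and isinstance(value.value, str)
            and re.match(r"^MLFLOW_[A-Z0-9_]+$", value.value)
        ):
            self.nodes.add((value.value, node.lineno))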
        self.generic_visit(node)


def main() -> None:
    # Maps each file path to the (name, line number) pairs found in it.
    envs: Dict[str, Set[Tuple[str, int]]] = dict()
    for d in ["mlflow"]:
        for path in iter_python_scripts(d):
            if str(path) == "mlflow/environment_variables.py":
                continue
            visitor = Visitor()
            src = read_file(path)
            root = ast.parse(src)
            visitor.visit(root)
            if visitor.nodes:
                envs[str(path)] = visitor.nodes

    data = []
    for path, vals in envs.items():
        for (name, lineno) in vals:
            data.append(
                (
                    f"`{name}`",
                    "https://github.com/mlflow/mlflow/blob/{}/{}#L{}".format(
                        "1b89add428da0a4453e7523a20d13f04f6291f37", path, lineno
                    ),
                    "",
                    "",
                )
            )

    print(
        pd.DataFrame(data, columns=["Name", "Location", "Assignee", "PR"]).to_markdown(index=False)
    )


if __name__ == "__main__":
    main()
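
To sanity-check the matching rule before scanning the whole repository, the same logic can be exercised on an in-memory snippet. The following is a self-contained sketch; `find_env_var_assignments` and the sample source are made up for illustration and are not part of the script above.

```python
import ast
import re
from typing import List, Tuple

ENV_VAR_PATTERN = re.compile(r"^MLFLOW_[A-Z0-9_]+$")


def find_env_var_assignments(source: str) -> List[Tuple[str, int]]:
    """Return (value, line number) for assignments whose value is an MLFLOW_* string literal."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Assign)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
            and ENV_VAR_PATTERN.match(node.value.value)
        ):
            found.append((node.value.value, node.lineno))
    # Sort by line number so the output order is stable.
    return sorted(found, key=lambda item: item[1])


# The source is only parsed, never executed, so the missing `import os` is fine.
sample = '''
_TRACKING_URI_ENV_VAR = "MLFLOW_TRACKING_URI"
name = "not_an_env_var"
value = os.environ.get("MLFLOW_RUN_ID")
'''

print(find_env_var_assignments(sample))
```

Only the first assignment in the sample is reported. A name passed inline to `os.environ.get(...)` is not an assignment of a string literal, so a scan based on this rule would miss such usages; they may need to be found separately.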
Old table

| Location | Assignee | PR |
| --- | --- | --- |
| https://github.com/mlflow/mlflow/blob/c98313482168137e39a6e6b0ed7169de5bfe0ba5/mlflow/data.py#L37-L38 | WON’T DO. It looks like `_fetch_s3` is not used at all | - |
| https://github.com/mlflow/mlflow/blob/f7a42be1e29ec1ff1f6086cdb79d75248f877715/mlflow/store/artifact/s3_artifact_repo.py#L79 | @ahlag | https://github.com/mlflow/mlflow/pull/6438 |
| https://github.com/mlflow/mlflow/blob/2a05d7d0cc65d4d72bc42b47321ae5b5011944dc/mlflow/projects/backend/local.py#L362-L366 | @ahlag | https://github.com/mlflow/mlflow/pull/6438 |
| https://github.com/mlflow/mlflow/blob/4422fafb94a0003cfa6170ca3a9692bea518234e/mlflow/store/db/utils.py#L18-L20 | @harupy | #6396 |

(I’ll add more environment variables to the table.)

Example PR:

https://github.com/mlflow/mlflow/pull/6375

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 7 (6 by maintainers)

Top GitHub Comments

mlflow-automation commented, Aug 25, 2022 (1 reaction)

@harupy Please reply to comments.

ahlag commented, Aug 4, 2022 (1 reaction)

@harupy Can I try?
