[BUG] Cannot use databricks connection with R

Issues Policy acknowledgement

I have read and agree to submit bug reports in accordance with the issues policy

Willingness to contribute

Yes. I would be willing to contribute a fix for this bug with guidance from the MLflow community.

MLflow version

mlflow, version 1.27.0

System information

Windows
Python version: Python 3.10.0
R version: R version 4.1.2 (2021-11-01)
mlflow R package version: 2.0.1

Describe the problem

I am attempting to track and log parameters within Databricks using mlflow. I can see the runs that I am creating quite clearly.

However, I cannot actually log anything to these runs. This is due to the way the package sets the active run ID, or rather doesn't set it, when a client is provided. As a result, any attempt to log something fails, because the logging path checks for the active run ID. Ultimately, mlflow:::mlflow_get_active_run_id() returns NULL because the active run ID is never set in mlflow::mlflow_start_run() when the client argument is provided.
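The mechanism described above can be modelled in a few lines of plain R. This is only a toy sketch of the reported behavior, not mlflow's source: every toy_* name is hypothetical, and the error message is simply copied from the report.

```r
# Toy model of the reported behavior. All toy_* names are hypothetical;
# this is NOT the mlflow implementation.
toy_state <- new.env()
toy_state$active_run_id <- NULL
toy_state$counter <- 0L

toy_start_run <- function(client = NULL) {
  toy_state$counter <- toy_state$counter + 1L
  run <- list(run_id = paste0("run-", toy_state$counter))
  if (!is.null(client)) {
    # Client supplied: the run is returned, but the active run ID
    # is never recorded.
    return(run)
  }
  toy_state$active_run_id <- run$run_id
  run
}

toy_with <- function(run, expr) {
  # The block only runs if the run matches the recorded active run.
  if (!identical(run$run_id, toy_state$active_run_id)) {
    stop("`with()` should only be used with `mlflow_start_run()`.", call. = FALSE)
  }
  expr
}

# With a client, no active run ID is recorded, so the check fails.
failed <- tryCatch(
  toy_with(toy_start_run(client = "some-client"), "logged"),
  error = function(e) conditionMessage(e)
)

# Without a client, the active run ID is recorded and the block runs.
ok <- toy_with(toy_start_run(), "logged")
```

In this model the client branch returns early, which is the only difference between the failing and the working call.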

Tracking information

No response

Code to reproduce issue

library(mlflow)
client <- mlflow::mlflow_client(tracking_uri = "databricks")
experiment <- "1709256526326232"
run <- mlflow::mlflow_start_run(experiment_id = "1709256526326232", client = client)

mlflow:::mlflow_get_active_run_id()
# NULL

# Try to log a parameter
with(run, {
  mlflow::mlflow_log_param(
    key = "test",
    value = 1,
    client = client
  )
})
# Error: `with()` should only be used with `mlflow_start_run()`.

# Try to use `mlflow::mlflow_start_run()`:
with(
  mlflow::mlflow_start_run(
    experiment_id = "1709256526326232",
    client = mlflow::mlflow_client(tracking_uri = "databricks")
  ), {
  mlflow::mlflow_log_param(
    key = "test",
    value = 1,
    client = client
  )
})
# Error: `with()` should only be used with `mlflow_start_run()`.
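
# A possible alternative when a client object is needed: skip with() and pass the
# run ID explicitly to each call. This is a sketch, assuming mlflow_log_param() and
# mlflow_end_run() accept run_id together with client, and that mlflow_id() returns
# the ID of the run. log_param_with_client is a hypothetical helper name, and the
# code has not been verified against a Databricks workspace.

```r
# Hypothetical helper: logs one parameter through an explicit client.
log_param_with_client <- function(experiment_id, tracking_uri = "databricks") {
  client <- mlflow::mlflow_client(tracking_uri = tracking_uri)
  run <- mlflow::mlflow_start_run(experiment_id = experiment_id, client = client)
  run_id <- mlflow::mlflow_id(run)

  # Close the run even if logging fails.
  on.exit(mlflow::mlflow_end_run(run_id = run_id, client = client), add = TRUE)

  # No active run is needed here because run_id is given explicitly.
  mlflow::mlflow_log_param(
    key = "test",
    value = 1,
    run_id = run_id,
    client = client
  )
  invisible(run_id)
}

# Usage (requires a configured Databricks connection):
# log_param_with_client("1709256526326232")
```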

Stack trace

3: stop("`with()` should only be used with `mlflow_start_run()`.",
       call. = FALSE)
2: with.mlflow_run(mlflow::mlflow_start_run(experiment_id = "1709256526326232", 
       client = mlflow::mlflow_client(tracking_uri = "databricks")),
       {
           mlflow::mlflow_log_param(key = "test", value = 1, client = client)   
       })
1: with(mlflow::mlflow_start_run(experiment_id = "1709256526326232",
       client = mlflow::mlflow_client(tracking_uri = "databricks")),
       {
           mlflow::mlflow_log_param(key = "test", value = 1, client = client)
       })
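
The trace shows the error being raised by with.mlflow_run() itself. One way to avoid it, sketched here under the assumption that the tracking URI can be set globally with mlflow_set_tracking_uri() rather than through a client object, is to start the run without a client so that the active run ID is recorded. log_param_fluent is a hypothetical helper name, and the code has not been verified against a Databricks workspace.

```r
# Hypothetical helper: logs one parameter without passing a client.
log_param_fluent <- function(experiment_id, tracking_uri = "databricks") {
  # Set the tracking URI globally instead of building a client.
  mlflow::mlflow_set_tracking_uri(tracking_uri)

  # Without a client, mlflow_start_run() records the active run,
  # so with() and the logging call can find it.
  with(mlflow::mlflow_start_run(experiment_id = experiment_id), {
    mlflow::mlflow_log_param(key = "test", value = 1)
  })
}

# Usage (requires a configured Databricks connection):
# log_param_fluent("1709256526326232")
```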

Other info / logs

No response

What component(s) does this bug affect?

area/artifacts: Artifact stores and artifact logging
area/build: Build and test infrastructure for MLflow
area/docs: MLflow documentation pages
area/examples: Example code
area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
area/models: MLmodel format, model serialization/deserialization, flavors
area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
area/projects: MLproject format, project running backends
area/scoring: MLflow Model server, model deployment tools, Spark UDFs
area/server-infra: MLflow Tracking server backend
area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow’s components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support

What language(s) does this bug affect?

language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages

What integration(s) does this bug affect?

integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations

Issue Analytics

Created 10 months ago
Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction

nathaneastwood commented, Nov 29, 2022

Good question. It seems I had this set many months ago when I was first working with this code locally. Now that we have Databricks set up, I have come back to it and continued to use it, which is where I found the issue. I think we can close this now, thanks for your (very) quick help!

0 reactions

harupy commented, Nov 29, 2022

Where did you get this code?

client <- mlflow::mlflow_client(tracking_uri = "databricks")
experiment <- "1709256526326232"
run <- mlflow::mlflow_start_run(experiment_id = "1709256526326232", client = client)