[BUG] unable to pass artifacts between steps in multistep workflow
See original GitHub issue

Willingness to contribute
No. I cannot contribute a bug fix at this time.
MLflow version
1.27.0
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
- Python version: 3.10.4
- yarn version, if running the dev UI:
Describe the problem
I’m trying to run a multistep MLflow project. When a later step tries to use artifacts that an earlier step generated and stored in an S3 bucket, the files cannot be loaded because the file path arguments are passed to the downstream step wrapped in nested quotes. This causes the following error: OSError: [Errno 22] Invalid argument. I would expect a plain file path to be passed to the downstream step so the file can be read in and used.
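A minimal sketch of the failure mode (illustrative only, not the project's actual code — the path shape is modeled on the stack trace below): the downstream step receives the path with an extra pair of literal quote characters baked into the string, so open() looks for a file whose name itself begins and ends with a quote, which Windows rejects with Errno 22. Stripping the stray quotes recovers a usable path:

```python
# The argument as the downstream step receives it: note the literal
# single quotes embedded inside the string itself.
received = "'C:\\Temp\\tmp23zpl4w1\\param_1\\fully_preprocessed_train_data.pkl'"

# open(received) would look for a file literally named '...pkl'
# (quotes included). Removing the spurious surrounding quotes
# yields the real path:
clean = received.strip("'\"")
print(clean)  # C:\Temp\tmp23zpl4w1\param_1\fully_preprocessed_train_data.pkl
```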
Tracking information
No response
Code to reproduce issue
import argparse
import os  # needed for os.path.join below

import mlflow

parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
parser.add_argument("--fully_preprocessed_train_data", type=str, default="fully_preprocessed_train_data.pkl")
parser.add_argument("--fully_preprocessed_test_data", type=str, default="fully_preprocessed_test_data.pkl")
parser.add_argument("--fully_preprocessed_oot_data", type=str, default="fully_preprocessed_oot_data.pkl")
args = parser.parse_args()

# raw_data_filepath is defined elsewhere in the original script
preprocessing_params = {"raw_data": raw_data_filepath}
train_params = {}

def workflow():
    with mlflow.start_run(description="Running workflow") as run:
        print("Performing custom preprocessing")
        custom_preprocessing_run = mlflow.run(".", "custom_preprocess", parameters=preprocessing_params)
        custom_preprocessing_runid = mlflow.tracking.MlflowClient().get_run(custom_preprocessing_run.run_id)
        data_path_uri = custom_preprocessing_runid.info.artifact_uri
        print(f"Sourcing training data from {data_path_uri}")
        mlflow.set_tag(key="custom_preprocess_runid", value=custom_preprocessing_runid)
        train_params["preprocessed_train_data_path"] = os.path.join(data_path_uri, args.fully_preprocessed_train_data).replace("\\", "/")
        train_params["preprocessed_test_data_path"] = os.path.join(data_path_uri, args.fully_preprocessed_test_data).replace("\\", "/")
        train_params["preprocessed_oot_data_path"] = os.path.join(data_path_uri, args.fully_preprocessed_oot_data).replace("\\", "/")
        print("Launching training script")
        train_run = mlflow.run(".", "train", parameters=train_params)

if __name__ == "__main__":
    workflow()
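A side note on the reproduction above (a suggestion, not part of the original report): os.path.join uses the platform separator, which on Windows produces backslashes and forces the trailing .replace("\\", "/") calls. posixpath.join builds the forward-slash artifact URI directly. The bucket path below is a made-up stand-in for custom_preprocessing_runid.info.artifact_uri:

```python
import posixpath

# Hypothetical artifact URI standing in for the run's artifact_uri
data_path_uri = "s3://my-bucket/1/abc123/artifacts"

# posixpath.join always uses "/", regardless of the host OS,
# so no .replace("\\", "/") is needed afterwards.
path = posixpath.join(data_path_uri, "fully_preprocessed_train_data.pkl")
print(path)  # s3://my-bucket/1/abc123/artifacts/fully_preprocessed_train_data.pkl
```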
Stack trace
Launching training script
2022/08/23 11:58:40 INFO mlflow.utils.conda: Conda environment mlflow-f9436806216344de765e4a9ad154d6365504135b already exists.
2022/08/23 11:58:40 INFO mlflow.projects.utils: === Created directory C:\Users\NICOLE~1\AppData\Local\Temp\tmp23zpl4w1 for downloading remote URIs passed to arguments of type 'path' ===
2022/08/23 11:59:05 INFO mlflow.projects.backend.local: === Running command 'conda activate mlflow-f9436806216344de765e4a9ad154d6365504135b && python train.py --preprocessed_train_data_path 'C:\Users\NICOLE~1.FAR\AppData\Local\Temp\tmp23zpl4w1\param_1\fully_preprocessed_train_data.pkl' --preprocessed_test_data_path 'C:\Users\NICOLE~1\AppData\Local\Temp\tmp23zpl4w1\param_2\fully_preprocessed_test_data.pkl' --preprocessed_oot_data_path 'C:\Users\NICOLE~1\AppData\Local\Temp\tmp23zpl4w1\param_3\fully_preprocessed_oot_data.pkl' in run with ID '8931f30fbffe411696b49dd67d77596d' ===
Traceback (most recent call last):
File "C:\Users\nicole\Desktop\repos\data-science-projects\multistep_workflow\train.py", line 252, in <module>
train()
File "C:\Users\nicole\Desktop\repos\data-science-projects\multistep_workflow\train.py", line 80, in train
train_trans = pd.read_pickle(args.preprocessed_train_data_path)
File "C:\Users\nicole\Miniconda3\envs\mlflow-f9436806216344de765e4a9ad154d6365504135b\lib\site-packages\pandas\io\pickle.py", line 187, in read_pickle
with get_handle(
File "C:\Users\nicole\Miniconda3\envs\mlflow-f9436806216344de765e4a9ad154d6365504135b\lib\site-packages\pandas\io\common.py", line 798, in get_handle
handle = open(handle, ioargs.mode)
OSError: [Errno 22] Invalid argument: "'C:\\Users\\NICOLE~1\\AppData\\Local\\Temp\\tmp23zpl4w1\\param_1\\fully_preprocessed_train_data.pkl'"
Traceback (most recent call last):
File "C:\Users\nicole\Desktop\repos\data-science-projects\pti_multistep_workflow\main.py", line 98, in <module>
workflow()
File "C:\Users\nicole\Desktop\repos\data-science-projects\pti_multistep_workflow\main.py", line 93, in workflow
train_run = mlflow.run(".", "train", parameters=train_params)
File "C:\Users\nicole\Miniconda3\envs\mlflow-f9436806216344de765e4a9ad154d6365504135b\lib\site-packages\mlflow\projects\__init__.py", line 346, in run
_wait_for(submitted_run_obj)
File "C:\Users\nicole\Miniconda3\envs\mlflow-f9436806216344de765e4a9ad154d6365504135b\lib\site-packages\mlflow\projects\__init__.py", line 363, in _wait_for
raise ExecutionException("Run (ID '%s') failed" % run_id)
mlflow.exceptions.ExecutionException: Run (ID '8931f30fbffe411696b49dd67d77596d') failed
2022/08/23 11:59:32 ERROR mlflow.cli: === Run (ID '12d3ed0d7c3d47eb9a200c9f2ceff20c') failed ===
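One defensive workaround for the OSError above (a sketch under assumptions — the argument name is illustrative, and this is not a fix for the underlying MLflow behavior): have the receiving script sanitize its path arguments at parse time, so stray quotes never reach open():

```python
import argparse

def unquoted_path(value: str) -> str:
    # Defensively strip any literal quote characters wrapped
    # around the path by the calling process.
    return value.strip("'\"")

parser = argparse.ArgumentParser()
parser.add_argument("--preprocessed_train_data_path", type=unquoted_path)

# Simulate receiving a quoted path on the command line:
args = parser.parse_args(["--preprocessed_train_data_path", "'C:/tmp/param_1/data.pkl'"])
print(args.preprocessed_train_data_path)  # C:/tmp/param_1/data.pkl
```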
Other info / logs
No response
What component(s) does this bug affect?
- area/artifacts: Artifact stores and artifact logging
- area/build: Build and test infrastructure for MLflow
- area/docs: MLflow documentation pages
- area/examples: Example code
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/pipelines: Pipelines, Pipeline APIs, Pipeline configs, Pipeline Templates
- area/projects: MLproject format, project running backends
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/server-infra: MLflow Tracking server backend
- area/tracking: Tracking Service, tracking client APIs, autologging
What interface(s) does this bug affect?
- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/docker: Docker use across MLflow’s components, such as MLflow Projects and MLflow Models
- area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- area/windows: Windows support
What language(s) does this bug affect?
- language/r: R APIs and clients
- language/java: Java APIs and clients
- language/new: Proposals for new client languages
What integration(s) does this bug affect?
- integrations/azure: Azure and Azure ML integrations
- integrations/sagemaker: SageMaker integrations
- integrations/databricks: Databricks integrations
Issue Analytics
- Created a year ago
- Comments: 14 (11 by maintainers)
@nfarley-soaren The approach in this StackOverflow answer might be able to solve the issue. I’m testing it.
@harupy That worked! Thank you!