execute_pipeline migration (dagster 0.13.4)

Summary

The execute_pipeline method is not working on Dagster 0.13.2 when trying to run an op on a Databricks cluster.

Reproduction

Consider the following code sample, defining op “test_op”.

from dagster import op

# The resource definitions referenced below are defined or imported elsewhere
# in the reporter's project.
prod_adls2 = {
    "pyspark_step_launcher": databricks_pyspark_step_launcher,
    "pyspark": pyspark_resource,
    "adls2": adls2_resource,
    "io_manager": adls2_delta_io_manager,
    "adls_csv_loader": adls_csv_loader,
    "database": adls_delta_resource,
}

@op(required_resource_keys={"pyspark_step_launcher", "pyspark"})
def test_op(context):
    context.log.debug("Op started")

On previous Dagster versions (at least until 0.12.2), the way to execute this op (solid) was to wrap it in a pipeline and call execute_pipeline with reconstructable:

execute_pipeline(reconstructable(test_pipeline), mode="prod_adls2", run_config=run_config)

Nowadays (dagster 0.13.2), we can either define a job or a graph and then use the to_job method.

I tried the following approaches, which produced the corresponding errors.

@job(resource_defs=prod_adls2)
def test_job():
    test_op()

test_job.execute_in_process(run_config=config)

Error:

dagster.check.ParameterCheckError: Param "recon_pipeline" is not a ReconstructablePipeline. Got <dagster.core.definitions.pipeline_base.InMemoryPipeline object at 0x7f2f43fe10a0> which is type <class 'dagster.core.definitions.pipeline_base.InMemoryPipeline'>

reconstructable(test_job).execute_in_process(run_config=config) or execute_pipeline(reconstructable(test_job), run_config=config)

Error:

dagster.core.errors.DagsterInvariantViolationError: Reconstructable target was not a function returning a job definition, or a job definition produced by a decorated function. If your job was constructed using GraphDefinition.to_job, you must wrap the to_job call in a function at module scope, ie not within any other functions. To learn more, check out the docs on reconstructable: https://docs.dagster.io/_apidocs/execution#dagster.reconstructable
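The module-scope requirement in this message can be illustrated without Dagster. The sketch below is only an analogy, not Dagster's implementation: `CodePointer`, `pointer_for`, and `make_nested` are hypothetical names. It assumes a reconstructable target is recorded as a module name plus a function name, so that another process can import and call it again.

```python
import importlib
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CodePointer:
    """Hypothetical stand-in: locates a callable by module and name only."""

    module: str
    fn_name: str

    def load(self) -> Callable:
        # A fresh process would re-import the module and look the name up.
        return getattr(importlib.import_module(self.module), self.fn_name)


def pointer_for(fn: Callable) -> CodePointer:
    # A function defined inside another function has "<locals>" in its
    # qualified name, so a plain module attribute lookup cannot find it.
    if "<locals>" in fn.__qualname__:
        raise ValueError(f"{fn.__qualname__} is not defined at module scope")
    return CodePointer(fn.__module__, fn.__name__)


def make_nested() -> Callable:
    def nested():
        return "job"

    return nested


# A module-level function round-trips through its pointer.
ptr = pointer_for(json.dumps)
same_function = ptr.load() is json.dumps

# A nested function is rejected because it cannot be re-imported by name.
try:
    pointer_for(make_nested())
    nested_ok = True
except ValueError:
    nested_ok = False
```

Under that assumption, a job built inside another function, or an already-built definition with no importable factory, gives the second process nothing to look up by name.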

reconstructable(make_test_job).execute_in_process(run_config=config)

Error:

AttributeError: 'ReconstructablePipeline' object has no attribute 'execute_in_process'

execute_pipeline(reconstructable(make_test_job), run_config=config)

Error:

dagster.core.errors.DagsterUnmetExecutorRequirementsError: You have attempted to use an executor that uses multiple processes with an ephemeral DagsterInstance. A non-ephemeral instance is needed to coordinate execution between multiple processes. You can configure your default instance via $DAGSTER_HOME or ensure a valid one is passed when invoking the python APIs. You can learn more about setting up a persistent DagsterInstance from the DagsterInstance docs here: https://docs.dagster.io/deployment/dagster-instance#default-local-behavior
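This message points at $DAGSTER_HOME as one way to get a persistent instance. The sketch below shows that step using only the standard library. `ensure_dagster_home` is a hypothetical helper, and the temporary directory is for illustration only: a real setup would use a stable directory. The commented-out Dagster call assumes `make_test_job` and `config` from the attempts above. It addresses only this instance error; per the maintainer replies in this thread, the reported case itself was a bug fixed in a later release.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def ensure_dagster_home(path: Optional[str] = None) -> Path:
    """Create the directory if needed and export it as DAGSTER_HOME.

    Hypothetical helper. With no path, a throwaway temporary directory is
    used, which is only suitable for experiments.
    """
    home = Path(path) if path else Path(tempfile.mkdtemp(prefix="dagster_home_"))
    home = home.resolve()  # DAGSTER_HOME should be an absolute path
    home.mkdir(parents=True, exist_ok=True)
    os.environ["DAGSTER_HOME"] = str(home)
    return home


dagster_home = ensure_dagster_home()

# With Dagster installed, a persistent instance can then be passed explicitly
# (assumes make_test_job and config are defined as in the attempts above):
#
# from dagster import DagsterInstance, execute_pipeline, reconstructable
# execute_pipeline(
#     reconstructable(make_test_job),
#     run_config=config,
#     instance=DagsterInstance.get(),
# )
```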

Am I missing the point or was execute_pipeline “mismigrated”?

Thanks!

Dagit UI/UX

Oddly enough, dagit does manage to run the job.

Environment

Python 3.9.5
dagster 0.13.2

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
dpeng817 commented, Jan 3, 2022

@bernardocortez yes, this has gone out!

1 reaction
dpeng817 commented, Nov 16, 2021

@bernardocortez thanks so much for posting the example. After further examination, this case is indeed a bug. Was able to put up this fix, which should go in by next release. Thanks again for surfacing!
