Nesting @graphs does not properly namespace op names
Summary
When composing graphs, the names of the ops contained in those graphs are not properly namespaced. This is particularly noticeable when using the factory function pattern, which creates ops with the same names.
Consider a factory function make_table_loader(table_name, op_name="default_name") that one would like to reuse in many graphs.
Calling this factory multiple times within the same graph without passing a new op_name should fail (and it does) due to a naming conflict at the same level.
However, when calling it in different graphs, one should be able to reuse the name, even if the created op is different, because each graph should act as its own namespace. There should not be any naming conflict, but on master this fails.
This behavior is inconsistent with the documentation (see https://github.com/dagster-io/dagster/issues/8013), but I also believe it to be a bug left over from the conversion from pipelines to graphs. It also looks inconsistent with how get_output_for_handle
is supposed to work here: https://docs.dagster.io/guides/dagster/graph_job_op#a-simple-composite-solid
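To make the expected scoping concrete, here is a toy sketch in plain Python (not Dagster's actual implementation; the `Graph` class and `add_op` method are illustrative only) of name resolution where the enclosing graph name acts as a namespace: an op name must be unique within its graph, but the same name can be reused across different graphs.

```python
# Toy model of the expected namespacing: conflicts are scoped to the
# enclosing graph, not to the whole repository.

class Graph:
    def __init__(self, name):
        self.name = name
        self._ops = {}

    def add_op(self, op_name):
        # A conflict should only arise within this graph's namespace.
        if op_name in self._ops:
            raise ValueError(
                f"op {op_name!r} already defined in graph {self.name!r}"
            )
        # The fully qualified handle is namespaced by the graph name.
        handle = f"{self.name}.{op_name}"
        self._ops[op_name] = handle
        return handle


g1 = Graph("wrapped_simple")
g2 = Graph("wrapped_simple_extra")

# Same op name in two different graphs: no conflict, distinct handles.
assert g1.add_op("simple") == "wrapped_simple.simple"
assert g2.add_op("simple") == "wrapped_simple_extra.simple"

# Same op name twice in the same graph: a genuine conflict.
try:
    g1.add_op("simple")
    raise AssertionError("expected a naming conflict within one graph")
except ValueError:
    pass
```

Under this model, the reproduction below should pass: `wrapped_simple` and `wrapped_simple_extra` each contain an op named "simple", addressed as distinct handles in the run config.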
Reproduction
```python
from dagster import GraphOut, Permissive, graph, op


def test_factory_composition_bug():
    # make sure that namespaced conflicts DO work
    def make_simple(name):
        @op(config_schema={"payload": Permissive()}, name=name)
        def fn(context):
            return context.op_config["payload"]

        return fn

    @graph()
    def wrapped_simple():
        return make_simple("simple")()

    expected_payload = {"key": ["hello", "there"]}
    result = wrapped_simple.execute_in_process(
        run_config={"ops": {"simple": {"config": {"payload": expected_payload}}}}
    )
    # default execution works
    assert result.success
    assert result.output_value() == expected_payload

    # what if we use a factory function twice, but with the same name?
    @graph()
    def wrapped_simple_extra():
        return make_simple("simple")()

    @graph(out={"val1": GraphOut(), "val2": GraphOut()})
    def wrap_all_simples():
        return {"val1": wrapped_simple(), "val2": wrapped_simple_extra()}

    # how about nested?
    result2 = wrap_all_simples.execute_in_process(
        run_config={
            "ops": {
                "wrapped_simple": {
                    "ops": {"simple": {"config": {"payload": expected_payload}}}
                },
                "wrapped_simple_extra": {
                    "ops": {"simple": {"config": {"payload": expected_payload}}}
                },
            }
        }
    )
    assert result2.success
    assert result2.output_value("val1") == expected_payload
    assert result2.output_value("val2") == expected_payload
```
Top GitHub Comments
Appreciate the follow-up discussion, definitely think this is something we should revisit and improve.
I initially flagged the documentation bug since it didn’t match what I saw in Dagster, but for normal usage I’d expect namespacing within the DAG/Graph rather than within the entire Repository.
Use cases and ways of working differ: some users collaborate within a shared Repository and try to share things, but others do not, and then someone’s DAG/Graph with an Op called “abc” collides with someone else’s DAG/Graph containing an Op of the same name. Depending on who was first, the other user gets a failure, which I’d expect to confuse at least some users. Because of this, these users are also hindered in their autonomous operation: they now need to be aware of what other people are doing to avoid collisions. And naming things is already hard enough; having to also consider the scope outside the DAG/Graph makes it harder still.
[edit] Wanted to add that I do see that Ops (definitions) are something different from what I’m used to coming from Airflow Tasks, and there are advantages to them being an actual entity that exists and is stored within Dagster, allowing things like easily seeing in which Graphs/Assets an Op is used. I’m not sure it’s worth it, though, compared to the issues I mentioned above.