Proposal PIN: Remove ConstantTasks
Store Constant Values on Flow
Status
Proposed
Context
When a dependency is created between a task and any non-task value, Prefect automatically wraps the upstream value in a ConstantTask and creates an appropriate edge. This “works” in the sense that it properly models a relationship between a computational node (the constant value) and a task that consumes it, but it adds noise to flow graphs that confuses users who do not expect it. Furthermore, it introduces an edge case: if the constant task fails (for any reason, including a node crashing), the result is a difficult-to-diagnose flow failure. In addition, it creates difficulties with triggers, as the ConstantTask always succeeds.
Simply put: ConstantTasks are confusing and do not add information.
Proposal
ConstantTasks will no longer be automatically created (though they can stay in the task library, as they may be useful for particularly large objects that users only want to reference once in their flow).
Instead, if a task is attached to any constant (non-Task) value, that value is added to a new flow attribute, flow.constants. This attribute is a dictionary of type Dict[Task, Dict[str, Any]] that maps each task to a collection of input argument values. For example, if task t is called in the functional API as t(x=1), then flow.constants would be updated to {t: {"x": 1}}. An additional “validation” step would ensure that the keys of flow.constants are always a subset of flow.tasks.
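The storage and validation described above can be sketched as follows. This is a minimal illustration, not Prefect's implementation: the Task and Flow classes are toy stand-ins, and bind_constant and validate_constants are hypothetical helper names introduced only for this example.

```python
from collections import defaultdict
from typing import Any, Dict, Set


class Task:
    """Hypothetical stand-in for a task; hashable by object identity."""

    def __init__(self, name: str) -> None:
        self.name = name


class Flow:
    """Toy flow holding tasks plus per-task constant inputs."""

    def __init__(self) -> None:
        self.tasks: Set[Task] = set()
        # Maps each task to {argument name: constant value}.
        self.constants: Dict[Task, Dict[str, Any]] = defaultdict(dict)

    def bind_constant(self, task: Task, key: str, value: Any) -> None:
        # Hypothetical helper: record a non-Task input on the flow
        # instead of wrapping it in a separate constant task.
        self.tasks.add(task)
        self.constants[task][key] = value

    def validate_constants(self) -> None:
        # Every key of `constants` must be a task that belongs to the flow.
        unknown = set(self.constants) - self.tasks
        if unknown:
            names = sorted(t.name for t in unknown)
            raise ValueError(f"constants reference tasks not in the flow: {names}")


flow = Flow()
t = Task("t")
flow.bind_constant(t, "x", 1)  # analogous to calling t(x=1)
flow.validate_constants()      # passes: t is a member of flow.tasks
```

Keeping the constants in a plain dictionary on the flow means no extra nodes or edges appear in the graph, and the validation step only has to compare two sets of tasks.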
In addition, the TaskRunner.run() method will accept a constant_inputs kwarg, and FlowRunner.run() will read and provide the appropriate values for each task. The TaskRunner is responsible for combining those with the upstream state values in order to provide all input arguments for the task itself.
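One way the combination step might look is sketched below. The State class and the run_task function are hypothetical simplifications, not Prefect's TaskRunner. Raising an error when the same argument has both a constant and an upstream task is an assumption of this sketch; the proposal does not specify that behavior.

```python
from typing import Any, Callable, Dict, Optional


class State:
    """Hypothetical upstream state carrying a result value."""

    def __init__(self, result: Any) -> None:
        self.result = result


def run_task(
    fn: Callable[..., Any],
    upstream_states: Dict[str, State],
    constant_inputs: Optional[Dict[str, Any]] = None,
) -> Any:
    # Start from the constants, then add the results of upstream tasks.
    inputs: Dict[str, Any] = dict(constant_inputs or {})
    for key, state in upstream_states.items():
        if key in inputs:
            # Assumed policy: an argument cannot come from both sources.
            raise ValueError(
                f"argument {key!r} has both a constant and an upstream task"
            )
        inputs[key] = state.result
    return fn(**inputs)


def add(x: int, y: int) -> int:
    return x + y


# x is supplied as a constant, y as the result of an upstream task.
total = run_task(add, upstream_states={"y": State(2)}, constant_inputs={"x": 1})
```

In this sketch the flow-level runner would look up the constants for each task and pass them as constant_inputs, so the task itself never needs a reference to the flow.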
Consequences
ConstantTasks will no longer be created automatically (though they can remain in existence in case users find them useful for encapsulating some piece of information). Constants that are used in a flow will be stored in a nested dictionary attached to the flow itself.
Expect, no.
Anticipate, maybe.
We’ve very successfully adhered to the “flowless task” model thus far and I don’t want to abandon it quite yet!
I’ve updated the PIN proposal to incorporate this discussion