[KED-1224] Connect Parameters and the Data Catalog
Description
I have a list of annotation names that I’d like to be able to pass in to a DataSet constructor, as well as to particular pipeline nodes. There could easily be other cases where a node needs to know what parameter(s) a dataset was loaded with. You can always just duplicate the list in both yaml files, but it’s better to have the parameter specified in only one place, especially if it’s a parameter you want to play around with.
For a more motivating use case, consider an mp4_file dataset where I have a frame rate I’d like to load the dataset with. So frame_rate is one of the dataset’s arguments, and nodes might need to access the frame_rate.
Possible Implementations
- Is it possible to use parameters within the catalog.yml file in the same manner as the credentials.yml file? E.g.:
    my_dataset.mp4:
      filepath: /some/path.mp4
      parameters: frame_rate
      ...
This would certainly get around the issue, and you can just inject the frame_rate parameter into the catalog entry by name.
- Some extra pipeline node syntax, like the one already used for parameters, a la "my_dataset.mp4:frame_rate", to connect the frame_rate parameter of the catalog entry my_dataset.mp4 to a node.
- The answer I don’t want is "you could just return a dict of metadata with your loaded dataset object". It’s not too pretty, and I don’t want to write every node to accept dictionary objects or tuples, which makes it awkward to treat them as functions later on.
edit: formatting
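The first proposal above (injecting a parameter into a catalog entry by name) could be sketched roughly as follows. This is not Kedro code: `inject_parameters` is a hypothetical helper, the dictionaries stand in for the parsed catalog.yml and parameters.yml files, and the value 30 is made up for illustration.

```python
from typing import Any, Dict


def inject_parameters(
    catalog: Dict[str, Dict[str, Any]], parameters: Dict[str, Any]
) -> Dict[str, Dict[str, Any]]:
    """Return a copy of the catalog in which each entry's `parameters` key
    is replaced by the named parameter values (hypothetical behaviour)."""
    resolved = {}
    for name, entry in catalog.items():
        entry = dict(entry)  # shallow copy so the input catalog is not mutated
        names = entry.pop("parameters", None)
        if names is not None:
            # Accept either a single name or a list of names.
            if isinstance(names, str):
                names = [names]
            for param in names:
                entry[param] = parameters[param]
        resolved[name] = entry
    return resolved


# Stand-ins for the parsed YAML files; 30 is an assumed example value.
catalog = {
    "my_dataset.mp4": {"filepath": "/some/path.mp4", "parameters": "frame_rate"}
}
parameters = {"frame_rate": 30}

resolved = inject_parameters(catalog, parameters)
print(resolved)
```

With this shape, frame_rate lives only in the parameters file, and both the dataset constructor and any node that takes the parameter see the same value.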
Issue Analytics
- Created: 4 years ago
- Reactions: 1
- Comments: 6 (5 by maintainers)
Top GitHub Comments
The updated link was broken again; here is one that worked for me today: TemplatedConfigLoader 😃
I’m closing this issue. The suggested solution would be to utilise TemplatedConfigLoader as described above.
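For readers unfamiliar with the templating approach, the idea is that both the catalog and the parameters files refer to a shared value through a `${...}` placeholder, so the value is defined once. The snippet below is a standard-library sketch of that idea only. It does not call Kedro, `fill_template` is a hypothetical name, and the exact TemplatedConfigLoader behaviour and constructor arguments depend on the Kedro version, so check the linked documentation before relying on it.

```python
import re
from typing import Any, Dict

# Matches placeholders of the form ${name}.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def fill_template(node: Any, values: Dict[str, Any]) -> Any:
    """Recursively replace ${name} placeholders in a nested config structure."""
    if isinstance(node, dict):
        return {key: fill_template(val, values) for key, val in node.items()}
    if isinstance(node, list):
        return [fill_template(item, values) for item in node]
    if isinstance(node, str):
        whole = _PLACEHOLDER.fullmatch(node)
        if whole:
            # The string is exactly one placeholder: keep the value's own type.
            return values[whole.group(1)]
        # Otherwise interpolate each placeholder into the surrounding text.
        return _PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), node)
    return node


# Assumed shared values (in a real project these would live in one config file).
shared = {"frame_rate": 30}

# Stand-ins for the parsed catalog.yml and parameters.yml.
catalog_template = {
    "my_dataset.mp4": {"filepath": "/some/path.mp4", "frame_rate": "${frame_rate}"}
}
parameters_template = {"frame_rate": "${frame_rate}"}

catalog = fill_template(catalog_template, shared)
parameters = fill_template(parameters_template, shared)
print(catalog, parameters)
```

Here the dataset entry and the node-facing parameter both resolve to the same frame_rate, which is the single-source-of-truth behaviour the issue asks for.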