Allow different stages of a pipeline to run on differently-sized containers
See original GitHub issue

In the pipeline in the covid-notebooks project (https://github.com/CODAIT/covid-notebooks/blob/master/pipelines/us_data.pipeline), one of the notebooks (fit_us_data.ipynb) takes considerably longer than the rest because it performs 10,000 curve-fitting operations serially, one after the other. I would like to make this notebook run faster by performing the curve-fitting operations in parallel.
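For context, a minimal sketch of how independent curve fits can be parallelized across cores with Python's `multiprocessing`. The model, parameters, and synthetic data below are hypothetical stand-ins for what the notebook actually does, not its real code:

```python
# Sketch only: `logistic`, its parameters, and the synthetic series are
# hypothetical stand-ins for the notebook's real model and data.
from multiprocessing import Pool

import numpy as np
from scipy.optimize import curve_fit


def logistic(t, k, t0, L):
    """Simple logistic curve, a common choice for cumulative case counts."""
    return L / (1.0 + np.exp(-k * (t - t0)))


def fit_one(series):
    """Fit one time series; returns the estimated (k, t0, L) parameters."""
    t = np.arange(len(series))
    params, _ = curve_fit(
        logistic, t, series,
        p0=[0.1, len(series) / 2, series.max()],
        maxfev=10_000,
    )
    return params


if __name__ == "__main__":
    # Stand-in for the ~10,000 per-region time series the notebook fits.
    rng = np.random.default_rng(0)
    t = np.arange(60)
    all_series = [
        logistic(t, 0.2, 30, L) + rng.normal(0, L * 0.01, t.size)
        for L in rng.uniform(1_000, 50_000, size=100)
    ]

    # Each fit is independent, so throughput scales with available cores.
    with Pool() as pool:
        results = pool.map(fit_one, all_series)
    print(f"fitted {len(results)} series")
```

This is exactly the kind of workload that benefits from a container with more CPU cores, which motivates the request below.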
This change will require having the fit_us_data.ipynb notebook run on a larger container with more cores. I would like the remaining notebooks in the pipeline to continue running on smaller containers.
Currently there is no way to specify in the pipeline editor UI that different stages of the pipeline should execute on containers with different amounts of CPU and/or memory. Containers inherit the default container size configured for the KFP cluster.
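For reference, the underlying Kubeflow Pipelines (v1) SDK can already express per-step resources in Python; a per-node option in the pipeline editor would ultimately need to emit settings along these lines. The sketch below is illustrative only: the step names, image, notebook names other than fit_us_data.ipynb, and resource values are placeholders, not what Elyra actually generates.

```python
# Hypothetical KFP v1 pipeline with per-step resource requests.
from kfp import compiler, dsl


@dsl.pipeline(name="us-data", description="Per-step container sizing sketch")
def us_data_pipeline():
    prep = dsl.ContainerOp(
        name="prepare_data",
        image="example/notebook-runner:latest",  # placeholder image
        command=["papermill", "upstream_notebook.ipynb", "output.ipynb"],
    )
    # The preparation step keeps the cluster's default container size.

    fit = dsl.ContainerOp(
        name="fit_us_data",
        image="example/notebook-runner:latest",  # placeholder image
        command=["papermill", "fit_us_data.ipynb", "output.ipynb"],
    )
    fit.after(prep)

    # Only the curve-fitting step gets a larger container.
    fit.container.set_cpu_request("8").set_cpu_limit("16")
    fit.container.set_memory_request("16G").set_memory_limit("32G")
    fit.container.set_gpu_limit("1")  # only for steps that actually need a GPU


if __name__ == "__main__":
    compiler.Compiler().compile(us_data_pipeline, "us_data.yaml")
```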
Issue Analytics
- Created 3 years ago
- Comments: 5 (4 by maintainers)
Top GitHub Comments
Scratch that^^. Let's stick with our original plan to plumb 3 new fields: CPU, Memory, and GPU.
@marthacryan - we will probably need two fields, one for the container size (sm, med, lg) and one for the number of GPUs, since not all workloads will require GPUs and it doesn't make sense to include GPU resources in with container sizing.
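As a rough illustration of the size-preset design discussed above (a container-size field plus a separate GPU count), a hypothetical helper might look like the following; the preset names and resource values are made up for the sketch.

```python
# Hypothetical mapping from a size preset + GPU count to KFP resource settings.
CONTAINER_SIZES = {
    "sm": {"cpu": "1", "memory": "2G"},
    "med": {"cpu": "4", "memory": "8G"},
    "lg": {"cpu": "16", "memory": "32G"},
}


def apply_node_resources(op, size="sm", gpus=0):
    """Apply a container-size preset and optional GPU count to a KFP ContainerOp."""
    preset = CONTAINER_SIZES[size]
    op.container.set_cpu_request(preset["cpu"])
    op.container.set_memory_request(preset["memory"])
    if gpus:
        # GPU count stays a separate field: most steps never need a GPU,
        # so it is not bundled into container sizing.
        op.container.set_gpu_limit(str(gpus))
    return op


# Usage sketch: fit = apply_node_resources(fit, size="lg", gpus=1)
```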