Wrong input artifact path in YAML compiled by Python SDK
/kind bug
What steps did you take and what happened:
The input artifact path is wrong when a task's input artifacts come from two different parent tasks.
What did you expect to happen:
The input artifact paths should be in the correct format and be accessible.
Additional information:
get-subdirectory is the common parent task of list-items and list-items-2. I expected the input artifact path for the get-subdirectory output to be the same in both downstream tasks, but the paths are different.
path of list-items: $(workspaces.list-items.path)/get-subdirectory-Subdir
path of list-items-2: /tmp/inputs/Directory1/data
As a result, list-items-2 cannot read the input artifacts from its path. The error log is:
ls: /tmp/inputs/Directory1/data: No such file or directory
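To spot this kind of mismatch quickly, the compiled YAML can be scanned as plain text for the two path styles reported above. The following is only a rough diagnostic sketch: the function name `find_input_paths` and the regular expressions are illustrative assumptions, not part of the kfp-tekton SDK, and it makes no assumption about the structure of the compiled file.

```python
import re

# The two path styles seen in this report (patterns are illustrative assumptions).
WORKSPACE_PATH = re.compile(r"\$\(workspaces\.[\w-]+\.path\)/[\w./-]+")
TMP_INPUT_PATH = re.compile(r"/tmp/inputs/[\w./-]+")


def find_input_paths(compiled_yaml: str) -> dict:
    """Collect workspace-style and /tmp/inputs-style paths found in the text."""
    return {
        "workspace": sorted(set(WORKSPACE_PATH.findall(compiled_yaml))),
        "tmp_inputs": sorted(set(TMP_INPUT_PATH.findall(compiled_yaml))),
    }


if __name__ == "__main__":
    # Hypothetical excerpt; in practice, pass the text of the compiled pipeline YAML.
    sample = (
        '- "$(workspaces.list-items.path)/get-subdirectory-Subdir"\n'
        '- "/tmp/inputs/Directory1/data"\n'
    )
    for style, paths in find_input_paths(sample).items():
        print(style, paths)
```

If the same upstream output shows up under both styles, that points at the inconsistency described in this issue.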
Here is the code that generates the pipeline:
import os

import kfp
from kfp.components import OutputPath, create_component_from_func

filesystem_component_root = 'components/filesystem'
get_subdir_op = kfp.components.load_component_from_file(os.path.join(filesystem_component_root, 'get_subdirectory/component.yaml'))
list_item_op_1 = kfp.components.load_component_from_file(os.path.join(filesystem_component_root, 'list_items/component.yaml'))
list_item_op_2 = kfp.components.load_component_from_file('jizg/component.yaml')

def test_pipeline():
    @create_component_from_func
    def produce_dir_with_files_python_op(output_dir_path: OutputPath(), num_files: int = 10):
        import os
        os.makedirs(os.path.join(output_dir_path, 'subdir'), exist_ok=True)
        for i in range(num_files):
            file_path = os.path.join(output_dir_path, 'subdir', str(i) + '.txt')
            with open(file_path, 'w') as f:
                f.write(str(i))

    produce_dir_python_task = produce_dir_with_files_python_op(num_files=15)

    get_subdir_task = get_subdir_op(
        # Component input names are converted to pythonic parameter names
        directory=produce_dir_python_task.output,
        subpath="subdir"
    )

    list_items_task_1 = list_item_op_1(
        # Component input names are converted to pythonic parameter names
        directory=get_subdir_task.output
    )

    list_items_task_2 = list_item_op_2(
        # Inputs "Directory1" and "Directory2" become "directory1" and "directory2"
        directory1=get_subdir_task.output,
        directory2=produce_dir_python_task.output
    )
jizg/component.yaml:

name: List items 2
description: Recursively list directory contents.
inputs:
- {name: Directory1, type: Directory}
- {name: Directory2, type: Directory}
outputs:
- {name: Items}
implementation:
  container:
    image: alpine
    command:
    - sh
    - -ex
    - -c
    - |
      mkdir -p "$(dirname "$2")"
      ls -A -R "$0" > "$2"
      ls -A -R "$1" >> "$2"
    - inputPath: Directory1
    - inputPath: Directory2
    - outputPath: Items
https://github.com/kubeflow/kfp-tekton/blob/master/components/filesystem/get_subdirectory/component.yaml
https://github.com/kubeflow/kfp-tekton/blob/master/components/filesystem/list_items/component.yaml
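For anyone reproducing this outside the cluster, the command in jizg/component.yaml can be run locally to see how the arguments map onto the script: with `sh -c`, the first argument after the script becomes `$0`, so Directory1 is `$0`, Directory2 is `$1`, and Items is `$2`. The sketch below assumes a POSIX `sh` and `ls` are available; the helper names `run_list_items_2` and `demo` and the temporary paths are made up for illustration.

```python
import os
import subprocess
import tempfile

# Script body copied from the command in jizg/component.yaml.
SCRIPT = """
mkdir -p "$(dirname "$2")"
ls -A -R "$0" > "$2"
ls -A -R "$1" >> "$2"
"""


def run_list_items_2(dir1: str, dir2: str, items_path: str) -> str:
    """Run the script locally: dir1 -> $0, dir2 -> $1, items_path -> $2."""
    subprocess.run(["sh", "-ex", "-c", SCRIPT, dir1, dir2, items_path], check=True)
    with open(items_path) as f:
        return f.read()


def demo() -> str:
    """List two temporary directories, each holding one file."""
    with tempfile.TemporaryDirectory() as root:
        dir1 = os.path.join(root, "Directory1", "data")
        dir2 = os.path.join(root, "Directory2", "data")
        for directory, name in ((dir1, "0.txt"), (dir2, "1.txt")):
            os.makedirs(directory)
            with open(os.path.join(directory, name), "w") as f:
                f.write(name)
        items_path = os.path.join(root, "outputs", "Items", "data")
        return run_list_items_2(dir1, dir2, items_path)


if __name__ == "__main__":
    print(demo())
```

Because the script runs with `-e`, a missing input directory makes the first `ls` fail and the whole step exit with a non-zero status, which matches the "No such file or directory" failure seen in list-items-2.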
Environment:
- Python Version (use python --version): 3.7.8
- SDK Version: 0.3.0
- Tekton Version (use tkn version): not installed
- Kubernetes Version (use kubectl version): 1.16.15
- OS (e.g. from /etc/os-release): Ubuntu 18.04
Issue Analytics
- Created 3 years ago
- Comments: 5 (4 by maintainers)
Top GitHub Comments
This is a bug in big data passing, as Tommy mentioned; it will be fixed.