Python func component renames function arguments
What steps did you take:
Converting a Python function to a component:
import kfp

def hydrate_schema(
    synced_local_path: kfp.components.InputPath(str), data_schema: str
) -> str:
    import re

    if not synced_local_path.endswith("/"):
        synced_local_path += "/"
    return re.sub(r"s3:.+\/", synced_local_path, data_schema)

hydrate_schema_op = kfp.components.create_component_from_func(hydrate_schema, output_component_file="replace_schema.yaml")
In the generated component YAML:
name: Hydrate schema
inputs:
- {name: synced_local, type: String}
- {name: data_schema, type: String}
outputs:
- {name: Output, type: String}
implementation:
  container:
    image: python:3.7
    command:
    - sh
    - -ec
    - |
      program_path=$(mktemp)
      echo -n "$0" > "$program_path"
      python3 -u "$program_path" "$@"
    - |
      def hydrate_schema(
          synced_local_path, data_schema
      ):
          import re
          if not synced_local_path.endswith("/"):
              synced_local_path += "/"
          return re.sub(r"s3:.+\/", synced_local_path, data_schema)
      def _serialize_str(str_value: str) -> str:
          if not isinstance(str_value, str):
              raise TypeError('Value "{}" has type "{}" instead of str.'.format(str(str_value), str(type(str_value))))
          return str_value
      import argparse
      _parser = argparse.ArgumentParser(prog='Hydrate schema', description='')
      _parser.add_argument("--synced-local", dest="synced_local_path", type=str, required=True, default=argparse.SUPPRESS)
      _parser.add_argument("--data-schema", dest="data_schema", type=str, required=True, default=argparse.SUPPRESS)
      _parser.add_argument("----output-paths", dest="_output_paths", type=str, nargs=1)
      _parsed_args = vars(_parser.parse_args())
      _output_files = _parsed_args.pop("_output_paths", [])
      _outputs = hydrate_schema(**_parsed_args)
      _outputs = [_outputs]
      _output_serializers = [
          _serialize_str,
      ]
      import os
      for idx, output_file in enumerate(_output_files):
          try:
              os.makedirs(os.path.dirname(output_file))
          except OSError:
              pass
          with open(output_file, 'w') as f:
              f.write(_output_serializers[idx](_outputs[idx]))
    args:
    - --synced-local
    - {inputPath: synced_local}
    - --data-schema
    - {inputValue: data_schema}
    - '----output-paths'
    - {outputPath: Output}
The argument synced_local_path is replaced with synced_local.
Therefore, when the component is used as
    local_data_schema = hydrate_schema_op(
        data_schema=data_schema, synced_local_path=data_sync.output
    )
the compiler complains:
    TypeError: Hydrate schema() got an unexpected keyword argument 'synced_local_path'
What happened:
The Python function argument synced_local_path is renamed to synced_local.
What did you expect to happen:
The argument names are preserved after conversion.
Environment:
How did you deploy Kubeflow Pipelines (KFP)?
Local cluster deployment using Kind
KFP version: 1.2.0
KFP SDK version: 1.2.0
Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]
/kind bug
Top GitHub Comments
There seems to be an error in your function definition. I’m not fully sure what behavior you intended, but given your component function code, you probably should not be using the InputPath annotation. Just use def hydrate_schema(synced_local_path: str instead. Or, maybe even better, call it synced_uri, since it seems to be a URI, not a local path.
Can you try this solution and tell us whether it fixes your problem?
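A minimal sketch of that suggestion, assuming the intent is to receive the synced location as a plain string value rather than as a file. The function body is the one from the issue with only the annotation changed; the sample arguments in the last line are made up for illustration.

```python
def hydrate_schema(synced_local_path: str, data_schema: str) -> str:
    # Imports stay inside the function, as in the original, because the
    # function source is copied on its own into the generated component.
    import re

    if not synced_local_path.endswith("/"):
        synced_local_path += "/"
    # Greedy match: everything from "s3:" up to the last "/" is replaced.
    return re.sub(r"s3:.+\/", synced_local_path, data_schema)


# Illustrative values only (not taken from the original report).
print(hydrate_schema("/mnt/data", "s3://bucket/prefix/schema.json"))
```

With a plain str annotation there is no _path suffix stripping, so the keyword synced_local_path used at the call site in the issue should match the generated input name.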
P.S. If you tried to call your component, you’d have noticed that synced_local_path does not really receive what you expected. It would have been a real local path (/tmp/inputs/synced_local/data) containing whatever data the component received from upstream (probably a URI). This is what InputPath does.
Long explanation:
The behavior is as follows. When using create_component_from_func/func_to_container_op, if a function parameter uses the InputPath or OutputPath annotation and the parameter name ends with _path or _file, that suffix is stripped when generating the input/output name.
Let me try to explain why this design was chosen.
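Before the rationale, here is a small sketch of the naming rule just described. The helper derive_io_name is hypothetical and written only for illustration; it is not the SDK's actual implementation, and the real code may handle more cases.

```python
def derive_io_name(param_name: str, is_path_annotated: bool) -> str:
    """Hypothetical helper mirroring the described rule, not KFP's real code."""
    if is_path_annotated:
        # Only InputPath/OutputPath parameters lose a trailing _path or _file.
        for suffix in ("_path", "_file"):
            if param_name.endswith(suffix):
                return param_name[: -len(suffix)]
    return param_name


print(derive_io_name("synced_local_path", True))   # synced_local
print(derive_io_name("synced_local_path", False))  # synced_local_path
print(derive_io_name("data_schema", False))        # data_schema
```

This matches what the issue reports: the InputPath-annotated parameter synced_local_path surfaces as the input synced_local, while data_schema is left unchanged.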
When you use create_component_from_func, there are two separate architecture layers: the component layer and the Python function layer. At the pipeline level, the author passes artifacts between components. The pipeline author does not manually pass URIs or local paths; instead, they just connect outputs to inputs. However, your function has a slightly different interface and gets some data from local files, a concept that does not exist for pipeline authors. create_component_from_func generates glue command-line program code to bridge those layers. Annotations like InputPath and OutputPath influence the way that bridge is constructed.
InputPath means “write the passed artifact contents to a local file and give me the path to that file instead of the content itself”. When the component receives a “Dataset” (a big text file in CSV format), your function receives a “Dataset path” (a small string with a local path). These are very different kinds of objects, so it’s natural that the names are different.
This difference becomes especially apparent if you consider numbers. Notice how the function expects a string path, but the component input has type Integer.
Function:
Component:
Pipeline:
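Observe the flow:
- The pipeline passes 42 to the input Number using number=42
- The value is written to a local file (/tmp/outputs/Number/data)
- The generated command-line program receives --number-path /tmp/outputs/Number/data
- The function receives number_path="/tmp/outputs/Number/data"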
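The flow above can be imitated in plain Python. This is only a sketch under stated assumptions: consume_number and run_component are hypothetical names, and a temporary directory stands in for the container's input file, so it does not reproduce the code that create_component_from_func generates.

```python
import os
import tempfile


def consume_number(number_path: str) -> int:
    """Function layer: receives a local path and reads the value from the file."""
    with open(number_path) as f:
        return int(f.read())


def run_component(number: int) -> int:
    """Hypothetical glue layer: takes the value, hands the function a path."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        number_path = os.path.join(tmp_dir, "data")
        # The passed value is written to a local file ...
        with open(number_path, "w") as f:
            f.write(str(number))
        # ... and the function only ever sees the path to that file.
        return consume_number(number_path=number_path)


print(run_component(number=42))
```

The caller works with number (an integer), while the function works with number_path (a string), which is why the two layers use different names.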
If create_component_from_func did not strip _path when naming the inputs, this would look wrong and weird to the pipeline author: number_path=42 looks wrong since 42 is not a valid path - it’s an integer.
This issue has been automatically closed because it has not had recent activity. Please comment “/reopen” to reopen it.