[backend] How to specify python files to include using create_component_from_func


I have a deployment like the following:

root_dir
├── root.py
├── same.yaml
├── same_step_0.py
├── same_step_1.py
├── same_step_2.py
└── utils
    ├── __init__.py
    ├── file_one.py
    └── file_two.py

The root.py file is the standard Kubeflow root file, and each step file contains a function that is wrapped with create_component_from_func, like this:

    from kfp.components import create_component_from_func

    import same_step_0

    same_step_0_op = create_component_from_func(
        func=same_step_0.generated_main,
        base_image="library/python:3.9-slim-buster",
        packages_to_install=[
            "dill",
            "requests",
            "numpy==1.19.5",
        ],
    )

The file same_step_0.py references a function in utils.file_one, but when the package is built and uploaded, the utils package is not included. It does appear (AFAICT) that the functionality to copy all dependent files is there (https://github.com/kubeflow/pipelines/blob/cc83e1089b573256e781ed2e4ac90f604129e769/sdk/python/kfp/containers/_component_builder.py#L222). Is there a special flag I need to set?

Thanks!



Issue Analytics

  • State: open
  • Created 2 years ago
  • Reactions:1
  • Comments:5 (2 by maintainers)

Top GitHub Comments

Ark-kun commented, Jul 8, 2021 (2 reactions)

There are several possible solutions to this.

The first thing to understand is how the create_component_from_func function works. This function implements the “Lightweight python components” feature. All KFP components are ComponentSpec files (component.yaml). Many people write them manually, and that’s fine. But for Python functions I’ve added a way to generate the component.yaml from the function signature and code. The resulting components are “lightweight” in the sense that there is no need to build and push a new container image: the code is included in the command line. This makes development much easier, but also limits the amount of code that can be included.
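To make the consequence of code inlining concrete, here is a rough standard-library simulation. It is not KFP's actual implementation: it only mimics the idea that the function's source text travels on the command line while sibling modules such as utils stay behind. The helper name and the run_inlined function are hypothetical.

```python
import subprocess
import sys
import tempfile
import textwrap

# Stand-in for the function in same_step_0.py; `helper` is a hypothetical name.
STEP_SOURCE = textwrap.dedent(
    """
    def generated_main():
        from utils.file_one import helper
        return helper()
    """
)


def run_inlined(source: str, func_name: str) -> subprocess.CompletedProcess:
    """Run the source as a command-line program in an empty directory.

    Only the text of `source` is available to the child interpreter,
    which is roughly the situation inside a lightweight component.
    """
    program = source + f"\n{func_name}()\n"
    with tempfile.TemporaryDirectory() as empty_dir:
        return subprocess.run(
            [sys.executable, "-I", "-c", program],
            cwd=empty_dir,
            capture_output=True,
            text=True,
        )


result = run_inlined(STEP_SOURCE, "generated_main")
# The import fails because utils/ was never shipped with the function.
missing_module = "ModuleNotFoundError" in result.stderr
```

The child process exits with an error because nothing besides the function text was shipped, which matches the behavior described in the question.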

So, what can be done:

  1. You can put the utils in a container image and use it as the base image. kfp.containers.build_image_from_working_dir can help you with that. See the sample

  2. Put utils into its own package that can be installed via packages_to_install. BTW, pip can install from Git.

  3. Use code pickling to include extra dependencies in the lightweight component. This option (use_code_pickling) is not exposed in create_component_from_func, but the older func_to_container_op still has it. You need to be very careful to use the same Python version in your environment and in the container.

  4. There is an idea, not actively pursued, to add a new additional_files parameter to the create_component_from_func function, but I think it stretches the idea of “code inlining” too far.
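Options 2 and 3 could look roughly like the sketch below, assuming the KFP v1 SDK. The repository URL, package name, and tag are hypothetical, and the entries in modules_to_capture are a guess that depends on how same_step_0.py imports utils. The KFP imports are kept inside the functions so the module can be read and imported without a pipeline environment.

```python
from typing import Any, Dict, List


def git_requirement(name: str, repo_url: str, ref: str) -> str:
    """Build a PEP 508 direct reference that pip can install from Git."""
    return f"{name} @ git+{repo_url}@{ref}"


# Hypothetical repository and tag that hold utils as an installable package.
UTILS_REQUIREMENT = git_requirement(
    "same-utils", "https://github.com/example-org/same-utils.git", "v0.1.0"
)

BASE_IMAGE = "library/python:3.9-slim-buster"
PACKAGES: List[str] = ["dill", "requests", "numpy==1.19.5"]

# Option 3 settings; the module names are assumptions about the project layout.
PICKLING_KWARGS: Dict[str, Any] = {
    "use_code_pickling": True,
    "modules_to_capture": ["same_step_0", "utils.file_one"],
}


def build_op_with_git_package():
    """Option 2: install utils inside the container via packages_to_install."""
    from kfp.components import create_component_from_func  # KFP v1 SDK

    import same_step_0

    return create_component_from_func(
        func=same_step_0.generated_main,
        base_image=BASE_IMAGE,
        packages_to_install=PACKAGES + [UTILS_REQUIREMENT],
    )


def build_op_with_pickling():
    """Option 3: pickle the function together with the listed modules."""
    from kfp.components import func_to_container_op  # older KFP v1 API

    import same_step_0

    return func_to_container_op(
        same_step_0.generated_main,
        base_image=BASE_IMAGE,
        packages_to_install=PACKAGES,
        **PICKLING_KWARGS,
    )
```

Two caveats apply. For option 2, pip needs a git executable inside the container, and slim base images usually do not ship one, so a different base image may be needed. For option 3, the Python version used to build the component should match the one in the base image, as noted in the list above.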

Regarding same_step_0.generated_main:

If you’re generating the main function yourself, maybe you want to generate the whole kfp.components.structures.ComponentSpec/component.yaml?
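As a sketch of that last suggestion, a hand-written component definition might look like the following. The image name is hypothetical and is assumed to already contain same_step_0.py and utils/; the command also assumes same_step_0.py calls generated_main when run as a module. Inputs and outputs are omitted for brevity.

```python
import textwrap

# Minimal KFP v1 component definition; every value here is illustrative.
COMPONENT_TEXT = textwrap.dedent(
    """\
    name: Same step 0
    description: Runs same_step_0 from an image that already contains utils/.
    implementation:
      container:
        image: registry.example.com/same/steps:0.1.0
        command: [python3, -m, same_step_0]
    """
)


def load_step_0_op():
    """Turn the YAML text into a component factory (requires the KFP v1 SDK)."""
    from kfp.components import load_component_from_text

    return load_component_from_text(COMPONENT_TEXT)
```

Because the code lives in the image rather than on the command line, the size limit of lightweight components no longer applies, at the cost of building and pushing an image.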

stale[bot] commented, Mar 2, 2022 (0 reactions)

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
