Allow user to specify `index-url` during `kedro pipeline pull`

Description

Hey team.

Say I want to build up a collection of reusable pipelines. I decide to use a Python package index as the distribution mechanism because I really like the versioning capabilities and how it integrates with my CI pipeline. But I don’t want to put my pipelines on PyPI. So I spin up a private Python package repo and add its URL under extra-index-url in my pip.conf. So far so good.

But the problem is that, because pip doesn’t have a notion of index priority, the names of my pipelines now potentially clash with all package names on PyPI. This is at best confusing and at worst a security nightmare.

If I were dealing with Python packages, where one package depends on many others, I would either have to live with this or spin up something like devpi. But the cool thing about kedro pipelines is that I shouldn’t need to do that, since kedro pipeline pull doesn’t require access to PyPI at all (IIUC). So what I’d love to do is restrict the pull command to my private index only (no PyPI).

Possible Implementation

Since kedro pipeline pull uses pip download under the hood, I should be able to achieve what I want to do using pip’s --index-url argument. (I can already do this today by adding it to my global pip.conf but this messes with my ability to download other python packages so I don’t want to do that).

I can think of three ways this might work:

  1. use shell=True in the subprocess.run call so that I can set PIP_CONFIG_FILE in my environment and it gets passed on to the pip download call
  2. explicitly copy PIP_CONFIG_FILE from the environment that executes kedro pipeline pull to the subprocess.run call (env=...)
  3. add an --index-url argument to kedro pipeline pull and pass it through to pip download
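
For illustration, options 2 and 3 could be sketched roughly as below. This is not Kedro's actual implementation: `build_pip_download_cmd`, `run_pip_download`, the package name and the index URL are all hypothetical. The sketch relies only on pip's documented `--index-url`, `--no-deps` and `--dest` options. Note also that `subprocess.run` hands the parent environment to the child by default when `env` is not given, so whether `PIP_CONFIG_FILE` already reaches pip depends on how the existing call is written.

```python
import os
import shlex
import subprocess
import sys
from typing import List, Mapping, Optional


def build_pip_download_cmd(
    package: str, dest: str, index_url: Optional[str] = None
) -> List[str]:
    """Build a `pip download` command, optionally pinned to a single index.

    Hypothetical helper for illustration; not part of Kedro.
    """
    cmd = [sys.executable, "-m", "pip", "download", "--no-deps", "--dest", dest]
    if index_url:
        # --index-url replaces the default index (PyPI) instead of adding to it,
        # unlike --extra-index-url.
        cmd += ["--index-url", index_url]
    cmd.append(package)
    return cmd


def run_pip_download(
    cmd: List[str], env: Optional[Mapping[str, str]] = None
) -> subprocess.CompletedProcess:
    """Run the command, passing the caller's environment on explicitly.

    Copying os.environ keeps variables such as PIP_CONFIG_FILE visible to pip.
    """
    child_env = dict(os.environ if env is None else env)
    return subprocess.run(cmd, check=True, env=child_env)


if __name__ == "__main__":
    # Assumed example values; only prints the command, nothing is downloaded.
    example = build_pip_download_cmd(
        "my-pipeline",
        "/tmp/pipelines",
        index_url="https://pypi.example.invalid/simple",
    )
    print(shlex.join(example))
```

With a flag like this, the restriction to the private index applies only to the pull command, and the global pip.conf stays untouched for ordinary package installs.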

I’m sure there are other solutions, too. Maybe there’s one that already works with the current version?

Thanks for all your hard work!

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

datajoely commented, Oct 1, 2021 (1 reaction)

This is on the roadmap 😃 stay tuned

merelcht commented, May 16, 2022 (0 reactions)

Closing this for now. But feel free to re-open if this is still needed (cc: @willashford)
