Improve running kedro as part of an automated workflow

Description

When any kedro command is executed for the first time or in a clean environment (which is often the case in CI/CD), the telemetry prompt runs and the user has to answer Yes or No to telemetry. If no user is involved, e.g. in an automated CI/CD workflow, a hack is needed to programmatically add a .telemetry file.

Possible Implementation

The hack above works, but isn’t well known. An alternative is to pipe the answer into the command, for example: yes | kedro new. We need to document how people can accept or deny telemetry tracking automatically.
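As a rough illustration of the .telemetry hack, here is a minimal Python sketch that a CI step could run before any kedro command. It assumes the file lives in the project root and holds a single YAML key, `consent: true` or `consent: false`; that format is an assumption here and should be checked against the kedro-telemetry documentation. The helper name `write_telemetry_consent` is hypothetical and not part of Kedro.

```python
from pathlib import Path


def write_telemetry_consent(project_dir: str, consent: bool) -> Path:
    """Record a telemetry decision so that no interactive prompt is needed.

    Assumption: the file is named `.telemetry`, sits in the project root,
    and contains one YAML key, `consent: true` or `consent: false`.
    """
    path = Path(project_dir) / ".telemetry"
    # YAML booleans are lowercase, so format the value explicitly.
    path.write_text(f"consent: {'true' if consent else 'false'}\n", encoding="utf-8")
    return path
```

For example, calling `write_telemetry_consent(".", False)` from the project root would deny consent, and passing `True` would accept it.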

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

deepyaman commented, Jun 22, 2022

In the past I think we’ve been regarding CI runs as contaminating our telemetry data, so probably it’s easiest and safest from the risk perspective to just disable telemetry in this case. But that does mean that in future if we do actually want to look at CI telemetry data we wouldn’t have anything available on it…

Another thought I had this morning was to enable telemetry by default in CI environments, but to be even more conservative in what’s collected by kedro-telemetry in those cases. Not sure that’s necessary, though; I feel like it’s quite conservative by default (but I’m also not that familiar with it).

I would advocate capturing the CI variable in telemetry regardless of whether the default is to have it on or off.

merelcht commented, Aug 17, 2022

We discussed this issue in a Technical Design session and decided on the following:

  • We do want to capture telemetry for projects run on CI. This gives us insights into how many Kedro projects are run as part of an automated CI workflow, which in turn will help us understand how many Kedro projects are part of production systems.
  • Use the CI environment variable to indicate that data comes from a CI environment.
  • The user still needs to be allowed to decide whether to run telemetry on their CI Kedro projects or not, so we will not implement a system to automatically track the data if it’s part of a CI run.
  • We need to document how a user can automatically give/deny consent to telemetry, by describing how to programmatically create a .telemetry file.
  • We want to implement some logic in the kedro new command that detects whether it’s being run in CI and, if so, whether consent has been given or denied; if neither, it will advise the user on how to do this.
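The detection logic described in the last bullet could look roughly like the sketch below. This is not the Kedro implementation: the function names are hypothetical, the `.telemetry` format (`consent: true` or `consent: false`) is an assumption, and the file is parsed by hand only to keep the example free of dependencies. It also assumes the CI provider sets the `CI` environment variable to a truthy value, which many providers do.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def is_ci(env: Mapping[str, str] = os.environ) -> bool:
    """Return True when the CI variable is set to something other than empty, 0 or false."""
    return env.get("CI", "").strip().lower() not in ("", "0", "false")


def read_consent(project_dir: str) -> Optional[bool]:
    """Return the recorded consent, or None when no decision has been recorded.

    Assumption: `.telemetry` holds a line of the form `consent: true|false`.
    """
    path = Path(project_dir) / ".telemetry"
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "consent":
            return value.strip().lower() == "true"
    return None


def ci_consent_advice(
    project_dir: str, env: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Return a hint for the user when running in CI without a recorded decision."""
    if is_ci(env) and read_consent(project_dir) is None:
        return (
            "Running in CI without a telemetry decision: create a .telemetry "
            "file in the project root to give or deny consent."
        )
    return None
```

Returning the advice as a string, rather than printing it, leaves it to the caller to decide whether to log it, fail the build, or ignore it.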

To do:

  • Update kedro-telemetry to use the CI environment variable to indicate that data comes from a CI environment
  • Document how a user can automatically give/deny consent to telemetry, by describing how to programmatically create a .telemetry file
  • Implement logic in the kedro new command that detects whether it’s being run in CI and, if so, whether consent has been given; if it has not, the command will advise the user on how to do this.