Defer environment installation if hook would be skipped


Rationale

When running pre-commit run or during git commit, all the virtual environments are initialized (fast) and dependencies for them are installed. The latter can be slow depending on many factors (number of dependencies, whether they are cached already, etc.).

pre-commit already decides whether each hook should be invoked: if no files match a hook, it is reported as “(no files to check)Skipped” and never runs. Even in that case, however, the hook’s environment is still installed.

Possible implementation

This could be improved by using the Classifier to check whether any files match the hook, just as _run_single_hook does, and, if none do, skipping the installation of the hook’s environment as well as the run itself.
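The idea can be sketched in a few lines. This is a minimal, self-contained illustration and not pre-commit's actual code: the Hook dataclass and the functions matching_files and hooks_needing_install are hypothetical names made up for this example. It assumes a hook only needs its environment when always_run is set or when at least one file passes its files/exclude patterns, and it ignores filtering by types.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Hook:
    """Hypothetical stand-in for a configured hook (not pre-commit's own class)."""
    alias: str
    files: str = ""        # regex a path must match; "" matches every path
    exclude: str = "^$"    # regex that removes a path; "^$" matches no real path
    always_run: bool = False


def matching_files(hook, filenames):
    """Return the filenames this hook would be run on."""
    include = re.compile(hook.files)
    exclude = re.compile(hook.exclude)
    return [f for f in filenames if include.search(f) and not exclude.search(f)]


def hooks_needing_install(hooks, filenames):
    """Keep only hooks that would actually run, so only their envs get installed."""
    return [h for h in hooks if h.always_run or matching_files(h, filenames)]


hooks = [
    Hook(alias="pylint-arrow", files=r"^project-with-arrow/.*"),
    Hook(alias="pylint-hdfs", files=r"^project-with-hdfs/.*"),
]
changed = ["project-with-arrow/test.py"]

to_install = [h.alias for h in hooks_needing_install(hooks, changed)]
print(to_install)  # ['pylint-arrow']
```

With the two hooks from the example configuration in this issue and a change that touches only project-with-arrow, the filter leaves just pylint-arrow, which is the environment the proposed behavior would install.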

Our use case


Many monorepos, where hooks are defined to run only on files matching patterns such as ^subdirectory/.*$, would benefit greatly from this. In our case, many C++ developers work only in C++-related subdirectories. They never touch Python files, yet when the dependencies of a Python-related hook change, that hook’s Python virtualenv is recreated on their machines too.

Currently, even with cached packages, this recreation (just installing packages with pip) takes 2-3 minutes on modern hardware, multiplied by the number of hooks whose additional_dependencies changed. In our repository, one of these virtualenvs takes more than 8 minutes to install (from already cached wheels). This is acceptable for the Python developers (after all, they are changing code that is expected to be checked), but not for the rest of the team.

Example

$ export XDG_CACHE_HOME=/tmp/.PRECOMMITCACHE/
$ tree
.
├── project-with-arrow
│   └── test.py
└── project-with-hdfs
    └── test.py
$ cat .pre-commit-config.yaml
repos:
  - repo: https://github.com/PyCQA/pylint
    rev: v2.14.5
    hooks:
      - name: pylint-arrow
        alias: pylint-arrow
        id: pylint
        additional_dependencies:
          - arrow
        files: ^project-with-arrow/.*
        args: ["--disable=W,C,R"]
      - name: pylint-hdfs
        alias: pylint-hdfs
        id: pylint
        additional_dependencies:
          - hdfs
        files: ^project-with-hdfs/.*
        args: ["--disable=W,C,R"]
$ 
$ ####################
$ # Current behavior #
$ ####################
$ pip install --force-reinstall pre-commit
...
Successfully installed ... pre-commit-2.20.0
$ rm -rf $XDG_CACHE_HOME
$ echo "# Let's modify project-with-arrow!" >> project-with-arrow/test.py
$ git commit -a -m "Modified arrow. During commit, both environment will be installed, but only one will be used"
[INFO] Initializing environment for https://github.com/PyCQA/pylint.
[INFO] Initializing environment for https://github.com/PyCQA/pylint:arrow.
[INFO] Initializing environment for https://github.com/PyCQA/pylint:hdfs.
[INFO] Installing environment for https://github.com/PyCQA/pylint. # <---------------- Two envs installed
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
[INFO] Installing environment for https://github.com/PyCQA/pylint. # <---------------- Two envs installed
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
pylint-arrow.............................................................Passed
pylint-hdfs..........................................(no files to check)Skipped  # <-- One env used
[master bd2d394] Modified arrow. During commit, both environment will be installed, but only one will be used
 1 file changed, 1 insertion(+)
$
$ #####################
$ # Proposed behavior #
$ #####################
$ git reset --hard HEAD~
$ pip install --force-reinstall git+https://github.com/imc-trading/pre-commit.git@imc
...
Successfully installed ... pre-commit-2.20.1
$ rm -rf $XDG_CACHE_HOME
$ echo "# Let's modify again project-with-arrow!" >> project-with-arrow/test.py
$ git commit -a -m "Modified arrow. During commit, only one environment will be installed"
[INFO] Initializing environment for https://github.com/PyCQA/pylint.
[INFO] Initializing environment for https://github.com/PyCQA/pylint:arrow.
[INFO] Initializing environment for https://github.com/PyCQA/pylint:hdfs.
[INFO] Installing environment for https://github.com/PyCQA/pylint. # <---------------- Only one env installed
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
pylint-arrow.............................................................Passed
pylint-hdfs..........................................(no files to check)Skipped
[master ba4bdd5] Modified arrow. During commit, only one environment will be installed
 1 file changed, 1 insertion(+)

A quick-and-dirty proof-of-concept implementation is available here.

Workarounds

  • Running pre-commit run <hook> manually installs only the environments of the given hooks
  • Listing the hooks to skip in the SKIP environment variable
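The SKIP workaround could be automated with a small wrapper. The following is only a sketch under stated assumptions: skip_value and the patterns mapping are hypothetical and must be kept in sync with .pre-commit-config.yaml by hand, and it assumes SKIP accepts the hook aliases used in the example configuration. It builds a comma-separated SKIP value from the hooks whose pattern matches none of the staged files.

```python
import os
import re


def skip_value(hook_patterns, filenames, existing=""):
    """Return a comma-separated SKIP value naming hooks with no matching file.

    hook_patterns maps a hook id or alias to its `files` regex (hypothetical
    helper input). Entries already present in `existing` are preserved.
    """
    skipped = [s for s in existing.split(",") if s]
    for hook_id, pattern in hook_patterns.items():
        has_match = any(re.search(pattern, f) for f in filenames)
        if not has_match and hook_id not in skipped:
            skipped.append(hook_id)
    return ",".join(skipped)


patterns = {
    "pylint-arrow": r"^project-with-arrow/.*",
    "pylint-hdfs": r"^project-with-hdfs/.*",
}
staged = ["project-with-arrow/test.py"]

# Environment to pass to a later `git commit` subprocess (not executed here).
env = dict(os.environ, SKIP=skip_value(patterns, staged))
print(env["SKIP"])  # pylint-hdfs
```

As the comments on this issue point out, SKIP is an explicit decision rather than implicit behaviour, so a wrapper like this is best kept to developer machines while CI still runs every hook.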

Searched:

  • All currently open issues
  • subdirectory OR subfolder
  • defer
  • monorepo

Related:

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

asottile commented, Aug 12, 2022

--from-ref / --to-ref is just a shortcut to select the file list, it doesn’t change anything about the execution

kmARC commented, Aug 12, 2022

yes, but SKIP is an intentional decision rather than an implicit behaviour – therefore “our compromise there”

Oh, that’s surprising.

I would expect that SKIPing on the developer’s machine is fine, as in, a good enough compromise, since CI would check everything anyway. But then the more important question is: can this false positive happen during pre-commit run --from-ref ... --to-ref ...?
