Run `numba` compilation as a post-install step?

Currently, this is the import performance of sgkit on a fresh install:

(testenv) benj@benj-X580VD:~$ time python -c "import sgkit"

real	0m16.715s
user	0m16.149s
sys	0m0.987s
(testenv) benj@benj-X580VD:~$ time python -c "import sgkit"

real	0m2.184s
user	0m2.414s
sys	0m0.914s

The first import is roughly 14s slower than the second, and that difference is numba compilation. It might be possible to move that compilation to a post-install step, ideally for both pip and conda installs.


Top GitHub Comments

tomwhite commented, Nov 15, 2022

I had another look at this. The numba compilation is actually from guvectorize, not jit, since the latter is lazy. I don’t think guvectorize can be made lazy since you have to specify the signatures up front.

I think the simplest way forward would be to move all the guvectorize functions to separate modules and then import them from the functions that need them. This works since none of the guvectorize functions are in the public API.

So for popgen.py, for example, we could move a function like _divergence which is decorated with numba_guvectorize to popgen_accelerate.py (or similar), and then the public divergence function would import _divergence in the function body. Compilation would happen the first time divergence is called (not imported).
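As a rough illustration of the deferred-import idea (not code from sgkit), the sketch below uses only the standard library. The module name popgen_accelerate_demo, its placeholder _divergence body, and the COMPILE_COUNT marker are all hypothetical stand-ins: the module-level code plays the role of the eager guvectorize compilation, and the public divergence function imports the private kernel inside its body so that the cost is paid on first call rather than at package import.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in for a module of guvectorize-decorated kernels.
# Its module-level code plays the role of eager compilation at import time.
ACCEL_SOURCE = textwrap.dedent(
    '''
    COMPILE_COUNT = 1  # stands in for work done once, when the module is imported

    def _divergence(x, y):
        # Placeholder body; the real kernel would be a compiled gufunc.
        return sum(abs(a - b) for a, b in zip(x, y))
    '''
)

# Write the stand-in module to a temporary directory and make it importable.
_tmpdir = tempfile.mkdtemp()
Path(_tmpdir, "popgen_accelerate_demo.py").write_text(ACCEL_SOURCE)
sys.path.insert(0, _tmpdir)
importlib.invalidate_caches()


def divergence(x, y):
    # Deferred import: the accelerated module is only loaded on first call.
    from popgen_accelerate_demo import _divergence

    return _divergence(x, y)


loaded_before = "popgen_accelerate_demo" in sys.modules  # not imported yet
result = divergence([1, 2, 3], [2, 2, 5])                # import happens here
loaded_after = "popgen_accelerate_demo" in sys.modules   # now cached in sys.modules
```

After the first call the module stays in sys.modules, so later calls only pay for a dictionary lookup on the import statement.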

tomwhite commented, Oct 27, 2022

Another thing to try might be lazy compilation instead of the eager compilation (with types specified) that we use everywhere.
