Decrease import time of affiliated packages
I’ve noticed that astropy-affiliated packages take a long time to import. For example, `from astroquery.heasarc import Heasarc` takes 0.9 seconds on my laptop, and `import pyvo` takes 1.1 seconds. That’s a little longer than it takes to import ALL of numpy, scipy, matplotlib, astropy, pandas, sklearn, bokeh, and PySide6.
This can be rather frustrating for certain users, e.g., when these packages are embedded in GUIs and the end user doesn’t know that a background Python process importing packages is what delays the application startup.
I looked into why this might be, and I found a package named Tuna (https://github.com/nschloe/tuna; see how to use it for examining imports at https://stackoverflow.com/a/51300944). Here are the results of running the two import statements above: tuna - astroquery.log.pdf and tuna - pyvo.log.pdf.
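For anyone who wants to reproduce this kind of measurement without a viewer, here is a minimal sketch that uses CPython's `-X importtime` flag and ranks modules by cumulative import time. The helper names `parse_importtime` and `import_times` are hypothetical, and the parsing assumes the `import time: self [us] | cumulative | imported package` layout that CPython writes to stderr.

```python
import subprocess
import sys


def parse_importtime(text):
    """Parse `python -X importtime` stderr output.

    Returns a list of (module_name, cumulative_seconds), slowest first.
    """
    prefix = "import time:"
    rows = []
    for line in text.splitlines():
        if not line.startswith(prefix):
            continue  # ordinary stderr output, not a timing line
        fields = line[len(prefix):].split("|")
        if len(fields) != 3:
            continue
        try:
            cumulative_us = int(fields[1])
        except ValueError:
            continue  # the header line has text instead of numbers
        rows.append((fields[2].strip(), cumulative_us / 1e6))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


def import_times(statement):
    """Run `statement` in a fresh interpreter and time every import it triggers."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", statement],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_importtime(proc.stderr)
```

Calling `import_times("import pyvo")[:10]` should list the ten entries with the largest cumulative time. Cumulative times include nested imports, so a parent package always ranks at or above its children.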
The biggest single-item contributions to the long import time that I see are the 0.3 seconds spent importing Dask and the 0.17 seconds spent importing Astropy’s TestRunner. I suspect that both of these are unnecessary for most astropy-affiliated packages, and removing them from imports might not be too hard.
Regarding TestRunner: Affiliated packages are required to have the line `from ._astropy_init import *` in their `__init__.py` file (e.g., https://github.com/astropy/astroquery/blob/main/astroquery/__init__.py and https://github.com/astropy/pyvo/blob/main/pyvo/__init__.py). `_astropy_init.py` has the line `from astropy.tests.runner import TestRunner`. If this is only used for developer testing, I don’t think this needs to be imported by users.
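One way this could look, as a rough sketch rather than what the package template actually ships: move the `TestRunner` import inside the function that runs the tests, so users who never run the tests never pay for it. The function name `run_package_tests` is hypothetical, and the sketch assumes the file lives in the package directory.

```python
import os


def run_package_tests(**kwargs):
    """Run this package's test suite, importing the test runner only on demand."""
    # Deferred import: astropy.tests.runner is loaded the first time the
    # tests are actually run, not when the package itself is imported.
    from astropy.tests.runner import TestRunner

    # Assumption: this module sits in the package directory, so its own
    # location identifies the package under test.
    package_dir = os.path.dirname(os.path.abspath(__file__))
    runner = TestRunner.make_test_runner_in(package_dir)
    return runner(**kwargs)
```

The trade-off is that the first call to the test function is slower by the time the import takes, which seems acceptable for something only developers call.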
Regarding Dask: I’m not sure what exactly is going on here. Is Dask actually needed for all applications? Generally the FITS files that I’m working with are small, so if Dask is just being used to speed up file reads, I’d be much happier sacrificing the <0.01 seconds that I expect Dask saves when reading a FITS file to get rid of the 0.3 seconds that the import takes. Is there a way to make this import optional?
Issue Analytics
- Created 2 years ago
- Comments: 8 (5 by maintainers)
Top GitHub Comments
Re: TestRunner

Ah, I thought that since `from ._astropy_init import *` was right under the comment `# Affiliated packages may add whatever they like to this file, but should keep this content at the top.`, that import was required of all affiliated packages. Is that not correct? I see it in MOST of the affiliated packages (looking at a random 3, https://github.com/karllark/dust_extinction/blob/master/dust_extinction/__init__.py, https://github.com/aplpy/aplpy/blob/master/aplpy/__init__.py, and https://github.com/cosimoNigro/agnpy/blob/master/agnpy/__init__.py, 2 of the 3 have this import, with `TestRunner` then being imported in `_astropy_init.py`). So I guess that since not ALL of them have it, it’s not technically necessary. Anyway, I’ll post this on the issue lists of astroquery and pyvo.

I’m afraid #12806 is having the same issue as #12805 was: it’s still not decreasing the time to run `from astroquery.heasarc import Heasarc`. The `from dask.array import Array as DaskArray` statement in astropy/io/fits/hdu/image.py is still an issue.
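To illustrate how a module-level Dask import like that could be avoided, here is a minimal sketch of a type check that never imports Dask itself. It is not taken from astropy, and the helper name `is_dask_array` is hypothetical. It relies on the fact that an object can only be a Dask array if `dask.array` has already been imported by someone else.

```python
import sys


def is_dask_array(obj):
    """Return True if `obj` is a dask array, without importing dask."""
    # If dask.array was never imported in this process, no dask array
    # objects can exist, so the check is answered without any import.
    dask_array = sys.modules.get("dask.array")
    if dask_array is None:
        return False
    return isinstance(obj, dask_array.Array)
```

With a check like this, code that only handles small NumPy-backed FITS data would never trigger the Dask import, while users who pass in Dask arrays have already imported it themselves.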