Rework argument handling in astropy.utils.data functions that use download_file
Follow-up to #10434
There are several functions in `astropy.utils.data` that wrap the `download_file` function. Many of them take arguments that are passed to `download_file`, but this is handled inconsistently.
For example:
- `get_readable_fileobj` largely copies the arguments to `download_file`, as well as their docstrings (unnecessary duplication). It could instead take some `**kwargs` and document that they are passed to `download_file`.
- `download_files_in_parallel` takes most of the same arguments as `download_file`, but not all.
- `get_pkg_data_filename` and friends use `download_file` with some specific arguments, but it might be nice to be able to pass additional `download_file` arguments (such as `allow_insecure`). However, accepting any arbitrary keyword arguments to `download_file` might not work here (for example, these functions already take a `remote_timeout` argument, which is passed to the corresponding `timeout` argument of `download_file`).
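
To make the `**kwargs` idea from the first bullet concrete, here is a minimal sketch. It deliberately does not import astropy: `download_file` below is a stand-in that only records its arguments (a subset of the real function's parameters), and `get_readable_fileobj_sketch` is a hypothetical name, not the existing function or a proposed final signature.

```python
def download_file(remote_url, cache=False, show_progress=True, timeout=None,
                  allow_insecure=False):
    """Stand-in for astropy.utils.data.download_file.

    It performs no network access and just echoes back what it was given,
    so the forwarding behaviour can be inspected.
    """
    return {
        "remote_url": remote_url,
        "cache": cache,
        "show_progress": show_progress,
        "timeout": timeout,
        "allow_insecure": allow_insecure,
    }


def get_readable_fileobj_sketch(name_or_obj, encoding=None, **kwargs):
    """Hypothetical wrapper illustrating the proposal.

    Parameters
    ----------
    name_or_obj : str
        URL to fetch (the real function also accepts paths and file objects).
    encoding : str, optional
        Handled by the wrapper itself.
    **kwargs
        Passed to ``download_file``; see its docstring for details.
    """
    # The wrapper no longer repeats each download option in its signature;
    # unknown options are rejected by download_file itself with a TypeError.
    return {"encoding": encoding, "download": download_file(name_or_obj, **kwargs)}
```

With this shape, an option later added to `download_file` would be usable from the wrapper without touching the wrapper's signature or docstring. The cost is that the wrapper's signature no longer lists the download options explicitly.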
It might be nice to brainstorm a plan to clean this up.
I think in most cases it’s simply a matter of my suggestion for `get_readable_fileobj`: Allow it to take arbitrary keyword arguments, and document that they are passed to `download_file`. I’m less sure what to do in cases like `get_pkg_data_filename`.
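
For the `get_pkg_data_filename` case, one possible direction (only a sketch, not a decided design) is to keep `remote_timeout` as the public name, forward other options, and explicitly refuse a conflicting `timeout`. As before, `download_file` here is a stand-in that records its arguments, and `get_pkg_data_filename_sketch` is a hypothetical name that ignores the local package-data lookup entirely.

```python
def download_file(remote_url, cache=False, timeout=None, allow_insecure=False):
    """Stand-in for astropy.utils.data.download_file (no network access)."""
    return {
        "remote_url": remote_url,
        "cache": cache,
        "timeout": timeout,
        "allow_insecure": allow_insecure,
    }


def get_pkg_data_filename_sketch(data_name, remote_timeout=None, **download_kwargs):
    """Hypothetical wrapper: forwards extra options but owns the timeout.

    ``remote_timeout`` is mapped to the ``timeout`` argument of
    ``download_file``, so ``timeout`` may not also arrive via ``**download_kwargs``.
    """
    # Names the wrapper sets itself; assumed here to be just "timeout".
    reserved = {"timeout"}
    clash = reserved.intersection(download_kwargs)
    if clash:
        raise TypeError(
            f"{sorted(clash)} cannot be passed directly; use remote_timeout instead"
        )
    return download_file(data_name, timeout=remote_timeout, **download_kwargs)
```

A call such as `get_pkg_data_filename_sketch(url, remote_timeout=10, allow_insecure=True)` then reaches `download_file` with `timeout=10`, while passing `timeout=10` directly fails with a message pointing at `remote_timeout`.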
Thank you for the detailed analysis! Here are my personal comments, but then again I have been using these for a long time, so I am hesitant to change stuff unless it is absolutely necessary or the long-term benefit outweighs the cost of near-term pain. Keep in mind that changing this API might break quite a few things downstream.
For me, I think it makes sense to name it only `timeout` in `download_data`. That function is all about downloading (remote) data, so having the keyword as `remote_timeout` is redundant. However, in a function like `get_pkg_data_filename`, that file could be local (package data) or remote (from Astropy data server).
More freedom, yes. But complete freedom, probably not. See my reasoning above.
I have a love-hate relationship with this one. It exists because of https://docs.astropy.org/en/latest/config/index.html . It is nice if you want to set a bunch of stuff for a system or a group of users. But it is also dangerous because that setting is hidden, and you will find it surprising if you are not familiar with the config or if someone else set it for you silently. That is why we also provide the keyword to override it when you don’t care about some config that is set somewhere and you know what you want to set it to for that one call. So, delegating to `Conf` and only using that is probably a non-starter.
`data.py` is a low-level module, so OOP seems like overkill to me, but I could be convinced otherwise.
Thanks again for some really valuable feedback, @davidmpaz. I will bring this up in the next dev telecon and see what other maintainers think. 🙇♀️
One of the ideas was to brainstorm 😃 I am glad we are doing it now. I would love to implement whatever is decided at the end.