Test failure in astropy/utils/test_data.py with remote_data=True: test_download_parallel

In an install in a clean virtualenv, astropy.test(test_path="astropy/utils/tests/test_data.py", remote_data=True) fails. Short version of failure message:

_gdbm.error: [Errno 11] Resource temporarily unavailable

Running astropy.test() completes fine. This is in a minimal python setup.py develop context. Longer details follow.

$ pip list
Package            Version      Location                                
------------------ ------------ ----------------------------------------
astropy            4.0.dev25616 /home/archibald/projects/astropy/astropy
atomicwrites       1.3.0        
attrs              19.1.0       
filelock           3.0.12       
importlib-metadata 0.19         
more-itertools     7.2.0        
numpy              1.17.1       
packaging          19.1         
pip                19.2.3       
pkg-resources      0.0.0        
pluggy             0.12.0       
psutil             5.6.3        
py                 1.8.0        
pyparsing          2.4.2        
pytest             5.1.1        
pytest-arraydiff   0.3          
pytest-astropy     0.5.0        
pytest-doctestplus 0.3.0        
pytest-openfiles   0.4.0        
pytest-remotedata  0.3.2        
setuptools         41.2.0       
six                1.12.0       
toml               0.10.0       
tox                3.13.2       
virtualenv         16.7.4       
wcwidth            0.1.7        
wheel              0.33.6       
zipp               0.6.0        

The full error message is:

_______________________________________________ test_download_parallel ________________________________________________
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/archibald/projects/astropy/astropy/astropy/utils/console.py", line 498, in __call__
    return i, self._func(arg)
  File "/home/archibald/projects/astropy/astropy/astropy/utils/data.py", line 1132, in _do_download_files_in_parallel
    return download_file(*args)
  File "/home/archibald/projects/astropy/astropy/astropy/utils/data.py", line 1012, in download_file
    with shelve.open(urlmapfn) as url2hash:
  File "/usr/lib/python3.7/shelve.py", line 243, in open
    return DbfilenameShelf(filename, flag, protocol, writeback)
  File "/usr/lib/python3.7/shelve.py", line 227, in __init__
    Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
  File "/usr/lib/python3.7/dbm/__init__.py", line 94, in open
    return mod.open(file, flag, mode)
_gdbm.error: [Errno 11] Resource temporarily unavailable
"""

The above exception was the direct cause of the following exception:

    @pytest.mark.remote_data(source='astropy')
    def test_download_parallel():
        from astropy.utils.data import download_files_in_parallel
    
        main_url = conf.dataurl
        mirror_url = conf.dataurl_mirror
        fileloc = 'intersphinx/README'
        try:
>           fnout = download_files_in_parallel([main_url, main_url + fileloc])

astropy/utils/tests/test_data.py:55: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
astropy/utils/data.py:1194: in download_files_in_parallel
    multiprocess=True)
astropy/utils/console.py:746: in map
    ipython_widget=ipython_widget)
astropy/utils/console.py:821: in map_unordered
    p.imap_unordered(function, items, chunksize=chunksize)):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <multiprocessing.pool.IMapUnorderedIterator object at 0x7f04b1625240>, timeout = None

    def next(self, timeout=None):
        with self._cond:
            try:
                item = self._items.popleft()
            except IndexError:
                if self._index == self._length:
                    raise StopIteration from None
                self._cond.wait(timeout)
                try:
                    item = self._items.popleft()
                except IndexError:
                    if self._index == self._length:
                        raise StopIteration from None
                    raise TimeoutError from None
    
        success, value = item
        if success:
            return value
>       raise value
E       _gdbm.error: [Errno 11] Resource temporarily unavailable

/usr/lib/python3.7/multiprocessing/pool.py:748: error

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
aarchiba commented, Oct 30, 2019

This happened when there was contention for the shelve: it allows either a single writer or multiple readers at a time. Access is now protected by the lock, and recurrence of this problem is tested for with several fairly intensive parallel downloads using both threads and multiprocessing. I think it’s fair to say this is gone.
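
For readers who hit the same error in their own code, here is a minimal sketch of the general idea of serializing access to a shelve with a lock. It is not the code from the astropy fix: the helper names `dir_lock`, `record`, and `run_demo`, the lock-directory approach, and the timeout values are all assumptions made for illustration.

```python
import contextlib
import os
import shelve
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


@contextlib.contextmanager
def dir_lock(path, timeout=10.0, poll=0.01):
    """Hypothetical helper: hold an exclusive lock by creating a directory."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            os.mkdir(path)  # atomic: fails if another holder already created it
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire lock {path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.rmdir(path)  # release the lock


def record(db_path, lock_path, key, value):
    # Only one worker at a time gets to open the shelve for writing.
    with dir_lock(lock_path):
        with shelve.open(db_path) as db:
            db[key] = value


def run_demo(n=8):
    """Write n entries from several threads and return the final contents."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "urlmap")
        lock_path = os.path.join(tmp, "lock")
        with ThreadPoolExecutor(max_workers=4) as pool:
            futures = [
                pool.submit(record, db_path, lock_path, f"url{i}", f"hash{i}")
                for i in range(n)
            ]
            for future in futures:
                future.result()  # re-raise any worker exception
        with shelve.open(db_path) as db:
            return dict(db)


if __name__ == "__main__":
    print(run_demo())
```

Because the lock is a directory on disk rather than an in-process object, the same pattern should also work across processes, at the cost of needing manual cleanup if a holder crashes before releasing it.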

0 reactions
pllim commented, Oct 30, 2019

@aarchiba, is this still an issue with #9182 merged?
