distributed.worker - ERROR - list index out of range

# environment
dask.__version__ : '1.2.2'
distributed.__version__:  '1.28.0'

We got the following error from distributed.worker:

# error message
distributed.worker - ERROR - list index out of range
Traceback (most recent call last):
  File "/usr/local/anaconda3/lib/python3.6/site-packages/distributed/worker.py", line 2336, in execute
    self.transition(key, "memory", value=value)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/distributed/worker.py", line 1443, in transition
    state = func(key, **kwargs)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/distributed/worker.py", line 1562, in transition_executing_done
    self.send_task_state_to_scheduler(key)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/distributed/worker.py", line 1727, in send_task_state_to_scheduler
    typ_serialized = dumps_function(typ)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/distributed/worker.py", line 3040, in dumps_function
    result = cache[func]
  File "/usr/local/anaconda3/lib/python3.6/site-packages/zict/lru.py", line 50, in __getitem__
    self.heap[key] = self.i
  File "/usr/local/anaconda3/lib/python3.6/site-packages/heapdict.py", line 39, in __setitem__
    self.pop(key)
  File "/usr/local/anaconda3/lib/python3.6/_collections_abc.py", line 801, in pop
    del self[key]
  File "/usr/local/anaconda3/lib/python3.6/site-packages/heapdict.py", line 78, in __delitem__
    self._swap(wrapper[2], parent[2])
  File "/usr/local/anaconda3/lib/python3.6/site-packages/heapdict.py", line 68, in _swap
    self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
IndexError: list index out of range
/usr/local/anaconda3/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 48 leaked semaphores to clean up at shutdown
  len(cache))

We traced the issue and believe it is caused by a recent change in distributed.worker:

# copy from distributed/worker.py
import pickle  # imported at the top of the module
try:
    # a 10 MB cache of deserialized functions and their bytes
    from zict import LRU
    cache = LRU(10000000, dict(), weight=lambda k, v: len(v))
except ImportError:
    cache = dict()

def dumps_function(func):
    """ Dump a function to bytes, cache functions """
    try:
        result = cache[func]
    except KeyError:
        result = pickle.dumps(func)
        if len(result) < 100000:
            cache[func] = result
    except TypeError:
        result = pickle.dumps(func)
    return result

A recent change made distributed.worker use zict's LRU as its cache, but LRU's __getitem__ is not thread-safe. The following minimal example reproduces the "list index out of range" error using only LRU.

from zict import LRU
from functools import partial
import concurrent.futures

# create an LRU cache that holds two items
cache = LRU(2, dict())
cache[1] = 1
cache[2] = 2


# read the same key from the cache many times
def get_key(key, reps):
    for _ in range(reps):
        cache[key]


get_key_m = partial(get_key, reps=1000000)


# read keys from multiple threads at once
def test_get_key():
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        # consume the iterator so exceptions raised in worker threads surface
        list(executor.map(get_key_m, [1, 2]))


# this call can raise "IndexError: list index out of range"
test_get_key()
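
As a rough illustration of why concurrent reads alone can trigger this: the traceback above shows LRU's __getitem__ writing to self.heap, so each lookup also updates recency bookkeeping. The sketch below uses a hypothetical ToyLRU built on OrderedDict (an assumption for illustration, not zict's actual implementation) to show a plain read mutating internal state.

```python
from collections import OrderedDict


class ToyLRU:
    """Hypothetical stand-in for an LRU mapping; NOT zict's implementation."""

    def __init__(self, n):
        self.n = n
        self.data = OrderedDict()
        self.writes = 0  # counts mutations of the internal bookkeeping

    def __getitem__(self, key):
        value = self.data[key]
        self.data.move_to_end(key)  # a read still reorders the structure
        self.writes += 1
        return value

    def __setitem__(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        self.writes += 1
        while len(self.data) > self.n:
            self.data.popitem(last=False)  # evict the least recently used key


toy = ToyLRU(2)
toy[1] = 1
toy[2] = 2
writes_before = toy.writes
toy[1]  # a plain lookup, no assignment
reads_mutated = toy.writes > writes_before
order_after_read = list(toy.data)  # key 1 is now the most recently used
```

Because every lookup is also a write to shared state, two threads that only read can still interleave inside that update, which matches the failure inside heapdict in the traceback.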

We could not find a minimal example that reproduces the "list index out of range" error using Dask directly; hopefully the LRU-only example provides some insight. Please let me know what you think. Thanks!

Issue Analytics

  • State:closed
  • Created 4 years ago
  • Reactions:1
  • Comments:16 (8 by maintainers)

Top GitHub Comments

1 reaction
quickpanda commented, May 28, 2019

@TomAugspurger @mrocklin, I just tried the lock modification on the distributed source code and ran it with our application, and the “list index out of range” error was resolved! Thanks for the input; hopefully this fix can make it into an upcoming release.

0 reactions
tshatrov commented, Dec 6, 2019

We were having a problem with occasional deadlocks on a distributed cluster; eventually I narrowed it down to the traceback in the OP. Placing a thread lock on the cache’s __getitem__ and __setitem__, as in #2727 (comment), appears to eliminate the problem.
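
For readers who want to try the locking approach, here is a minimal sketch. It is not the actual patch discussed in the issue: LockedMapping is a hypothetical helper name, and a plain dict stands in for the zict LRU so that the example runs with only the standard library. The idea is simply to serialise every access to the wrapped mapping with one threading.Lock.

```python
import threading
from collections.abc import MutableMapping
from concurrent.futures import ThreadPoolExecutor


class LockedMapping(MutableMapping):
    """Hypothetical wrapper: guard every access to `inner` with one lock."""

    def __init__(self, inner):
        self.inner = inner
        self.lock = threading.Lock()

    def __getitem__(self, key):
        with self.lock:
            return self.inner[key]

    def __setitem__(self, key, value):
        with self.lock:
            self.inner[key] = value

    def __delitem__(self, key):
        with self.lock:
            del self.inner[key]

    def __iter__(self):
        with self.lock:
            return iter(list(self.inner))  # iterate over a snapshot of the keys

    def __len__(self):
        with self.lock:
            return len(self.inner)


# Assumption: with zict installed, `inner` would be LRU(2, dict()) instead.
cache = LockedMapping({1: 1, 2: 2})


def get_key(key, reps=10000):
    for _ in range(reps):
        cache[key]
    return key


with ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(get_key, [1, 2]))
```

A single coarse lock adds some contention on every lookup, but it keeps the wrapper small and makes it easy to see that no two threads can be inside the underlying mapping at the same time.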
