distributed.worker - ERROR - list index out of range
# environment
dask.__version__ : '1.2.2'
distributed.__version__: '1.28.0'
We got the following error from distributed.worker
# error message
distributed.worker - ERROR - list index out of range
Traceback (most recent call last):
  File "/usr/local/anaconda3/lib/python3.6/site-packages/distributed/worker.py", line 2336, in execute
    self.transition(key, "memory", value=value)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/distributed/worker.py", line 1443, in transition
    state = func(key, **kwargs)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/distributed/worker.py", line 1562, in transition_executing_done
    self.send_task_state_to_scheduler(key)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/distributed/worker.py", line 1727, in send_task_state_to_scheduler
    typ_serialized = dumps_function(typ)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/distributed/worker.py", line 3040, in dumps_function
    result = cache[func]
  File "/usr/local/anaconda3/lib/python3.6/site-packages/zict/lru.py", line 50, in __getitem__
    self.heap[key] = self.i
  File "/usr/local/anaconda3/lib/python3.6/site-packages/heapdict.py", line 39, in __setitem__
    self.pop(key)
  File "/usr/local/anaconda3/lib/python3.6/_collections_abc.py", line 801, in pop
    del self[key]
  File "/usr/local/anaconda3/lib/python3.6/site-packages/heapdict.py", line 78, in __delitem__
    self._swap(wrapper[2], parent[2])
  File "/usr/local/anaconda3/lib/python3.6/site-packages/heapdict.py", line 68, in _swap
    self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
IndexError: list index out of range
/usr/local/anaconda3/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 48 leaked semaphores to clean up at shutdown
  len(cache))
We traced the issue and believe it may be caused by a recent change in distributed.worker:
# copy from distributed/worker.py
try:
    # a 10 MB cache of deserialized functions and their bytes
    from zict import LRU
    cache = LRU(10000000, dict(), weight=lambda k, v: len(v))
except ImportError:
    cache = dict()

def dumps_function(func):
    """ Dump a function to bytes, cache functions """
    try:
        result = cache[func]
    except KeyError:
        result = pickle.dumps(func)
        if len(result) < 100000:
            cache[func] = result
    except TypeError:
        result = pickle.dumps(func)
    return result
After this change, distributed.worker uses a zict LRU as its cache, but the LRU's __getitem__ is not thread safe. The following minimal example reproduces the "list index out of range" error using only LRU.
from zict import LRU
from functools import partial
import concurrent.futures

# create LRU cache
cache=LRU(2,dict())
cache[1]=1
cache[2]=2

# function to get key from cache multiple times
def get_key(key,reps):
    for _ in range(reps):
        cache[key]

get_key_m=partial(get_key,reps=1000000)

# test get key from multiple threads
def test_get_key():
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        [i for i in executor.map(get_key_m, [1,2])]

# this call will provide "IndexError: list index out of range"
test_get_key()
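The traceback shows `LRU.__getitem__` writing to `self.heap`, which is why a plain lookup can race with another thread. The sketch below illustrates the same pattern with a deliberately tiny, hypothetical `TinyLRU` class built on `OrderedDict`. It is not zict's implementation, only a demonstration that a read in an LRU cache also updates shared bookkeeping.

```python
from collections import OrderedDict


class TinyLRU:
    """Illustrative stand-in for an LRU cache; NOT zict's implementation."""

    def __init__(self, n):
        self.n = n
        self.data = OrderedDict()

    def __getitem__(self, key):
        value = self.data[key]
        # A read refreshes the key's recency, so it writes to shared state.
        self.data.move_to_end(key)
        return value

    def __setitem__(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        while len(self.data) > self.n:
            self.data.popitem(last=False)  # evict the least recently used key


lru = TinyLRU(2)
lru[1] = "a"
lru[2] = "b"

before = list(lru.data)  # key order before the lookup
lru[1]                   # a plain lookup, no assignment
after = list(lru.data)   # key order after the lookup: key 1 moved to the end
```

Because the lookup itself reorders the internal structure, two threads that only read can still interleave their updates, which is consistent with the failure seen inside `heapdict` above.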
We have not been able to build a minimal example that produces the "list index out of range" error using Dask directly, but hopefully the LRU-only example above provides some insight. Please let me know what you think. Thanks!
Issue Analytics
- State:
- Created 4 years ago
- Reactions:1
- Comments:16 (8 by maintainers)
Top GitHub Comments
@TomAugspurger @mrocklin, I just tried the lock modification on the distributed source code and ran it with our application, and the "list index out of range" error was resolved! Thanks for the input; hopefully this fix can make it into the next release.
We were having a problem with occasional deadlocks on a distributed cluster, and eventually I narrowed it down to the traceback in the OP. Placing a thread lock on the cache's __getitem__ and __setitem__, like in #2727 (comment), appears to eliminate the problem.
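The locking approach described in these comments can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual patch from #2727: `LockedMapping` and `hammer` are hypothetical names, and a plain `dict` stands in for the zict `LRU` so that the snippet runs without extra dependencies.

```python
import pickle
import threading
from concurrent.futures import ThreadPoolExecutor


class LockedMapping:
    """Hypothetical wrapper that serializes reads and writes with one lock."""

    def __init__(self, mapping):
        self._mapping = mapping
        self._lock = threading.Lock()

    def __getitem__(self, key):
        # Reads take the lock too, since an LRU lookup updates its bookkeeping.
        with self._lock:
            return self._mapping[key]

    def __setitem__(self, key, value):
        with self._lock:
            self._mapping[key] = value

    def __len__(self):
        with self._lock:
            return len(self._mapping)


# Assumption: dict() stands in for LRU(10000000, dict(), weight=...).
cache = LockedMapping(dict())


def dumps_function(func):
    """Same shape as the worker.py excerpt, but backed by the locked cache."""
    try:
        result = cache[func]
    except KeyError:
        result = pickle.dumps(func)
        if len(result) < 100000:
            cache[func] = result
    except TypeError:
        result = pickle.dumps(func)
    return result


def hammer(func, reps=10000):
    """Call dumps_function repeatedly from one thread and return the last result."""
    for _ in range(reps):
        result = dumps_function(func)
    return result


with ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(hammer, [len, sorted, len, sorted]))
```

A single coarse lock keeps the sketch simple. Whether that is acceptable for a hot path such as `dumps_function` depends on how contended the cache is in practice.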