
KeyError: 'lengths'

See original GitHub issue

I'm trying to find an older version of distributed that isn't too buggy, and I'm having trouble.

rapids 0.14 pairs with dask/distributed 2.17.

But 2.17 hits https://github.com/dask/distributed/issues/3851

I tried 2.18 as suggested there, but 2.18 hits the error below even for the most basic test when I have 2 GPUs.

Any thoughts? I can’t go to 2.27 used by rapids 0.14 because that causes many rapids tests to fail.
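
For reference, a quick way to confirm which dask / distributed / dask-cuda builds are actually imported in the environment (a minimal sketch; it only assumes the three packages are installed under their usual import names). The reproducer script and full traceback follow.

# Minimal version check: confirm which dask/distributed/dask_cuda builds
# the reproducer below actually runs against.
import dask
import distributed
import dask_cuda

print("dask:       ", dask.__version__)
print("distributed:", distributed.__version__)
print("dask_cuda:  ", dask_cuda.__version__)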

from dask.distributed import Client
from dask_cuda import LocalCUDACluster
from dask import dataframe as dd
import xgboost as xgb
def main(client):
    dask_df = dd.read_csv('creditcard.csv')
    target = 'default payment next month'
    y = dask_df['default payment next month']
    X = dask_df[dask_df.columns.difference([target])]
    dtrain = xgb.dask.DaskDMatrix(client, X, y)
    output = xgb.dask.train(client,
                            # Use GPU training algorithm
                            {'tree_method': 'gpu_hist'},
                            dtrain,
                            num_boost_round=100,
                            evals=[(dtrain, 'train')])
    booster = output['booster']  # booster is the trained model
    history = output['history']  # A dictionary containing evaluation results
    # Save the model to file
    booster.save_model('xgboost-model')
    print('Training evaluation history:', history)

    
if __name__ == '__main__':
    # `LocalCUDACluster` is used for assigning GPU to XGBoost 
    # processes. Here `n_workers` represents the number of GPUs 
    # since we use one GPU per worker process.
    with LocalCUDACluster(n_workers=2) as cluster:
        with Client(cluster) as client:
            main(client)
            

(base) jon@mr-dl10:/data/jon/h2oai.fullcondatest3$ python dask_cudf_example.py 
distributed.nanny - ERROR - Failed to start worker
Traceback (most recent call last):
  File "/home/jon/minicondadai/lib/python3.6/site-packages/distributed/nanny.py", line 758, in run
    await worker
  File "/home/jon/minicondadai/lib/python3.6/site-packages/distributed/core.py", line 236, in _
    await self.start()
  File "/home/jon/minicondadai/lib/python3.6/site-packages/distributed/worker.py", line 1085, in start
    await self._register_with_scheduler()
  File "/home/jon/minicondadai/lib/python3.6/site-packages/distributed/worker.py", line 811, in _register_with_scheduler
    types={k: typename(v) for k, v in self.data.items()},
  File "/home/jon/minicondadai/lib/python3.6/site-packages/distributed/worker.py", line 811, in <dictcomp>
    types={k: typename(v) for k, v in self.data.items()},
  File "/home/jon/minicondadai/lib/python3.6/_collections_abc.py", line 744, in __iter__
    yield (key, self._mapping[key])
  File "/home/jon/minicondadai/lib/python3.6/site-packages/dask_cuda/device_host_file.py", line 150, in __getitem__
    return self.host_buffer[key]
  File "/home/jon/minicondadai/lib/python3.6/site-packages/zict/buffer.py", line 78, in __getitem__
    return self.slow_to_fast(key)
  File "/home/jon/minicondadai/lib/python3.6/site-packages/zict/buffer.py", line 65, in slow_to_fast
    value = self.slow[key]
  File "/home/jon/minicondadai/lib/python3.6/site-packages/zict/func.py", line 38, in __getitem__
    return self.load(self.d[key])
  File "/home/jon/minicondadai/lib/python3.6/site-packages/distributed/protocol/serialize.py", line 505, in deserialize_bytes
    frames = merge_frames(header, frames)
  File "/home/jon/minicondadai/lib/python3.6/site-packages/distributed/protocol/utils.py", line 60, in merge_frames
    lengths = list(header["lengths"])
KeyError: 'lengths'
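
The traceback bottoms out in distributed's frame-merging helper: when a spilled value is read back through dask_cuda's DeviceHostFile and zict, deserialize_bytes hands the stored header to merge_frames, which looks up a "lengths" entry describing how the serialized frames were originally split. A header written without that entry (for example, by a mismatched serializer version) fails with exactly this KeyError. Below is a minimal sketch of that lookup, using the internal helper named in the traceback (internal API, so details may differ between distributed versions).

from distributed.protocol.utils import merge_frames

frames = [b"abc", b"def"]

# A header that records the original frame lengths merges cleanly.
merge_frames({"lengths": [3, 3]}, frames)

# A header missing the "lengths" entry fails the same way as the traceback above.
merge_frames({}, frames)   # KeyError: 'lengths'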

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 40 (12 by maintainers)

Top GitHub Comments

1 reaction
quasiben commented, Dec 4, 2020

I think it would be a large undertaking to patch/work around.

1 reaction
quasiben commented, Dec 4, 2020

To answer your question:

should I be able to use latest dask/distributed with old rapids 0.14?

I would not expect the latest dask/distributed to work that far back, as a lot of changes occurred in the serialization layers between Dask and RAPIDS. The errors you posted are probably a result of those changes. @jakirkham, do you have any thoughts here?
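
For context on that serialization point: distributed's protocol splits an object into a small header dict plus a list of binary frames, and that header is what later tells merge_frames how to reassemble spilled bytes. If the frames are written by one set of serializer hooks (say, an older RAPIDS/dask-cuda pairing) and read back by another, the header can lack fields the reader expects, such as "lengths". A small round-trip sketch of the serialize/deserialize pair (illustrative only; it uses a plain NumPy array rather than a cuDF object):

import numpy as np
from distributed.protocol import serialize, deserialize

# serialize() returns a header (metadata) plus a list of binary frames;
# deserialize() uses the header to rebuild the object from those frames.
header, frames = serialize(np.arange(10))
print(header)
obj = deserialize(header, frames)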
