LocalCudaCluster freezes when trying neural network prediction

Hi, I am new to dask and I am trying to write a workflow to run inference on large images. I have attached the code I've been using, which should reproduce the issue I am facing.

Basically, when I use the distributed client with processes=False, or when I do not use a distributed scheduler at all, I am able to run inference on my data. However, when I try to use LocalCUDACluster as the scheduler, I run into issues.

In general, the process crashes and doesn't complete.
I have tried it with 1 GPU and with 2 GPUs, using a single thread and multiple threads per GPU.
It does seem to work for a subset of the data, though much more slowly, but not with my full data (the subset is controlled by dim0 in the size param on line 83).
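
For readers without the attachment, the three scheduler setups described above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached zip: `make_client` and its `mode` names are hypothetical, and the imports are deferred so the snippet loads even where `dask.distributed` or `dask_cuda` is not installed.

```python
def make_client(mode, n_gpus=1, threads_per_worker=1):
    """Return a Dask client for one of the setups described in the issue.

    Hypothetical helper for illustration only; the mode names are assumptions.
    """
    if mode == "default":
        # No distributed scheduler: Dask falls back to its local default.
        return None
    if mode == "threads":
        # Reported as working: distributed client running in-process.
        from dask.distributed import Client
        return Client(processes=False)
    if mode == "cuda":
        # Reported as freezing: one worker process per GPU.
        from dask.distributed import Client
        from dask_cuda import LocalCUDACluster
        cluster = LocalCUDACluster(
            n_workers=n_gpus,
            threads_per_worker=threads_per_worker,
        )
        return Client(cluster)
    raise ValueError(f"unknown mode: {mode!r}")
```

Calling `make_client("cuda", n_gpus=2, threads_per_worker=1)` would correspond to the two-GPU, single-thread case; it requires `dask_cuda` and at least two visible GPUs.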

Quite possibly I'm doing something incorrectly. The attached code should help reproduce this.

Thanks for your help figuring this out.

Anas

Test_prediction.zip

Issue Analytics

State:
Created 3 years ago
Comments: 18 (9 by maintainers)

Top GitHub Comments

anaszain89 commented, Jun 9, 2020

Oh yes, to reduce the overall memory for testing, you could reduce the bsz parameter to 8. This brings memory consumption down to ~18 GB or so.

I will test with the latest version and circle back.

quasiben commented, Dec 17, 2020

Closing. @anaszain89, if you are still running into issues, feel free to reopen.
