DataLoader randomly crashes
Hi.
I found that the training process randomly crashes with RuntimeError: DataLoader worker (pid(s) 36469) exited unexpectedly. Is that normal?
I use the following training command:
python tracking/train.py --script stark_s --config baseline_got10k_only --save_dir . --mode multiple --nproc_per_node 8
Thanks!
Issue Analytics
- State:
- Created 2 years ago
- Comments:7

It seems the problem was caused by the memory limit of my Docker environment. After I increased the limit, the crashes stopped.
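For anyone who wants to check whether their own container is memory-constrained, here is a minimal sketch. It is an assumption about how one might inspect the limits, not something taken from the repository: it reads the standard Linux cgroup limit files and the size of /dev/shm, which DataLoader workers commonly use to pass tensors between processes. The comment above does not say which limit was raised, so both are reported. The function names are hypothetical helpers.

```python
import os
import shutil

# Standard cgroup locations for the container memory limit (v2 first, then v1).
CGROUP_LIMIT_FILES = (
    "/sys/fs/cgroup/memory.max",
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",
)


def parse_limit(text):
    """Return the limit in bytes, or None when the file says 'max' (no limit)."""
    text = text.strip()
    return None if text == "max" else int(text)


def read_memory_limit(paths=CGROUP_LIMIT_FILES):
    """Return the limit from the first readable cgroup file, else None."""
    for path in paths:
        try:
            with open(path) as f:
                return parse_limit(f.read())
        except (OSError, ValueError):
            continue  # file missing or unreadable: try the next location
    return None


def shm_total_bytes(path="/dev/shm"):
    """Return the size of the shared-memory mount, or None if it is absent."""
    if not os.path.isdir(path):
        return None
    return shutil.disk_usage(path).total


def format_gib(num_bytes):
    """Format a byte count as GiB, or 'unknown/unlimited' for None."""
    if num_bytes is None:
        return "unknown/unlimited"
    return f"{num_bytes / 2**30:.2f} GiB"


if __name__ == "__main__":
    print("container memory limit:", format_gib(read_memory_limit()))
    print("/dev/shm size:", format_gib(shm_total_bytes()))
```

If either value turns out to be small, the limits can be raised when the container is started, for example with the `--memory` and `--shm-size` options of `docker run`. Lowering the DataLoader's `num_workers` is another way to reduce memory pressure.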
Hello, what do you mean by "Docker env"?