Training with a Single GPU - CUDA out of memory.
Hey,
I have been trying to run the repo on a custom dataset for a while now. After some hassle, I believe I have the dataset prepared correctly.
However, I am now stuck running out of CUDA memory. Could you help me find out what to change in the configs to reduce video memory usage during training (e.g. mini-batching, etc.)?
I am currently using 2 samples per GPU with 1 GPU, using the bash code below (just to get the repo working):
for FOLD in 1;
do
bash tools/dist_train_partially.sh semi ${FOLD} 5 1
done
Thanks
Issue Analytics
- State:
- Created 2 years ago
- Comments:6
Top GitHub Comments
I have tried to do this, and I think the only solution in your case is to decrease the size of the input images.
https://github.com/microsoft/SoftTeacher/blob/bef9a256e5c920723280146fc66b82629b3ee9d4/configs/soft_teacher/base.py#L30
https://github.com/microsoft/SoftTeacher/blob/bef9a256e5c920723280146fc66b82629b3ee9d4/configs/soft_teacher/base.py#L80
https://github.com/microsoft/SoftTeacher/blob/bef9a256e5c920723280146fc66b82629b3ee9d4/configs/soft_teacher/base.py#L153
Change these lines.
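To illustrate the suggestion, here is a minimal sketch of shrinking the input resolution in an MMDetection-style config. It assumes the linked lines set an `img_scale` entry inside the data pipelines, which this thread does not confirm. The helper `shrink_img_scale` and the example pipeline values are hypothetical and are not copied from `base.py`.

```python
def shrink_img_scale(cfg, factor):
    """Return a copy of a nested config with every `img_scale` value scaled.

    Hypothetical helper: assumes image sizes live under the key `img_scale`,
    either as one (width, height) tuple or as a list of such tuples.
    """

    def scale(value):
        # A single (width, height) tuple of numbers.
        if isinstance(value, tuple) and all(isinstance(v, (int, float)) for v in value):
            return tuple(int(v * factor) for v in value)
        # A list (or tuple) of sizes, e.g. a multi-scale range.
        if isinstance(value, (list, tuple)):
            return type(value)(scale(v) for v in value)
        return value

    def walk(node):
        # Rebuild dicts and sequences so the original config is left untouched.
        if isinstance(node, dict):
            return {k: scale(v) if k == "img_scale" else walk(v) for k, v in node.items()}
        if isinstance(node, (list, tuple)):
            return type(node)(walk(v) for v in node)
        return node

    return walk(cfg)


# Illustrative pipeline only; the real values should be read from base.py.
pipeline = [
    dict(type="Resize", img_scale=[(1333, 400), (1333, 1200)], keep_ratio=True),
    dict(
        type="MultiScaleFlipAug",
        img_scale=(1333, 800),
        transforms=[dict(type="Resize", keep_ratio=True)],
    ),
]

smaller = shrink_img_scale(pipeline, 0.5)
print(smaller[0]["img_scale"])  # [(666, 200), (666, 600)]
print(smaller[1]["img_scale"])  # (666, 400)
```

Editing the numbers directly in the config file works just as well; the helper only makes the scaling explicit. Activation memory grows roughly with the number of input pixels, so halving both sides should reduce it considerably, usually at some cost in detection accuracy.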