RuntimeError: CUDA out of memory
I’m training Donut for document information extraction on a custom dataset of 100 training and 20 validation images. This is the config I used:
```yaml
resume_from_checkpoint_path: null
result_path: "./result"
pretrained_model_name_or_path: "naver-clova-ix/donut-base"
dataset_name_or_paths: ["/content/drive/MyDrive/donut_1.1"] # should be prepared from https://rrc.cvc.uab.es/?ch=17
sort_json_key: True
train_batch_sizes:
val_batch_sizes:
input_size: [2560, 1920]
max_length: 128
align_long_axis: False
# num_nodes: 8
num_nodes: 1
seed: 2022
lr: 3e-5
warmup_steps: 10000
num_training_samples_per_epoch: 39463
max_epochs: 300
max_steps: -1
num_workers: 8
val_check_interval: 1.0
check_val_every_n_epoch: 10
gradient_clip_val: 0.25
verbose: True
```
I’m getting this error:
```
RuntimeError: CUDA out of memory. Tried to allocate 76.00 MiB (GPU 0; 14.76 GiB total capacity; 13.48 GiB already allocated; 6.75 MiB free; 13.58 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
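The error text itself suggests trying `max_split_size_mb` to reduce allocator fragmentation. A minimal sketch of setting that option via the `PYTORCH_CUDA_ALLOC_CONF` environment variable (the value 128 is an illustrative guess, not a tuned setting):

```python
import os

# PYTORCH_CUDA_ALLOC_CONF is read by PyTorch's caching allocator; the
# simplest way to be sure it takes effect is to set it at the very top of
# the training script, before `import torch` and any CUDA allocation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"  # 128 is a guess

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Note that fragmentation is only worth chasing when reserved memory is far above allocated memory; here 13.58 GiB reserved vs. 13.48 GiB allocated suggests the model genuinely needs more memory than the 14.76 GiB card has.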
I tried clearing the torch cache with `torch.cuda.empty_cache()`; it didn’t help. Reducing the batch size didn’t help either.
I also tried a smaller dataset (50 train, 10 validation images), half the size of the earlier one, but the failed allocation is the same: 76.00 MiB.
Is there any way that I can solve this issue? Please help!
- Created 4 months ago
Top GitHub Comments
Thanks! Reducing the input_size from [2560, 1920] to [1920, 1280] helped.
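That tracks with a rough back-of-envelope: the encoder’s activation memory grows roughly with the number of input pixels, so this particular change halves the dominant memory term. Illustrative arithmetic only, not exact profiling:

```python
def pixel_ratio(old_hw, new_hw):
    """Ratio of input pixel counts; a crude proxy for encoder activation memory."""
    return (new_hw[0] * new_hw[1]) / (old_hw[0] * old_hw[1])

# [2560, 1920] -> [1920, 1280] halves the pixel count:
print(pixel_ratio([2560, 1920], [1920, 1280]))  # 0.5
```

This also explains why shrinking the dataset or the batch size didn’t move the numbers: the per-image activation footprint at a fixed input_size dominates, and it is identical for 100 images and for 50.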
Is there a way to decrease GPU memory consumption further? I want to fine-tune it on an 8 GB GPU.
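For an 8 GB card, the usual levers beyond shrinking input_size further are mixed precision and gradient checkpointing. A rough, illustrative estimate of what half precision alone buys on the weights (the ~200M parameter count used below is an assumption for donut-base, not a measured figure):

```python
def param_gib(n_params, bytes_per_param):
    """Memory for model weights alone, in GiB (ignores activations and optimizer state)."""
    return n_params * bytes_per_param / 1024**3

n = 200_000_000  # hypothetical parameter count; check the actual checkpoint
fp32_gib = param_gib(n, 4)
fp16_gib = param_gib(n, 2)
print(round(fp32_gib, 2), round(fp16_gib, 2))  # fp16 halves the weight memory
```

Keep in mind that Adam-style optimizers keep additional per-parameter state (two moment buffers), so optimizer memory is often larger than the weights themselves; gradient checkpointing then attacks the remaining activation term at the cost of extra compute.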