Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might be looking while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Stack error on using mini-batch size > 1

See original GitHub issue

Describe the bug
On running the code below:

cl_strategy = EWC(
    model_ft, optimizer_ft, criterion, ewc_lambda=5,
    train_mb_size=32, train_epochs=4, eval_mb_size=32
)

We get the error shown below. Training runs fine with train_mb_size=1 and eval_mb_size=1, but fails for any other value.

/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py in default_collate(batch)
     53             storage = elem.storage()._new_shared(numel)
     54             out = elem.new(storage)
---> 55         return torch.stack(batch, 0, out=out)
     56     elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \
     57             and elem_type.__name__ != 'string_':

RuntimeError: stack expects each tensor to be equal size, but got [3, 341, 500] at entry 0 and [3, 313, 500] at entry 1
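
The failure itself is easy to reproduce outside the dataloader: default_collate builds the minibatch with torch.stack, and torch.stack requires every tensor in the batch to have the same shape. A minimal illustration using the two shapes reported in the traceback:

import torch

# Two image tensors with the shapes reported in the traceback above:
# same channel count and width, different heights.
a = torch.randn(3, 341, 500)
b = torch.randn(3, 313, 500)

torch.stack([a, b], 0)
# RuntimeError: stack expects each tensor to be equal size,
# but got [3, 341, 500] at entry 0 and [3, 313, 500] at entry 1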

To Reproduce

cl_strategy = EWC(
    model_ft, optimizer_ft, criterion, ewc_lambda=5,
    train_mb_size=32, train_epochs=4, eval_mb_size=32
)

Expected behavior
It should train with the specified mini-batch size.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

1 reaction
AndreaCossu commented, Apr 12, 2021

I inspected the CUB200 dataset and the problem is related to the fact that each pattern is a 3-channel image with a different height and width. So, each pattern tensor has shape (3, H, W). The dataloader tries to build the minibatch with stack and fails since the dimensions differ.
I think this could be fixed by forcing a center crop of the same dimensions on all patterns, or by padding. However, I don’t know if there are standard practices for working with this dataset. @lrzpellegrini, @vlomonaco, do you have any hints on this?
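
Both workarounds amount to giving every pattern the same spatial size before collation. A minimal sketch of the two options follows; the 256/224 sizes, the normalization statistics, and the pad_collate helper are illustrative assumptions, not code from the thread.

import torch
from torchvision import transforms

# Option 1 (sketch): give every pattern the same spatial size so the default
# collate function can stack them. The 256/224 sizes and the normalization
# statistics are assumed values for illustration.
fixed_size_transform = transforms.Compose([
    transforms.Resize(256),       # shorter side -> 256, aspect ratio preserved
    transforms.CenterCrop(224),   # equal H and W for every image
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Option 2 (sketch): keep the native resolutions and pad each image up to the
# largest H and W in the minibatch, e.g. as a collate_fn for a plain PyTorch
# DataLoader. Assumes each item is an (image_tensor, label) pair.
def pad_collate(batch):
    images, labels = zip(*batch)
    max_h = max(img.shape[1] for img in images)
    max_w = max(img.shape[2] for img in images)
    padded = images[0].new_zeros(len(images), images[0].shape[0], max_h, max_w)
    for i, img in enumerate(images):
        padded[i, :, :img.shape[1], :img.shape[2]] = img
    return padded, torch.tensor(labels)

How either option is wired into the Avalanche CUB200 benchmark depends on the transform and dataloader hooks it exposes, so treat the above as a sketch of the idea rather than a drop-in fix.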

0 reactions
lrzpellegrini commented, Apr 12, 2021

Ok, I’ll try to check if a proper implementation of the dataset exists in the wild 😉

Read more comments on GitHub >

Top Results From Across the Web

Why mini batch size is better than one single "batch" with all ...
Training with large minibatches is bad for your health. More importantly, it's bad for your test error. Friends don't let friends use minibatches...

SDG with batch size >1? - pytorch - Stack Overflow
No. Batch size = 20 means, it would process all the 20 samples and then get the scalar loss. Based on that it...

A Gentle Introduction to Mini-Batch Gradient Descent and How ...
Large values give a learning process that converges slowly with accurate estimates of the error gradient. Tip 1: A good default for batch...

ML | Mini-Batch Gradient Descent with Python - GeeksforGeeks
Make predictions on the mini-batch; Compute error in predictions ... Step #1: First step is to import dependencies, generate data for linear ...

What's the Optimal Batch Size to Train a Neural Network?
When you put m examples in a minibatch, you need to do O(m) computation and use O(m) memory, but you reduce the amount...
