Stack error when using mini-batch size > 1
Describe the bug
On running the code below:
cl_strategy = EWC(
    model_ft, optimizer_ft, criterion, ewc_lambda=5,
    train_mb_size=32, train_epochs=4, eval_mb_size=32
)
we get the error shown below. Training works fine with train_mb_size=1 and eval_mb_size=1, but fails for any other value.
/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py in default_collate(batch)
53 storage = elem.storage()._new_shared(numel)
54 out = elem.new(storage)
---> 55 return torch.stack(batch, 0, out=out)
56 elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \
57 and elem_type.__name__ != 'string_':
RuntimeError: stack expects each tensor to be equal size, but got [3, 341, 500] at entry 0 and [3, 313, 500] at entry 1
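The failure is easy to reproduce in isolation: default_collate builds a mini-batch by calling torch.stack on the sample tensors, and torch.stack requires all of them to have identical shapes. A minimal demonstration using the two shapes from the traceback:

import torch

# Two CUB200-like patterns: 3-channel images with different heights.
a = torch.zeros(3, 341, 500)
b = torch.zeros(3, 313, 500)

# default_collate does the equivalent of this for every mini-batch;
# it raises "stack expects each tensor to be equal size".
batch = torch.stack([a, b], 0)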
To Reproduce
cl_strategy = EWC(
    model_ft, optimizer_ft, criterion, ewc_lambda=5,
    train_mb_size=32, train_epochs=4, eval_mb_size=32
)
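For context, here is a fuller reproduction sketch. Everything outside the EWC(...) call is an assumption, since the issue does not show how model_ft, optimizer_ft, and criterion were created; the benchmark choice and the EWC import path (which varies across Avalanche versions) are hypothetical as well.

# Assumed setup around the reported EWC call; only the EWC(...) arguments
# come from the issue itself.
import torch.nn as nn
import torch.optim as optim
from torchvision import models
from avalanche.benchmarks.classic import SplitCUB200
from avalanche.training import EWC  # import path varies by Avalanche version

benchmark = SplitCUB200()  # assumed scenario; expects the CUB-200-2011 data

model_ft = models.resnet18(num_classes=200)  # CUB200 has 200 bird classes
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)
criterion = nn.CrossEntropyLoss()

cl_strategy = EWC(
    model_ft, optimizer_ft, criterion, ewc_lambda=5,
    train_mb_size=32, train_epochs=4, eval_mb_size=32
)

for experience in benchmark.train_stream:
    cl_strategy.train(experience)  # raises the stack error for mb_size > 1
    cl_strategy.eval(benchmark.test_stream)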
Expected behavior
Training should proceed with the specified mini-batch size.
Issue Analytics
- Created 2 years ago
- Comments: 9 (4 by maintainers)
Top GitHub Comments
I inspected the CUB200 dataset and the problem is related to the fact that each pattern is a 3-channel image with a different height and width, so each pattern tensor has shape (3, H, W). The dataloader tries to build the mini-batch with torch.stack and fails since the dimensions differ. I think this could be fixed by forcing a center crop of the same dimensions on all patterns, or by padding. However, I don’t know if there are standard practices for working with this dataset. @lrzpellegrini, @vlomonaco do you have any hints on this?
Ok, I’ll try to check if a proper implementation of the dataset exists in the wild 😉
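For reference, one way to apply the fix suggested in the first comment is to force every image to a common size through the benchmark transforms. This is a sketch, assuming SplitCUB200 accepts torchvision-style train_transform/eval_transform arguments, as most classic Avalanche benchmarks do:

from torchvision import transforms
from avalanche.benchmarks.classic import SplitCUB200

# Resize + center crop gives every pattern the shape (3, 224, 224), so
# torch.stack in default_collate succeeds for any mini-batch size.
uniform_transform = transforms.Compose([
    transforms.Resize(256),       # scale the shorter side to 256 px
    transforms.CenterCrop(224),   # crop to a common (H, W)
    transforms.ToTensor(),
])

# Assumption: these keyword arguments exist on SplitCUB200, as they do on
# most classic Avalanche benchmarks.
benchmark = SplitCUB200(
    train_transform=uniform_transform,
    eval_transform=uniform_transform,
)

Padding each mini-batch to its largest (H, W) via a custom collate_fn would avoid cropping away image content, at the cost of batch shapes that vary across iterations.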