
Batch size of 1 raises shape error during training.


A batch size of 1 raises a shape error during training. The same error therefore also occurs whenever the last batch of an epoch contains only one example.

What is the current behavior?

See above.

If the current behavior is a bug, please provide the steps to reproduce:

import numpy as np
from pytorch_tabnet.tab_model import TabNetClassifier

model = TabNetClassifier(
    n_d=2,
    n_a=2,
    n_steps=1,
)
model.fit(
    X_train=np.random.rand(101, 2),              # 101 % 4 == 1, so the last batch
    y_train=np.random.choice([0, 1], size=101),  # of each epoch holds a single sample
    batch_size=4,
    virtual_batch_size=4,
    drop_last=False,  # Setting this to True avoids the issue.
)
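
Why this setup fails: with 101 samples and batch_size=4, drop_last=False leaves a final batch of a single sample, which the network’s initial BatchNorm layer cannot normalize in training mode. A minimal sketch in plain PyTorch (independent of pytorch-tabnet, shapes chosen to match the snippet above) showing the size of that final batch:

import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.rand(101, 2))

# drop_last=False: the final batch holds 101 % 4 == 1 sample
sizes = [batch[0].shape[0] for batch in DataLoader(dataset, batch_size=4, drop_last=False)]
print(sizes[-1])  # 1

# drop_last=True: the incomplete final batch is discarded entirely
sizes = [batch[0].shape[0] for batch in DataLoader(dataset, batch_size=4, drop_last=True)]
print(sizes[-1])  # 4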

Expected behavior

Training should complete without raising a shape error.


Other relevant information:

  • poetry version: 1.1.11
  • python version: 3.8.12
  • Operating System: macOS 11.6
  • Additional tools: pytorch_tabnet 3.1.1, torch 1.9.1


Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 7

Top GitHub Comments

1 reaction
Optimox commented, Oct 27, 2021

@eduardocarvp yes we should probably do two things:

  • switch the default to drop_last=True
  • if drop_last=False and X.shape[0] % batch_size == 1, raise a clear error message (a sketch of such a check follows this comment)

I won’t work on this but I can review if you have time.
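
A minimal sketch of such a pre-fit check (validate_batch_size is a hypothetical helper, not part of the pytorch-tabnet API):

import numpy as np

def validate_batch_size(X, batch_size, drop_last):
    # Hypothetical guard: with drop_last=False, a remainder of exactly 1 means the
    # final batch contains a single sample, which BatchNorm rejects in training mode.
    if not drop_last and X.shape[0] % batch_size == 1:
        raise ValueError(
            f"batch_size={batch_size} with drop_last=False leaves a final batch of "
            f"1 sample ({X.shape[0]} % {batch_size} == 1); set drop_last=True or "
            f"choose a different batch_size."
        )

# validate_batch_size(np.random.rand(101, 2), batch_size=4, drop_last=False)  # would raise ValueError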

1 reaction
dekromp commented, Oct 26, 2021

@eduardocarvp No, I don’t use a batch size of 1, but a batch size of 1 can be used to reproduce the error too. Of course I could avoid this error with drop_last=True, which I do. My point is that it does not feel like intended behavior that drop_last defaults to False and training can then crash because the chosen batch size leaves a final batch of size 1. Really annoying for hyper-parameter tuning, as you can imagine. Anyhow, if a final batch of size 1 should generally be avoided, then I would suggest raising a proper error message that catches this beforehand, rather than only after training for a full epoch.

@Optimox The error message is:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/-/envs/tabnet/lib/python3.8/site-packages/pytorch_tabnet/abstract_model.py", line 223, in fit
    self._train_epoch(train_dataloader)
  File "/-/envs/tabnet/lib/python3.8/site-packages/pytorch_tabnet/abstract_model.py", line 434, in _train_epoch
    batch_logs = self._train_batch(X, y)
  File "/-/envs/tabnet/lib/python3.8/site-packages/pytorch_tabnet/abstract_model.py", line 469, in _train_batch
    output, M_loss = self.network(X)
  File "/-/envs/tabnet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/-/envs/tabnet/lib/python3.8/site-packages/pytorch_tabnet/tab_network.py", line 583, in forward
    return self.tabnet(x)
  File "/-/envs/tabnet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/-/envs/tabnet/lib/python3.8/site-packages/pytorch_tabnet/tab_network.py", line 468, in forward
    steps_output, M_loss = self.encoder(x)
  File "/-/envs/tabnet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/-/envs/tabnet/lib/python3.8/site-packages/pytorch_tabnet/tab_network.py", line 150, in forward
    x = self.initial_bn(x)
  File "/-/envs/tabnet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/-/envs/tabnet/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 168, in forward
    return F.batch_norm(
  File "/-/envs/tabnet/lib/python3.8/site-packages/torch/nn/functional.py", line 2280, in batch_norm
    _verify_batch_size(input.size())
  File "/-/envs/tabnet/lib/python3.8/site-packages/torch/nn/functional.py", line 2248, in _verify_batch_size
    raise ValueError("Expected more than 1 value per channel when training, got input size {}".format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 2])
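
The traceback shows the failure originating in the initial batch-norm layer: in training mode, BatchNorm cannot compute batch statistics from a single sample. A minimal sketch reproducing the same error with a bare BatchNorm1d layer (plain PyTorch, independent of pytorch-tabnet):

import torch
import torch.nn as nn

bn = nn.BatchNorm1d(2)   # two features, matching torch.Size([1, 2]) in the traceback
x = torch.rand(1, 2)     # a single-sample batch

bn.train()
# bn(x)                  # raises ValueError: Expected more than 1 value per channel when training

bn.eval()
print(bn(x).shape)       # eval mode uses running statistics, so a single sample is fine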


Top Results From Across the Web

Effect of batch size on training dynamics | by Kevin Shen
It has been empirically observed that smaller batch sizes not only have faster training dynamics but also better generalization to the test dataset ...

Can batch size cause incorrect predictions? - Stack Overflow
The answer is NO. The batch size on average only influences the speed of your learning, not the quality of learning. For the...

increase fit batch size: 1->2 => Incompatible shapes on model ...
So much for trying to fix colors. Here again: I'm hoping this is an obvious error. Training a siamese net, my data looks...

Error: Expected more than 1 value per channel when training
The error should be raised by batchnorm layers, which cannot calculate the batch statistics using a single sample. You should therefore either increase...

What is batch size in neural network? - Cross Validated
The batch size defines the number of samples that will be propagated through the network. For instance, let's say you have 1050 training...
