Shouldn't the val set use the train data augmentation? (The test set always uses the test transforms.)


Usually the val set is used for model selection, and sometimes the train and val sets are even combined for a final fit; in other words, the val set is carved out of the train side of the train/test split. Augmentation also helps models generalize better, so wouldn't it make more sense to use the version of the val set that might improve the model most?

By this logic, wouldn't it make more sense to do:

        train_dataset.transform = train_data_transforms
        valid_dataset.transform = train_data_transforms
        test_dataset.transform = test_data_transforms

instead?


Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

seba-1511 commented, Feb 26, 2022

That’s a good question, I think you could make the argument for using either train or test transforms for the validation tasks. The convention in the literature is to use test transforms, so we’re sticking to that.

brando90 commented, Nov 15, 2022

> That’s a good question, I think you could make the argument for using either train or test transforms for the validation tasks. The convention in the literature is to use test transforms, so we’re sticking to that.

I agree with using test transforms on validation. It reduces the variance of the validation estimate, and since you're not fitting on the validation data anyway, there is likely little benefit to early stopping with complicated train data augmentation applied to validation. It is better to have a low-variance estimate of an unknown distribution, so that you can early stop more precisely.
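The variance argument can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the "model" is a fixed threshold classifier, the "augmentation" is Gaussian jitter, and every name here (`evaluate`, `make_jitter`, `valid_samples`) is made up for illustration. The same fixed model is scored repeatedly on the same validation data, once with a deterministic transform and once with a random one.

```python
import random
import statistics


def evaluate(model_threshold, samples, transform):
    """Accuracy of a fixed threshold classifier on (x, label) pairs."""
    correct = 0
    for x, label in samples:
        prediction = 1 if transform(x) > model_threshold else 0
        correct += int(prediction == label)
    return correct / len(samples)


def identity(x):
    # Deterministic "test-style" transform: no augmentation.
    return x


def make_jitter(rng, scale):
    # Random "train-style" augmentation: additive Gaussian noise.
    def jitter(x):
        return x + rng.gauss(0.0, scale)
    return jitter


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Toy validation set: x uniform in [-1, 1], label is 1 when x > 0.
valid_samples = []
for _ in range(200):
    x = rng.uniform(-1.0, 1.0)
    valid_samples.append((x, 1 if x > 0 else 0))

repeats = 30
clean_scores = [evaluate(0.0, valid_samples, identity) for _ in range(repeats)]
noisy_scores = [
    evaluate(0.0, valid_samples, make_jitter(rng, 0.3)) for _ in range(repeats)
]

# Spread of the validation score across repeated evaluations of one model.
clean_spread = statistics.pstdev(clean_scores)
noisy_spread = statistics.pstdev(noisy_scores)
print("deterministic spread:", clean_spread)
print("augmented spread:", noisy_spread)
```

The deterministic scores are identical on every pass, so their spread is zero, while the augmented scores fluctuate even though the model never changes. That extra noise is what can blur the comparison between checkpoints when early stopping.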


Top Results From Across the Web

Data augmentation on training set only? - Cross Validated
The test set should remain untouched for selection purposes and very very carefully used likely only to report test errors. Do you agree...

Why Do We Need a Validation Set in Addition to Training and ...
The answer is that we also do some sort of testing using the validation test and testing shouldn't be done on the same...

Data augmentation in test/validation set? - Stack Overflow
Only on training. Data augmentation is used to increase the size of the training set and to get more different images.

What is the Difference Between Test and Validation Datasets?
A validation dataset is a sample of data held back from training your model that is used to give an estimate of model...

Why no augmentation applied to test or Validation data and ...
In my view, data augmentation is not necessary on val data and test data if the purpose of doing that is ONLY for...
