Shouldn't the val set use the train data augmentation? (The test set always uses the test transforms.)
Usually the val set is used to choose a model, or is even combined with the train set, i.e. the val set comes out of the train side of the train/test split. Also, augmentation helps models perform better, so wouldn’t it make more sense for the val set to use the version of the transforms that might improve the model most?
With this logic wouldn’t it make more sense to do:
train_dataset.transform = train_data_transforms
valid_dataset.transform = train_data_transforms
test_dataset.transform = test_data_transforms
instead?
refs:
Issue Analytics
- State:
- Created 2 years ago
- Comments:5 (1 by maintainers)
Top GitHub Comments
That’s a good question. I think you could make the argument for using either train or test transforms for the validation tasks. The convention in the literature is to use test transforms, so we’re sticking to that.
I agree with using test transforms on validation. It reduces the variance of the validation estimate, and since you’re not fitting on the validation data anyway, there is likely little benefit to early stopping with complicated train data augmentation for validation. Better to have a low-variance estimate of an unknown distribution so you can early stop more precisely.