Augmentation Enhancements
This issue is a continuation of #404 and #406. As mentioned in the Issue/PR above, there is room for improvement regarding the augmentation pipeline.
Before going in one direction or another, we should clarify one important question: Do we want to continue applying augmentations batch-wise or do we want to change to sample-wise augmentations?
In the vision domain, transforms are defined per sample, and the samples are loaded and transformed by multiple CPU cores in parallel. This is done by using standard dataloaders with multiple workers and implementing the transform in the `__getitem__` function of the dataset.
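A minimal sketch of that torchvision-style pattern, with toy data and a hypothetical dataset class (none of this is braindecode code):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class WindowedEEGDataset(Dataset):
    """Toy dataset applying a per-sample transform inside __getitem__."""
    def __init__(self, windows, labels, transform=None):
        self.windows = windows          # (n_windows, n_channels, n_times)
        self.labels = labels
        self.transform = transform      # callable acting on a single window

    def __len__(self):
        return len(self.windows)

    def __getitem__(self, idx):
        x = self.windows[idx]
        if self.transform is not None:
            x = self.transform(x)       # runs inside the dataloader worker, on CPU
        return x, self.labels[idx]

def add_noise(x):
    # toy per-sample augmentation
    return x + 0.1 * torch.randn_like(x)

windows = torch.randn(128, 22, 1000)    # toy EEG windows
labels = torch.randint(0, 4, (128,))

# num_workers > 0 parallelizes loading *and* augmentation across CPU processes
loader = DataLoader(WindowedEEGDataset(windows, labels, transform=add_noise),
                    batch_size=64, shuffle=True, num_workers=4)
```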
If we want to keep doing batch-wise transforms, augmenting on the GPU would probably make sense. For batch-wise transforms there are two options: apply the augmentation with batch-wise similar augmentation parameters, or apply it with sample-wise different augmentation parameters. E.g. `FTSurrogate` does the first, and `GaussianNoise` does the second.
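To illustrate the two options with a toy stand-in augmentation (a random gain, not an actual braindecode transform):

```python
import torch

x = torch.randn(64, 22, 1000)   # toy batch: (n_windows, n_channels, n_times)

# Batch-wise similar parameters: one random gain shared by every window in the batch
gain = 1 + 0.1 * (2 * torch.rand(1) - 1)
x_batchwise = x * gain

# Sample-wise different parameters: one random gain per window, broadcast over channels and time
gains = 1 + 0.1 * (2 * torch.rand(x.shape[0], 1, 1) - 1)
x_samplewise = x * gains
```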
The core question in my opinion is two-fold:
- Model performance: do sample-wise augmentations improve the model performance? By how much?
- Computational efficiency: how do the three methods compare to each other (sample-wise (CPU), batch-wise similar (GPU), batch-wise different (GPU))? A rough timing sketch is given below.
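For the efficiency part, a rough way to time the variants could look like this (hypothetical helper with a toy noise augmentation; GPU timings need an explicit synchronize):

```python
import time
import torch

def time_augmentation(aug, x, n_repeats=100):
    """Average wall-clock time per call of an augmentation on a fixed batch."""
    if x.is_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n_repeats):
        aug(x)
    if x.is_cuda:
        torch.cuda.synchronize()    # wait for queued GPU kernels before stopping the clock
    return (time.perf_counter() - start) / n_repeats

def add_noise(x):
    return x + 0.1 * torch.randn_like(x)

x_cpu = torch.randn(64, 22, 1000)
print("CPU:", time_augmentation(add_noise, x_cpu))
if torch.cuda.is_available():
    print("GPU:", time_augmentation(add_noise, x_cpu.cuda()))
```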
I guess that the batch-wise different (GPU) method would in general be the best trade-off between performance and speed (faster than sample-wise on CPU, better performing than batch-wise similar on GPU), but that's just a wild guess. The problem I see with this method is that some augmentations could be very hard to implement without a for loop, which would hurt the speed (although broadcasting can avoid the loop in some cases, as sketched below). One other thing to keep in mind is that the sample-wise (CPU) implementation would be very easy and seems to be the way to go judging from the torchvision library, so it might also be sufficiently fast.
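That said, some seemingly loop-bound augmentations can be vectorized with broadcasting. A simplified hard time mask with a different start per window (a crude stand-in for something like SmoothTimeMask, not the actual implementation) could look like this:

```python
import torch

x = torch.randn(64, 22, 1000)   # (n_windows, n_channels, n_times)
mask_len = 100

# One random mask start per window, without a Python loop over the batch
starts = torch.randint(0, x.shape[-1] - mask_len, (x.shape[0], 1, 1))
t = torch.arange(x.shape[-1]).view(1, 1, -1)
keep = (t < starts) | (t >= starts + mask_len)   # True outside each window's masked segment
x_masked = x * keep                              # a different segment is zeroed in every window
```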
@bruAristimunha @cedricrommel what are your thoughts on this one?
Top GitHub Comments
@bruAristimunha basically everywhere the numpy rng is used to create large random vectors. Apart from your suggestions, I would add the new version of `ft_surrogate` (#409), which currently still uses the numpy rng to generate the random phase. Further, we should also take the `get_augmentation_params` functions of the following transforms into account:
- `BandStopFilter`
- `FrequencyShift`
- `SensorRotation`
- `Mixup`
- `SmoothTimeMask`
With the `bandstop_filter` you mentioned I see one major problem, which is the `notch_filter` function from mne. This function only works with numpy arrays, so I would leave this one out for now, unless anyone wants to dig into torchaudio or something like that to replace mne.

Hello @martinwimpff, thanks for the interest in the data augmentation submodule 😃
Concerning the Dataloader vs Dataset augmentation (torchvision-like):
This question has already been discussed quite deeply in the past (#254). It led to an implementation which is compatible with both (see for example the last cell in this tutorial). So basically today you can do both with the current implementation: either use `AugmentedDataloader`, or just a regular `Dataloader` with a `transform` defined at the level of the `Dataset`.
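A rough usage sketch of the two routes, assuming an existing `windows_dataset`; the exact signatures of `AugmentedDataLoader` and `FrequencyShift`, as well as the dataset-level `transform` attribute, should be checked against the tutorial linked above:

```python
from torch.utils.data import DataLoader
from braindecode.augmentation import AugmentedDataLoader, FrequencyShift

# the sfreq value is a placeholder for the sampling rate of the data at hand
transform = FrequencyShift(probability=0.5, sfreq=250, max_delta_freq=2.)

# Route 1: dedicated loader applying the transforms batch by batch on the collated tensors
aug_loader = AugmentedDataLoader(windows_dataset, transforms=[transform], batch_size=64)

# Route 2: regular DataLoader, transform attached to the dataset and applied per sample
windows_dataset.transform = transform
plain_loader = DataLoader(windows_dataset, batch_size=64, shuffle=True)
```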
Sample-wise vs batch-wise augmentation
Augmentations defined in braindecode are all random, so the way each sample in a batch is augmented varies, even when using `AugmentedDataloader` with predefined parameters for the augmentation. For example, if I use `FrequencyShift` with a `max_delta_freq` of 2, each EEG window in a batch will have its PSD shifted by a random quantity sampled uniformly within $\pm 2$ Hz. So the augmentation is not done batch-wise today with `AugmentedDataloader`. I also don't believe it would be better to transform all samples in a batch in the same way (e.g. translating them all with an equal frequency shift), since it would generate less variability. But this is just an intuition.

If I misunderstood something and by sample-wise augmentation you mean using different operations per sample, then you might want to have a look at:
GPU vs CPU
@bruAristimunha showed that (well-implemented) augmentations could indeed run faster on GPU. This is good enough as far as I am concerned 😃 But if we want to do better, we should implement the poorly performing transforms highlighted by Bruno in a GPU-friendly way. There is also the concern about the RNG location that you mentioned in #406. Changing to torch.Generators could break a lot of things I think, so my 2 cents on this would be to only do it if it is really computationally worthwhile.
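For reference, a minimal sketch of what a torch-native random-phase draw for `ft_surrogate` could look like (hypothetical helper; note that a torch.Generator is bound to a device, so CUDA draws would need a CUDA generator):

```python
import math
import torch

def draw_surrogate_phases(n_windows, n_freqs, device="cpu", rng=None):
    """Draw random phases in [0, 2*pi) with a torch.Generator instead of the numpy rng."""
    if rng is None:
        rng = torch.Generator(device=device)
    return 2 * math.pi * torch.rand(n_windows, n_freqs, generator=rng, device=device)

g = torch.Generator().manual_seed(42)   # reproducible CPU draws
phases = draw_surrogate_phases(8, 513, device="cpu", rng=g)
```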