Augmentation Enhancements
This issue is a continuation of #404 and #406. As mentioned in the Issue/PR above, there is room for improvement regarding the augmentation pipeline.
Before going in one direction or another, we should clarify one important question: Do we want to continue applying augmentations batch-wise or do we want to change to sample-wise augmentations?
In the vision domain, transforms are defined per sample, and the samples are loaded and transformed by multiple CPU cores in parallel. This is done by using standard dataloaders with multiple workers and implementing the transform in the `__getitem__` function of the dataset.
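A minimal sketch of that torchvision-style pattern, with toy data and a hypothetical dataset class (none of this is braindecode code):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class WindowedEEGDataset(Dataset):
    """Toy dataset applying a per-sample transform inside __getitem__."""
    def __init__(self, windows, labels, transform=None):
        self.windows = windows          # (n_windows, n_channels, n_times)
        self.labels = labels
        self.transform = transform      # callable acting on a single window

    def __len__(self):
        return len(self.windows)

    def __getitem__(self, idx):
        x = self.windows[idx]
        if self.transform is not None:
            x = self.transform(x)       # runs inside the dataloader worker, on CPU
        return x, self.labels[idx]

def add_noise(x):
    # toy per-sample augmentation
    return x + 0.1 * torch.randn_like(x)

windows = torch.randn(128, 22, 1000)    # toy EEG windows
labels = torch.randint(0, 4, (128,))

# num_workers > 0 parallelizes loading *and* augmentation across CPU processes
loader = DataLoader(WindowedEEGDataset(windows, labels, transform=add_noise),
                    batch_size=64, shuffle=True, num_workers=4)
```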
If we want to keep doing batch-wise transforms, augmenting on the GPU would probably make sense. For batch-wise transforms there are two options: apply the augmentation with batch-wise similar augmentation parameters, or apply it with sample-wise different augmentation parameters. E.g. `FTSurrogate` does the first, and `GaussianNoise` does the second.
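To illustrate the two options with a toy stand-in augmentation (a random gain, not an actual braindecode transform):

```python
import torch

x = torch.randn(64, 22, 1000)   # toy batch: (n_windows, n_channels, n_times)

# Batch-wise similar parameters: one random gain shared by every window in the batch
gain = 1 + 0.1 * (2 * torch.rand(1) - 1)
x_batchwise = x * gain

# Sample-wise different parameters: one random gain per window, broadcast over channels and time
gains = 1 + 0.1 * (2 * torch.rand(x.shape[0], 1, 1) - 1)
x_samplewise = x * gains
```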
The core question in my opinion is two-fold:
- Model performance: do sample-wise augmentations improve the model performance? By how much?
- Computational efficiency: how do the three methods compare to each other (sample-wise (CPU), batch-wise similar (GPU), batch-wise different (GPU))? A rough timing sketch is given below.
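For the efficiency part, a rough way to time the variants could look like this (hypothetical helper with a toy noise augmentation; GPU timings need an explicit synchronize):

```python
import time
import torch

def time_augmentation(aug, x, n_repeats=100):
    """Average wall-clock time per call of an augmentation on a fixed batch."""
    if x.is_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n_repeats):
        aug(x)
    if x.is_cuda:
        torch.cuda.synchronize()    # wait for queued GPU kernels before stopping the clock
    return (time.perf_counter() - start) / n_repeats

def add_noise(x):
    return x + 0.1 * torch.randn_like(x)

x_cpu = torch.randn(64, 22, 1000)
print("CPU:", time_augmentation(add_noise, x_cpu))
if torch.cuda.is_available():
    print("GPU:", time_augmentation(add_noise, x_cpu.cuda()))
```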
I guess that the batch-wise different (GPU) method would in general be the best trade-off between performance and speed (faster than sample-wise on CPU, better performing than batch-wise similar on GPU), but that's just a wild guess. The problem I see with this method is that some augmentations could be very hard to implement without a for loop, which would hurt the speed (although broadcasting can avoid the loop in some cases, as sketched below). One other thing to keep in mind is that the sample-wise (CPU) implementation would be very easy and seems to be the way to go judging from the torchvision library, so it might also be sufficiently fast.
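That said, some seemingly loop-bound augmentations can be vectorized with broadcasting. A simplified hard time mask with a different start per window (a crude stand-in for something like SmoothTimeMask, not the actual implementation) could look like this:

```python
import torch

x = torch.randn(64, 22, 1000)   # (n_windows, n_channels, n_times)
mask_len = 100

# One random mask start per window, without a Python loop over the batch
starts = torch.randint(0, x.shape[-1] - mask_len, (x.shape[0], 1, 1))
t = torch.arange(x.shape[-1]).view(1, 1, -1)
keep = (t < starts) | (t >= starts + mask_len)   # True outside each window's masked segment
x_masked = x * keep                              # a different segment is zeroed in every window
```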
@bruAristimunha @cedricrommel what are your thoughts on this one?
Top GitHub Comments
@bruAristimunha basically everywhere the numpy rng is used to create large random vectors. Apart from your suggestions, I would add the new version of `ft_surrogate` (#409), which currently still uses the numpy rng to generate the random phase. Further, we should also take the `get_augmentation_params` functions of the following transforms into account:
- `BandStopFilter`
- `FrequencyShift`
- `SensorRotation`
- `Mixup`
- `SmoothTimeMask`
With the `bandstop_filter` you mentioned I see one major problem, which is the `notch_filter` function from mne. This function only works with numpy arrays, so I would leave this one out for now, unless anyone wants to dig into torchaudio or something like that to replace mne.

Hello @martinwimpff, thanks for the interest in the data augmentation submodule 😃
Concerning the Dataloader vs Dataset augmentation (torchvision-like):
This question has already been discussed quite deeply in the past (#254). It led to an implementation which is compatible with both (see for example the last cell in this tutorial). So basically today you can do both with the current implementation: either use `AugmentedDataloader`, or just a regular `Dataloader` with a `transform` defined at the level of the `Dataset`.
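A rough usage sketch of the two routes, assuming an existing `windows_dataset`; the exact signatures of `AugmentedDataLoader` and `FrequencyShift`, as well as the dataset-level `transform` attribute, should be checked against the tutorial linked above:

```python
from torch.utils.data import DataLoader
from braindecode.augmentation import AugmentedDataLoader, FrequencyShift

# the sfreq value is a placeholder for the sampling rate of the data at hand
transform = FrequencyShift(probability=0.5, sfreq=250, max_delta_freq=2.)

# Route 1: dedicated loader applying the transforms batch by batch on the collated tensors
aug_loader = AugmentedDataLoader(windows_dataset, transforms=[transform], batch_size=64)

# Route 2: regular DataLoader, transform attached to the dataset and applied per sample
windows_dataset.transform = transform
plain_loader = DataLoader(windows_dataset, batch_size=64, shuffle=True)
```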
Sample-wise vs batch-wise augmentation
Augmentations defined in braindecode are all random, so the way each sample in a batch is augmented varies, even when using `AugmentedDataloader` with predefined parameters for the augmentation. For example, if I use `FrequencyShift` with a `max_delta_freq` of 2, each EEG window in a batch will have its PSD shifted by a random quantity sampled uniformly within $\pm 2$ Hz. So the augmentation is not done batch-wise today with `AugmentedDataloader`. I also don't believe it would be better to transform all samples in a batch in the same way (e.g. translating them all with an equal frequency shift), since it would generate less variability. But this is just an intuition.

If I misunderstood something and by sample-wise augmentation you mean using different operations per sample, then you might want to have a look at:
GPU vs CPU
@bruAristimunha showed that (well-implemented) augmentations could indeed run faster on GPU. This is good enough as far as I am concerned 😃 But if we want to do better, we should implement the poorly performing transforms highlighted by Bruno in a GPU-friendly way. There is also the concern about the RNG location that you mentioned in #406. Changing to torch.Generators could break a lot of things I think, so my 2 cents on this would be to only do it if it is really computationally worthwhile.
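For reference, a minimal sketch of what a torch-native random-phase draw for `ft_surrogate` could look like (hypothetical helper; note that a torch.Generator is bound to a device, so CUDA draws would need a CUDA generator):

```python
import math
import torch

def draw_surrogate_phases(n_windows, n_freqs, device="cpu", rng=None):
    """Draw random phases in [0, 2*pi) with a torch.Generator instead of the numpy rng."""
    if rng is None:
        rng = torch.Generator(device=device)
    return 2 * math.pi * torch.rand(n_windows, n_freqs, generator=rng, device=device)

g = torch.Generator().manual_seed(42)   # reproducible CPU draws
phases = draw_surrogate_phases(8, 513, device="cpu", rng=g)
```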