More efficient resample module
See original GitHub issue.
The resampling function from Kaldi that torchaudio currently uses has some inefficient for loops and padding steps. I’ve put together an efficient module and evaluated its performance in this notebook: https://www.kaggle.com/smallyellowduck/fast-audio-resampling-layer-in-pytorch (code is in the notebook).
Edit: here are two separate comparisons of the resampling time, excluding the file load time.
Comparison 1: ‘kaiser_best’ settings in librosa vs the ‘kaiser_best’ setting in the efficient PyTorch resampler (should be the same setup):
librosa: 51 s
efficient pytorch resampler: 9 s
Comparison 2: the default settings in torchaudio vs window=‘hann’, num_zeros=6 in the efficient PyTorch resampler (should be the same setup):
torchaudio: 10 s
efficient pytorch resampler: 1 s
The performance improvement is most substantial when the input sample rate and output sample rate are not whole-number multiples of each other.
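One way to see why: a polyphase resampler needs a separate filter phase for every distinct fractional offset between the input and output sample grids, and that count grows when the two rates share few common factors. A minimal illustration (the helper `num_filter_phases` is hypothetical, not part of the proposed module):

```python
import math

def num_filter_phases(input_sr, output_sr):
    # One filter phase per distinct fractional offset between the grids:
    # output_sr / gcd(input_sr, output_sr).
    return output_sr // math.gcd(input_sr, output_sr)

# Whole-number ratio: 16 kHz -> 8 kHz needs a single phase.
print(num_filter_phases(16000, 8000))   # 1
# Non-integer ratio: 44.1 kHz -> 16 kHz needs 160 distinct phases.
print(num_filter_phases(44100, 16000))  # 160
```

The second case has far more filtering work per output sample, which is where loop-based implementations fall behind vectorized ones.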
I think it would be good for torchaudio to switch to the more efficient resample module.
Before making a PR, perhaps other people have feedback about what the API for the module should look like? I have largely tried to follow the API of the resample method in librosa. Any additional comments? @vincentqb
def __init__(self,
             input_sr, output_sr, dtype,
             num_zeros=64, cutoff_ratio=0.95, filter='kaiser', beta=14.0):
    """
    This creates an object that can apply a symmetric FIR filter
    based on torch.nn.functional.conv1d.

    Args:
        input_sr: The input sampling rate, as an integer.
        output_sr: The output sampling rate, as an integer.
        dtype: The torch dtype to use for computations.
        num_zeros: The number of zeros per side in the (sinc * Hanning window)
            filter function. More is more accurate, but 64 is already quite a lot.
        cutoff_ratio: The filter rolloff point as a fraction of the Nyquist frequency.
        filter: One of ['kaiser', 'kaiser_best', 'kaiser_fast', 'hann'].
        beta: Shape parameter for the 'kaiser' filter.
    """
    super().__init__()  # initialize the nn.Module base class
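For intuition about the num_zeros and cutoff_ratio parameters, here is a rough pure-Python sketch of how a windowed-sinc filter can be built; the function name and window-normalization details are my own assumptions, and the notebook's module uses its own conventions:

```python
import math

def hann_windowed_sinc(cutoff, num_zeros):
    """Symmetric FIR lowpass: an ideal sinc at normalized cutoff (a fraction
    of Nyquist, 0 < cutoff <= 1) multiplied by a Hann window, with num_zeros
    zero crossings per side. Assumes one tap per input sample."""
    half = num_zeros
    taps = []
    for i in range(-half, half + 1):
        x = math.pi * cutoff * i
        sinc = 1.0 if i == 0 else math.sin(x) / x
        # The Hann window tapers the ideal (infinite) sinc to finite length.
        window = 0.5 * (1.0 + math.cos(math.pi * i / (half + 1)))
        taps.append(cutoff * sinc * window)
    return taps

taps = hann_windowed_sinc(cutoff=0.95, num_zeros=6)
print(len(taps))  # 13: num_zeros crossings per side plus the center tap
```

Larger num_zeros means a longer, more accurate filter (hence the accuracy/speed trade-off noted in the docstring), and cutoff_ratio below 1.0 backs the rolloff away from Nyquist to reduce aliasing.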
Issue Analytics
- State: Closed
- Created: 3 years ago
- Reactions: 2
- Comments: 8 (5 by maintainers)
Top GitHub Comments
Hi @small-yellow-duck
We merged #1087, which makes resampling faster. The original implementation was a mostly verbatim translation of Kaldi’s C++ source code, which performs a lot of element-wise tensor access with indexing (like tensor[i]). This was not a good fit for how PyTorch works, so #1087 vectorized the internals and simplified the resampling operation to a convolution. This is very similar to the suggestion discussed here.
Closing as this is addressed in #1487. If you would like to provide feedback, please comment in #1487.
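To make the convolution formulation concrete, here is a hedged sketch (not the merged #1087 code) of integer-factor downsampling expressed as a single strided conv1d instead of per-sample indexed loops; the kernel construction and sizes are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

factor = 2       # 16 kHz -> 8 kHz
num_zeros = 6    # zero crossings per side of the sinc

# Hann-windowed sinc lowpass at the new Nyquist (cutoff = 1/factor).
i = torch.arange(-num_zeros * factor, num_zeros * factor + 1, dtype=torch.float32)
x = torch.pi * i / factor
sinc = torch.where(i == 0, torch.ones_like(x), torch.sin(x) / x)
window = torch.hann_window(len(i) + 2, periodic=False)[1:-1]
kernel = (sinc * window / factor).reshape(1, 1, -1)  # (out_ch, in_ch, taps)

signal = torch.randn(1, 1, 16000)  # one second of dummy audio at 16 kHz
# stride=factor keeps only every factor-th filtered sample: filtering and
# decimation happen in one vectorized op.
resampled = F.conv1d(signal, kernel, stride=factor, padding=num_zeros * factor)
print(resampled.shape)  # torch.Size([1, 1, 8000])
```

Non-integer rate ratios need one such kernel per filter phase (stacked into the conv's output channels, or applied after upsampling), but the per-sample Python loop disappears either way.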