Allow specifying expected output length when resampling
Hi Alexandre 😃
Consider this scenario:
We have a machine learning model that takes in and outputs audio at a high sample rate. For several reasons (execution time among them), the model runs internally at a lower sample rate than its input and output. The output shape is expected to be the same as the input shape.
from julius import ResampleFrac
from torch import nn


class ResampleWrapper(nn.Module):
    """
    This class downsamples audio before it's passed to the audio denoiser model,
    and upsamples the audio back to the original sample rate before returning.
    """

    def __init__(
        self,
        model: nn.Module,
        input_sample_rate: int = 48_000,
        internal_sample_rate: int = 32_000,
    ):
        super().__init__()
        self.model = model
        self.input_sample_rate = input_sample_rate
        self.internal_sample_rate = internal_sample_rate
        self.downsampler = ResampleFrac(
            self.input_sample_rate, self.internal_sample_rate
        )
        self.upsampler = ResampleFrac(self.internal_sample_rate, self.input_sample_rate)

    def forward(self, x):
        """
        :param x: tensor with shape (batch_size, num_channels, num_samples)
        :return: tensor with shape (batch_size, num_channels, num_samples)
        """
        x = self.downsampler(x)
        x = self.model(x)
        x = self.upsampler(x)
        return x
Depending on the exact length of the input, downsampling and then upsampling will often give an output that is off by one sample in length.
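To make the off-by-one concrete, here is a small sketch of the length arithmetic only, with no actual resampling. It assumes the resampler returns floor(length * new_sr / old_sr) samples; that rounding rule and the helper names are assumptions for illustration, not something taken from the julius source.

```python
def resampled_length(num_samples: int, old_sr: int, new_sr: int) -> int:
    """Output length, ASSUMING the resampler rounds down (floor)."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return (num_samples * new_sr) // old_sr


def round_trip_length(
    num_samples: int, input_sr: int = 48_000, internal_sr: int = 32_000
) -> int:
    """Length after downsampling to internal_sr and upsampling back."""
    down = resampled_length(num_samples, input_sr, internal_sr)
    return resampled_length(down, internal_sr, input_sr)


if __name__ == "__main__":
    for n in (48_000, 48_001, 48_002):
        print(n, "->", round_trip_length(n))
```

Under that assumption, only input lengths that are a multiple of 3 survive the 48 kHz to 32 kHz round trip unchanged; the other lengths come back one sample short.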
In librosa, this kind of issue can be solved by setting the fix parameter or by using fix_length manually.
I imagine that julius.ResampleFrac could provide a parameter called something like expected_length in its forward function that adjusts the end offset of the final slice so that the result has the given length 😄
Would you like a pull request that adds this feature?
Hey @iver56, yes, the length of the sequence can definitely become problematic, especially with non-trivial sample rates and when trying to align with other transforms. I would be happy to accept a PR that allows choosing a custom length between floor(length * new_sr / old_sr) and ceil(length * new_sr / old_sr). It might be as easy as changing this line https://github.com/adefossez/julius/blob/main/julius/resample.py#L122 but it is possible that in some cases one needs to do extra padding on the right too, 3 lines above.

Yeah, I agree, it’s fine 😃 I was just confused initially. Maybe it’s a good idea to add an example of this use case (downsample, do something, then upsample) to the documentation?
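For reference, the floor/ceil range mentioned in the maintainer's reply can be checked with plain integer arithmetic. The sketch below is only an illustration: output_length_bounds and check_expected_length are hypothetical helper names, not part of julius, and the code does no resampling.

```python
from typing import Tuple


def output_length_bounds(length: int, old_sr: int, new_sr: int) -> Tuple[int, int]:
    """Return (floor, ceil) of length * new_sr / old_sr using integers only."""
    lower = (length * new_sr) // old_sr
    # Ceiling division via negated floor division.
    upper = -((-length * new_sr) // old_sr)
    return lower, upper


def check_expected_length(
    length: int, old_sr: int, new_sr: int, expected_length: int
) -> int:
    """Hypothetical validation: accept only lengths inside [floor, ceil]."""
    lower, upper = output_length_bounds(length, old_sr, new_sr)
    if not lower <= expected_length <= upper:
        raise ValueError(
            f"expected_length={expected_length} is outside [{lower}, {upper}]"
        )
    return expected_length


if __name__ == "__main__":
    # 32_001 samples at 32 kHz map to 48_001.5 samples at 48 kHz.
    print(output_length_bounds(32_001, 32_000, 48_000))  # (48001, 48002)
    # 32_000 samples map to exactly 48_000, so the range is a single value.
    print(output_length_bounds(32_000, 32_000, 48_000))  # (48000, 48000)
```

One thing this makes visible: a 48_001-sample input that was reduced to 32_000 samples cannot be restored to 48_001 within this range, since both bounds are 48_000. That may be one of the cases where extra padding on the right is needed.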