Info length and rate return different values for different backends
🐛 Bug
torchaudio.info returns the info objects directly from the respective backend. Because the property names are the same, users might forget to check how the metadata is calculated. As a result, the reported metadata differs depending on which backend is used.
E.g. sox calculates the length summed across channels, whereas soundfile does this per channel (correct).
I would propose adding a wrapper for the info objects so that, independent of the backend, the most important metadata (length and rate) is identical.
Currently, the sox backend reports a misleading length, and the rate parameter is of type float instead of int.
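A minimal sketch of what such a wrapper could look like, not the actual torchaudio implementation. The names NormalizedInfo and normalize_info are hypothetical, and the sketch assumes that each backend's info object exposes length, rate and channels, and that only the sox backend reports length summed across channels, as described above. Stand-in objects are used instead of real files so the example runs without torchaudio.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass(frozen=True)
class NormalizedInfo:
    """Hypothetical backend-independent view of the metadata."""
    length: int    # number of frames, i.e. samples per channel
    rate: int      # sample rate in Hz
    channels: int


def normalize_info(info, backend: str) -> NormalizedInfo:
    """Hypothetical helper that maps a backend-specific info object to NormalizedInfo.

    Assumption: `info` has `length`, `rate` and `channels` attributes, and the
    "sox" backend reports `length` summed across all channels.
    """
    length = int(info.length)
    channels = int(info.channels)
    if backend == "sox":
        # Convert the total sample count into a per-channel count.
        length //= channels
    return NormalizedInfo(length=length, rate=int(info.rate), channels=channels)


# Stand-ins for what each backend is described to report for the same
# one-second stereo file at 44.1 kHz (illustrative values, not real output).
sox_like = SimpleNamespace(length=88200, rate=44100.0, channels=2)
soundfile_like = SimpleNamespace(length=44100, rate=44100, channels=2)

print(normalize_info(sox_like, "sox"))
print(normalize_info(soundfile_like, "soundfile"))
```

With a wrapper along these lines, both calls yield the same length and an int rate, regardless of the backend.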
To Reproduce
import torchaudio

path = "any/wavfile.wav"
# soundfile
torchaudio.set_audio_backend("soundfile")
si, ei = torchaudio.info(path)  # signal info and encoding info
print(si.length)
print(type(si.rate))
# sox
torchaudio.set_audio_backend("sox")
si, ei = torchaudio.info(path)
print(si.length)
print(type(si.rate))
Expected behavior
soundfile reports the correct metadata; sox should be corrected so that:
# sox
torchaudio.set_audio_backend("sox")
si, ei = torchaudio.info(path)
print(si.length // si.channels)  # length per channel
print(int(si.rate))  # rate as int
Environment
torchaudio==0.5.0 from pypi
Issue Analytics
- State:
- Created 3 years ago
- Comments:19 (19 by maintainers)
Top GitHub Comments
Regarding the first issue, this is probably just a matter of setting the right vocabulary to make a formal distinction between frames and samples, as is done in libsndfile. Over there:
and
which makes total sense to me (also, soundfile is the de facto standard when it comes to proper handling of audio I/O). However, this would probably lead to too many changes here, but it makes sense to state the definition that is used here (“we define samples as the number of frames in an audio signal per channel”).
I agree this is probably the simplest solution
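To illustrate the frames-versus-samples vocabulary discussed above, here is a small sketch, assuming interleaved audio where one frame holds exactly one sample per channel. The helper count_frames is hypothetical and is not part of torchaudio or libsndfile.

```python
def count_frames(interleaved, num_channels):
    """Hypothetical helper: number of frames in an interleaved sample buffer.

    Assumption: a frame is one sample for each channel, so the total number
    of samples must be a multiple of the channel count.
    """
    if num_channels <= 0:
        raise ValueError("num_channels must be positive")
    if len(interleaved) % num_channels != 0:
        raise ValueError("buffer length is not a multiple of num_channels")
    return len(interleaved) // num_channels


# Stereo buffer laid out as L, R, L, R, L, R.
stereo = [0.1, -0.1, 0.2, -0.2, 0.3, -0.3]

total_samples = len(stereo)          # counted across both channels
frames = count_frames(stereo, 2)     # counted per channel
print(total_samples, frames)
```

Under this definition, the per-channel length is the frame count, and the length summed across channels is the total sample count.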
I started with a new test #639 that is expected to fail and can propose a fix for this as well (in the same PR?)
Closing the issue as the new backends handle this properly.