Discussion: deprecating / removing audioread?

Opening a conversation here to plan out how we want to handle deprecating and removing audioread as a dependency.

Background

When librosa development began, audioread was the best option available for handling audio input in diverse formats. We eventually added soundfile (libsndfile) as a secondary option for decoding, and later promoted that to a first-class dependency.

The current state of the project is that soundfile is the primary and preferred decoder package, and we fall back to audioread only when necessary (i.e., when given a file outside of soundfile’s codec support). Some functionality is already supported only when using soundfile, most notably stream processing, and there are requests for other features (e.g. #1434) which could expand this list.
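
The primary-then-fallback behavior described above can be sketched roughly as follows. This is a minimal illustration and not librosa's actual implementation: `load_with_fallback`, `DecodeError`, and the two decoder callables are hypothetical names, with the callables standing in for a soundfile-based reader and an audioread-based reader.

```python
from typing import Callable, List, Tuple

# Hypothetical types for illustration: a decoder takes a path and returns
# (samples, sample_rate).
Audio = Tuple[List[float], int]
Decoder = Callable[[str], Audio]


class DecodeError(Exception):
    """Hypothetical error raised by a decoder that cannot handle a file."""


def load_with_fallback(path: str, primary: Decoder, fallback: Decoder) -> Audio:
    """Try the preferred decoder first; use the fallback only if it fails."""
    try:
        return primary(path)
    except DecodeError:
        # The primary decoder does not support this codec, so defer to the
        # secondary decoder.
        return fallback(path)
```

Because the caller only sees the returned audio, which decoder actually ran is invisible unless it is logged or warned about, which is part of why backend differences can be hard to diagnose.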

For quite a while, we’ve been planning to deprecate and remove audioread as a backend, and streamline the library to only use soundfile for I/O. The primary roadblock to this has been the lack of mp3 support in libsndfile, which should be changing soon. So now seems like a good time to discuss how we plan to approach the deprecation.

Audioread vs soundfile codecs

It’s worth discussing what we stand to lose by abandoning audioread (and ffmpeg, gstreamer, pymad, etc). Since soundfile’s object interface is a bit richer than audioread’s, I believe this entirely comes down to supported audio codecs.

Undoubtedly, the combination of gstreamer, ffmpeg, mad, and coreaudio provides greater format coverage than what is implemented directly in libsndfile. (See, e.g., libavcodec for ffmpeg audio codecs.) My hunch is that for the common uses of librosa, it would be exceedingly rare to encounter a format (not including mp3) that would not be supported by libsndfile. The exception here might be pulling audio directly from audio-video streams (mp4, aac, etc.).

Why deprecate audioread?

There are three reasons that come to (my) mind in favor of deprecation:

  1. minimizing environment complexity
  2. functionality (e.g. stream processing)
  3. stability, speed, and (upstream) test coverage

Environment complexity

Audioread is designed to multiplex over different decoder backends, such as gstreamer or ffmpeg. This maximizes the supported audio codecs, but at the cost of having complex environments that are not easily managed by python. Support also differs across platforms, which can lead to unexpected differences in behavior. (GStreamer and ffmpeg both, in turn, multiplex over different modules, so not all environments using these packages will necessarily support the same formats.) Because these differences can be hidden by audioread (and further obscured by librosa), it can be challenging to diagnose problems.

Functionality

As far as I can tell, the soundfile API is strictly more complete than audioread in terms of what operations are supported. If all I/O is conducted through soundfile, this could open up some possibilities for a more functional and streamlined API from the librosa side as well. (If nothing else, it would be more internally consistent by virtue of not needing to support multiple decoder interfaces.)

Software health

I don’t mean to suggest that audioread is unstable, but its test coverage is extremely minimal.

Moreover, its design inherently interacts with multiple different backend libraries and programs, often through subprocess pipes. This makes some operations like offset-based load potentially inefficient compared to soundfile’s seek-based implementation.
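
To make the offset-load point concrete, here is a small sketch contrasting the two access patterns. It uses Python's standard-library `wave` module as a stand-in (an assumption made so the example runs without extra packages); soundfile exposes an analogous `seek()` on its `SoundFile` objects. The helper names `make_wav`, `read_by_seek`, and `read_by_discard` are hypothetical, and the example assumes mono audio.

```python
import io
import wave


def make_wav(num_frames: int, sample_rate: int = 8000) -> bytes:
    """Build a small mono 16-bit WAV in memory; frame i holds the value i."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(
            b"".join(i.to_bytes(2, "little", signed=True) for i in range(num_frames))
        )
    return buf.getvalue()


def read_by_seek(data: bytes, offset: int, count: int) -> bytes:
    """Jump straight to the offset frame, then read only what is needed."""
    with wave.open(io.BytesIO(data), "rb") as r:
        r.setpos(offset)
        return r.readframes(count)


def read_by_discard(data: bytes, offset: int, count: int, block: int = 256) -> bytes:
    """Emulate a pipe-style reader: read from the start, discarding early frames."""
    with wave.open(io.BytesIO(data), "rb") as r:
        remaining = offset
        while remaining > 0:
            chunk = r.readframes(min(block, remaining))
            if not chunk:
                break  # reached end of file before the offset
            remaining -= len(chunk) // r.getsampwidth()  # mono: bytes / width
        return r.readframes(count)
```

Both functions return the same frames, but the second one does work proportional to the offset before producing any output, which is the cost a pipe-based decoder may pay when it cannot seek.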

What timeline makes sense?

It would be great if we could have this deprecation cycle set to complete by the 1.0 release (aiming for 2023.01). This would mean deprecation in the 0.10 series (summer 2022). This is an aggressive timeline, but not impossible depending on how quickly things happen on the libsndfile side.

If that is not possible, I would suggest deprecating in 1.0 and removing in a future 2.0 release. (We haven’t yet begun discussing 2.0 features, but this seems like the kind of change that would be appropriate for a major revision.) This gives a longer runway for deprecation.
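
Either timeline implies a period in which the audioread path still works but warns users that it is going away. A minimal sketch of that mechanism, using only the standard-library `warnings` module, might look like the following. The `deprecated` decorator and `load_via_audioread` function are hypothetical names for illustration, and the version strings simply mirror the first proposal above.

```python
import functools
import warnings


def deprecated(since: str, removal: str):
    """Hypothetical decorator: warn on each call that a function is going away."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated as of version {since} "
                f"and will be removed in version {removal}.",
                category=FutureWarning,
                stacklevel=2,  # attribute the warning to the caller
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated(since="0.10", removal="1.0")
def load_via_audioread(path: str) -> str:
    # Placeholder body: a real implementation would decode the file here.
    return f"decoded {path}"
```

`FutureWarning` is used here because Python shows it to end users by default, whereas `DeprecationWarning` is hidden in most contexts; which category fits best is a judgment call.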

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 2
  • Comments: 8 (6 by maintainers)

Top GitHub Comments

4 reactions
constd commented, Mar 29, 2022

In support of moving further towards soundfile, libsndfile just released 1.1.0 with support for mp3 files.

2 reactions
bmcfee commented, Oct 6, 2022

The conda-forge packages have now all updated to libsndfile 1.1.0 with mp3 support on all platforms, so we can consider this unblocked.
