File objects work with librosa.load but not librosa.stream

Description

Loading a WAV file via tf.gfile.GFile works with librosa.load but not with librosa.stream. Interestingly, the same error appears regardless of whether the path points to a bucket or to a local file.

Steps/Code to Reproduce

import librosa as lr
import tensorflow as tf

path = 'gs://my-bucket/example.wav'

# Loading the entire file works as expected.
with tf.gfile.GFile(path, 'rb') as f:
    lr.load(f, sr=48000)

# The streaming method crashes.
with tf.gfile.GFile(path, 'rb') as f:
    assert f.seekable() is True
    for waveforms in lr.stream(f, block_length=128, frame_length=1024, hop_length=512):
        print(waveforms.shape)

Expected Results

No errors.

Actual Results

RuntimeError: Error opening : File contains data in an unknown format.

Versions

import platform; print(platform.platform())...
Linux-4.9.0-9-amd64-x86_64-with-debian-9.9
Python 3.5.3 (default, Sep 27 2018, 17:25:39) 
[GCC 6.3.0 20170516]
NumPy 1.17.0
SciPy 1.3.0
librosa 0.7.0

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

carlthome commented, Aug 7, 2019

And, curiously, this works fine:

import soundfile as sf
import tensorflow as tf

# `path` is the same WAV path used in the reproduction above.
with tf.gfile.GFile(path, 'rb') as f:
    for waveform in sf.blocks(f, blocksize=1024):
        pass

so perhaps it’s something with the overlapping blocks.
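For context on the overlap hypothesis, here is a rough sketch of how the three numeric arguments in the failing call relate to the block size. It assumes, from my reading of librosa's streaming interface rather than anything stated in this thread, that each block holds block_length frames and that consecutive blocks share frame_length - hop_length samples. The helper name block_geometry is hypothetical.

```python
def block_geometry(block_length, frame_length, hop_length):
    """Return (samples_per_block, overlap_in_samples) for a framed stream.

    Assumption: a block covers `block_length` frames of `frame_length`
    samples, each frame advanced by `hop_length` samples from the last.
    """
    samples_per_block = frame_length + (block_length - 1) * hop_length
    overlap = frame_length - hop_length
    return samples_per_block, overlap


# Values from the failing call: lr.stream(f, 128, 1024, 512)
samples, overlap = block_geometry(128, 1024, 512)
print(samples, overlap)
```

Under that assumption the plain sf.blocks(f, blocksize=1024) call differs from the failing call in two ways: it requests no overlap, and it is not preceded by any other read of the same file object.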

bmcfee commented, Aug 7, 2019

Thanks for tracking this down!

  • Require a sample rate to be input into librosa.stream and remove the sf.info call.

I don’t like this option, because it connotes that an automatic sampling rate conversion would be invoked. If you give the wrong sampling rate in the call, there’s no obvious recourse.

  • Check if path is a file object and call path.seek(0) after sf.info (hacky).

I agree that this is hacky, but I think it might be the best option.

  • Update PySoundFile’s sf.info function to reset the pointer or defensively copy its argument (perhaps unwanted behaviour, but cannot think of reasons why).

That’s a tricky one. Maybe it would be worth discussing with the soundfile devs?
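The second option above (rewinding a file object after the metadata probe) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the change that was eventually made in librosa: the wave module stands in for soundfile, and make_wav_bytes, probe_sample_rate and probe_then_rewind are hypothetical names introduced only for this example.

```python
import io
import wave


def make_wav_bytes(sr=48000, n_frames=4800):
    """Build a small silent mono 16-bit WAV in memory (demo data only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(sr)
        writer.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def probe_sample_rate(fileobj):
    """Stand-in for a header probe: reads the WAV header from `fileobj`.

    Like the probe discussed in the thread, it leaves the file pointer
    advanced past the header instead of restoring it.
    """
    with wave.open(fileobj, "rb") as reader:
        return reader.getframerate()


def probe_then_rewind(fileobj, probe):
    """Run `probe`, then rewind `fileobj` if it behaves like a file object."""
    result = probe(fileobj)
    if hasattr(fileobj, "seek"):
        fileobj.seek(0)  # the "hacky" fix: restore the read position
    return result


f = io.BytesIO(make_wav_bytes())
sr = probe_then_rewind(f, probe_sample_rate)
print(sr, f.tell())
```

Without the rewind, a second reader handed the same object starts in the middle of the file and no longer sees a valid header, which is consistent with the "unknown format" error reported above. With soundfile, the same wrapper would presumably go around the sf.info call before the object is passed on to sf.blocks.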
