Error when running "Quick Tour" code snippets
Environment info
- transformers version: 4.9.2
- Platform: Linux-5.13.0-39-generic-x86_64-with-glibc2.17
- Python version: 3.8.11
- PyTorch version (GPU?): 1.9.1 (True)
- Tensorflow version (GPU?): 2.6.0 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: Parallel
@sgugger @patrickvonplaten @anton-l @Narsil
Information
Model I am using: wav2vec2
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The task I am working on is:
- an official GLUE/SQUaD task: (give the name)
- my own task or dataset: (give details below)
Hey, I’m new to Transformers so pardon me if this issue has an obvious fix I can’t think of. I was trying to go through the Quick Tour (https://huggingface.co/docs/transformers/quicktour), and I encountered an error when running the code snippets mentioned there.
To reproduce
Steps to reproduce the behavior:
from transformers import pipeline
import datasets
speech_recognizer = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h", device=0)
dataset = datasets.load_dataset("superb", name="asr", split="test")
files = dataset["file"]
speech_recognizer(files[:4])
Here’s the Stack Trace:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
/tmp/ipykernel_16600/2678924457.py in <module>
----> 1 speech_recognizer(files[:4])
~/miniconda3/envs/mytextattack/lib/python3.8/site-packages/transformers/pipelines/automatic_speech_recognition.py in __call__(self, inputs, **kwargs)
131 inputs = ffmpeg_read(inputs, self.feature_extractor.sampling_rate)
132
--> 133 assert isinstance(inputs, np.ndarray), "We expect a numpy ndarray as input"
134 assert len(inputs.shape) == 1, "We expect a single channel audio input for AutomaticSpeechRecognitionPipeline"
135
AssertionError: We expect a numpy ndarray as input
I tried mitigating this error by converting the list of filenames to a numpy array, but I seem to get another error that I don’t know how to deal with:
from transformers import pipeline
import datasets
import numpy as np
speech_recognizer = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h", device=0)
dataset = datasets.load_dataset("superb", name="asr", split="test")
files = dataset["file"]
speech_recognizer(np.array(files[:4]))
Stack Trace:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/tmp/ipykernel_16600/437131926.py in <module>
1 import numpy as np
2
----> 3 speech_recognizer(np.array(files[:4]))
~/miniconda3/envs/mytextattack/lib/python3.8/site-packages/transformers/pipelines/automatic_speech_recognition.py in __call__(self, inputs, **kwargs)
134 assert len(inputs.shape) == 1, "We expect a single channel audio input for AutomaticSpeechRecognitionPipeline"
135
--> 136 processed = self.feature_extractor(
137 inputs, sampling_rate=self.feature_extractor.sampling_rate, return_tensors="pt"
138 )
~/miniconda3/envs/mytextattack/lib/python3.8/site-packages/transformers/models/wav2vec2/feature_extraction_wav2vec2.py in __call__(self, raw_speech, padding, max_length, pad_to_multiple_of, return_attention_mask, return_tensors, sampling_rate, **kwargs)
179 # zero-mean and unit-variance normalization
180 if self.do_normalize:
--> 181 raw_speech = self.zero_mean_unit_var_norm(raw_speech)
182
183 # convert into correct format for padding
~/miniconda3/envs/mytextattack/lib/python3.8/site-packages/transformers/models/wav2vec2/feature_extraction_wav2vec2.py in zero_mean_unit_var_norm(input_values)
84 Every array in the list is normalized to have zero mean and unit variance
85 """
---> 86 return [(x - np.mean(x)) / np.sqrt(np.var(x) + 1e-5) for x in input_values]
87
88 def __call__(
~/miniconda3/envs/mytextattack/lib/python3.8/site-packages/transformers/models/wav2vec2/feature_extraction_wav2vec2.py in <listcomp>(.0)
84 Every array in the list is normalized to have zero mean and unit variance
85 """
---> 86 return [(x - np.mean(x)) / np.sqrt(np.var(x) + 1e-5) for x in input_values]
87
88 def __call__(
<__array_function__ internals> in mean(*args, **kwargs)
~/miniconda3/envs/mytextattack/lib/python3.8/site-packages/numpy/core/fromnumeric.py in mean(a, axis, dtype, out, keepdims, where)
3417 return mean(axis=axis, dtype=dtype, out=out, **kwargs)
3418
-> 3419 return _methods._mean(a, axis=axis, dtype=dtype,
3420 out=out, **kwargs)
3421
~/miniconda3/envs/mytextattack/lib/python3.8/site-packages/numpy/core/_methods.py in _mean(a, axis, dtype, out, keepdims, where)
176 is_float16_result = True
177
--> 178 ret = umr_sum(arr, axis, dtype, out, keepdims, where=where)
179 if isinstance(ret, mu.ndarray):
180 ret = um.true_divide(
TypeError: cannot perform reduce with flexible type
I was wondering if someone could provide some insight on how to fix this?
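The second traceback can be reproduced without Transformers. The following is a minimal sketch, assuming only NumPy is installed: wrapping a list of file paths in np.array produces an array of strings (what NumPy calls a "flexible" type), not audio samples, so np.mean has nothing numeric to reduce. The file names and the helper mean_fails are made up for illustration.

```python
import numpy as np

# Hypothetical file names, standing in for dataset["file"][:4].
paths = ["a.flac", "b.flac"]
arr = np.array(paths)

# The array holds Unicode strings (dtype kind "U"), not waveform samples.
print(arr.dtype.kind)


def mean_fails(values):
    """Return True if np.mean raises TypeError on the given values."""
    try:
        np.mean(values)
    except TypeError:
        return True
    return False


print(mean_fails(arr))                   # string array: cannot be averaged
print(mean_fails(np.array([0.1, 0.2])))  # numeric array: works fine
```

So converting the list of paths to a NumPy array only changes the container; the pipeline still needs the decoded audio, not the file names.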
Thank you guys for the detailed feedback on my issue! Really appreciate it, especially as someone who was new to the Transformers library.
As the error suggests:
You need to load the files from filename to get a wave representation of sound. ffmpeg is leveraged in transformers since it covers a super large array of different files.
If you can’t install ffmpeg for whatever reason, you need to find a way to get those sound files into a 1d array at the expected sampling rate of the model (usually 16 kHz). (Everything is taken care of for you if you have ffmpeg installed.)
@patrickvonplaten @sgugger should we change that example to avoid relying on ffmpeg? Or should we make it explicit? I like doing ASR very simply in this example, also casually dropping to CUDA with a single device=0, but maybe for a quicktour we want something simpler? (The first example is a classifier so requires basically nothing.)
(For audio without ffmpeg we would still rely on librosa or soundfile, both of which also require a library to function (like libsndfile; librosa can also use ffmpeg if present).) libsndfile is not necessarily always present on all systems.
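As a rough illustration of getting a sound file into a 1d array at the expected sampling rate without ffmpeg, here is a minimal sketch that uses only Python's standard wave module and NumPy. It assumes 16-bit PCM WAV input; other formats would still need something like soundfile or librosa, as noted above. The function name read_wav_mono and its expected_rate parameter are hypothetical and not part of Transformers.

```python
import wave

import numpy as np


def read_wav_mono(path, expected_rate=16_000):
    """Read a 16-bit PCM WAV file into a 1-D float32 array in [-1, 1).

    Hypothetical helper for illustration; it does not resample, it only
    checks that the file already has the expected sampling rate.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wf.getframerate()
        channels = wf.getnchannels()
        frames = wf.readframes(wf.getnframes())

    # Interpret the raw bytes as little-endian int16 and scale to floats.
    audio = np.frombuffer(frames, dtype="<i2").astype(np.float32) / 32768.0

    if channels > 1:
        # Samples are interleaved; average the channels down to mono.
        audio = audio.reshape(-1, channels).mean(axis=1)

    if rate != expected_rate:
        raise ValueError(
            f"expected {expected_rate} Hz, got {rate} Hz; resample first"
        )
    return audio
```

Each array returned this way is a single-channel np.ndarray, which is what the assertions in the traceback above check for, so it could be passed to the pipeline one file at a time, for example speech_recognizer(read_wav_mono(path)).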