IndexError: out of bounds
This wave file: pl.zip
This code:
import torch, transformers, ctc_segmentation
import soundfile
# wav2vec2
model_file = 'jonatasgrosman/wav2vec2-large-xlsr-53-polish'
vocab_dict = {"<pad>": 0, "<s>": 1, "</s>": 2, "<unk>": 3, "|": 4, "A": 5, "I": 6, "E": 7, "O": 8, "Z": 9, "N": 10, "S": 11, "W": 12, "R": 13, "C": 14, "Y": 15, "M": 16, "T": 17, "D": 18, "K": 19, "P": 20, "Ł": 21, "J": 22, "U": 23, "L": 24, "B": 25, "Ę": 26, "G": 27, "Ą": 28, "Ż": 29, "H": 30, "Ś": 31, "Ó": 32, "Ć": 33, "F": 34, "Ń": 35, "Ź": 36, "V": 37, "-": 38, "Q": 39, "X": 40, "'": 41}
processor = transformers.Wav2Vec2Processor.from_pretrained( model_file )
model = transformers.Wav2Vec2ForCTC.from_pretrained( model_file )
speech_array, sampling_rate = soundfile.read( '/tmp/pl.wav' )
assert sampling_rate == 16000
features = processor(speech_array,sampling_rate=16000, return_tensors="pt")
input_values = features.input_values
attention_mask = features.attention_mask
with torch.no_grad():
    logits = model(input_values).logits
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
transcription = transcription.lower().split()
# ctc-segmentation
with torch.no_grad():
    softmax = torch.nn.LogSoftmax(dim=-1)
    lpz = softmax(logits)[0].cpu().numpy()
config = ctc_segmentation.CtcSegmentationParameters()
config.index_duration = speech_array.shape[0] / lpz.shape[0] / sampling_rate
char_list = [x.lower() for x in vocab_dict.keys()]
ground_truth_mat, utt_begin_indices = ctc_segmentation.prepare_text(config, transcription,char_list)
timings, char_probs, state_list = ctc_segmentation.ctc_segmentation(config, lpz, ground_truth_mat)
segments = ctc_segmentation.determine_utterance_segments(config, utt_begin_indices, char_probs, timings, transcription)
Console:
Traceback (most recent call last):
  File "ctc.py", line 31, in <module>
    segments = ctc_segmentation.determine_utterance_segments(config, utt_begin_indices, char_probs, timings, transcription)
  File "/home/max/.local/lib/python3.8/site-packages/ctc_segmentation/ctc_segmentation.py", line 387, in determine_utterance_segments
    start = compute_time(utt_begin_indices[i], "begin")
  File "/home/max/.local/lib/python3.8/site-packages/ctc_segmentation/ctc_segmentation.py", line 380, in compute_time
    return max(timings[index + 1] - 0.5, middle)
IndexError: index 450 is out of bounds for axis 0 with size 450
Issue Analytics
- State:
- Created 2 years ago
- Comments:5 (5 by maintainers)
Top GitHub Comments
So, the char list contains "-" as a character, the last utterance of the transcription consists of a single "-", and at the same time, "-" is in the excluded characters list. As a result, the last utterance is omitted from ground_truth_mat, and when the segments are obtained, the last utterance is missing.
You could solve this by updating the config object.
To circumvent such issues, you could directly use the token list that you obtained from the ASR model together with prepare_token_list.
Your code works really great. I wanted to make a joke. Use the example as you like (no need to mention me).
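The failure mode described in the maintainer's comment can be checked before calling prepare_text. The following is only a rough sketch in plain Python, not part of ctc_segmentation: the helper find_unalignable_utterances is hypothetical, the excluded-character string is an assumed value chosen for illustration, and the filtering rule (a character survives only if it is in the char list and not excluded) is a simplified model of the text preparation step.

```python
def find_unalignable_utterances(utterances, char_list, excluded_characters):
    """Return indices of utterances that keep no character after filtering.

    Assumption: a character is kept only if it is in char_list and not in
    excluded_characters (a simplified model of the text preparation step).
    """
    allowed = set(char_list) - set(excluded_characters)
    return [
        i for i, utt in enumerate(utterances)
        if not any(ch in allowed for ch in utt)
    ]


# Hypothetical data mirroring the report: the last utterance is a lone "-".
transcription = ["dzień", "dobry", "-"]
char_list = list("aąbcćdeęfghijklłmnńoóprsśtuwyzźż") + ["-", "'"]
excluded = ".,-"  # assumed value, for illustration only

# Indices of utterances that would contribute nothing to the alignment.
bad = find_unalignable_utterances(transcription, char_list, excluded)

# Drop them so that the utterance list and the prepared text stay in sync.
cleaned = [u for i, u in enumerate(transcription) if i not in bad]

print(bad)      # [2]
print(cleaned)  # ['dzień', 'dobry']
```

Passing the cleaned list on instead of the raw transcription would keep every remaining utterance alignable under this assumption. Whether dropping such utterances or changing the excluded characters is the better choice depends on whether "-" should be aligned at all.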