IndexError: out of bounds

This wave file: pl.zip

This code:

import torch, transformers, ctc_segmentation
import soundfile

# wav2vec2
model_file = 'jonatasgrosman/wav2vec2-large-xlsr-53-polish'
vocab_dict = {"<pad>": 0, "<s>": 1, "</s>": 2, "<unk>": 3, "|": 4, "A": 5, "I": 6, "E": 7, "O": 8, "Z": 9, "N": 10, "S": 11, "W": 12, "R": 13, "C": 14, "Y": 15, "M": 16, "T": 17, "D": 18, "K": 19, "P": 20, "Ł": 21, "J": 22, "U": 23, "L": 24, "B": 25, "Ę": 26, "G": 27, "Ą": 28, "Ż": 29, "H": 30, "Ś": 31, "Ó": 32, "Ć": 33, "F": 34, "Ń": 35, "Ź": 36, "V": 37, "-": 38, "Q": 39, "X": 40, "'": 41}

processor = transformers.Wav2Vec2Processor.from_pretrained( model_file )
model = transformers.Wav2Vec2ForCTC.from_pretrained( model_file )

speech_array, sampling_rate = soundfile.read( '/tmp/pl.wav' )
assert sampling_rate == 16000
features = processor(speech_array,sampling_rate=16000, return_tensors="pt")
input_values = features.input_values
attention_mask = features.attention_mask
with torch.no_grad():
    logits = model(input_values).logits
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
transcription = transcription.lower().split()

# ctc-segmentation
with torch.no_grad():
    softmax = torch.nn.LogSoftmax(dim=-1)
    lpz = softmax(logits)[0].cpu().numpy()
config = ctc_segmentation.CtcSegmentationParameters()
config.index_duration = speech_array.shape[0] / lpz.shape[0] / sampling_rate
char_list = [x.lower() for x in vocab_dict.keys()]
ground_truth_mat, utt_begin_indices = ctc_segmentation.prepare_text(config, transcription,char_list)
timings, char_probs, state_list = ctc_segmentation.ctc_segmentation(config, lpz, ground_truth_mat)
segments = ctc_segmentation.determine_utterance_segments(config, utt_begin_indices, char_probs, timings, transcription)

Console:

Traceback (most recent call last):
  File "ctc.py", line 31, in <module>
    segments = ctc_segmentation.determine_utterance_segments(config, utt_begin_indices, char_probs, timings, transcription)
  File "/home/max/.local/lib/python3.8/site-packages/ctc_segmentation/ctc_segmentation.py", line 387, in determine_utterance_segments
    start = compute_time(utt_begin_indices[i], "begin")
  File "/home/max/.local/lib/python3.8/site-packages/ctc_segmentation/ctc_segmentation.py", line 380, in compute_time
    return max(timings[index + 1] - 0.5, middle)
IndexError: index 450 is out of bounds for axis 0 with size 450

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

3 reactions
lumaku commented, Jun 8, 2021

Three things coincide here: the char list contains - as a character, the last utterance of the transcription consists of a single -, and - is also in the list of excluded characters. As a result, the last utterance is omitted from ground_truth_mat, so it is missing when the segments are determined, and the lookup for it runs past the end of the timings array.
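A quick way to spot this situation before calling the segmentation is to look for utterances that have nothing left once excluded characters are removed. The following is a minimal sketch in plain Python; the function name and the EXCLUDED set are hypothetical and only for illustration, and the library's actual list of excluded characters may differ.

```python
# Hypothetical pre-check, not part of ctc_segmentation.
# EXCLUDED is an assumed set for illustration; the library's own list may differ.
EXCLUDED = set(".,-?!:;'()")


def empty_after_exclusion(utterances, excluded=EXCLUDED):
    """Return the indices of utterances that contain only excluded characters or whitespace."""
    empty = []
    for i, utt in enumerate(utterances):
        remaining = [c for c in utt if c not in excluded and not c.isspace()]
        if not remaining:
            empty.append(i)
    return empty


# Example shaped like the reported case: the last utterance is a single "-".
transcription = ["dzień", "dobry", "-"]
print(empty_after_exclusion(transcription))
```

Any index reported by such a check points at an utterance that could be dropped during text preparation, which matches the failure described above.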

You could solve this by updating the config object:

char_list = [x.lower() for x in vocab_dict.keys()]
config = ctc_segmentation.CtcSegmentationParameters(char_list=char_list) # note: char_list is set here instead of at prepare_text
config.update_excluded_characters()  # removes characters that appear in char_list from the excluded characters
config.index_duration = speech_array.shape[0] / lpz.shape[0] / sampling_rate
ground_truth_mat, utt_begin_indices = ctc_segmentation.prepare_text(config, transcription)

To circumvent such issues, you could directly use the token list that you obtained from the ASR model together with prepare_token_list.
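As a rough illustration of the token-based route, the sketch below maps each utterance to a list of vocabulary ids, dropping characters the vocabulary does not know. The helper name and the small vocabulary are hypothetical; in practice the ids would come from the ASR model's tokenizer, and the exact input format expected by prepare_token_list (for example integer arrays rather than plain lists) should be checked against the ctc_segmentation documentation.

```python
# Hypothetical helper, not part of ctc_segmentation or transformers.
def utterance_to_token_ids(utterance, vocab, unk_token="<unk>"):
    """Map each character of an utterance to its vocabulary id, skipping unknown characters."""
    unk_id = vocab[unk_token]
    ids = [vocab.get(ch, unk_id) for ch in utterance]
    return [i for i in ids if i != unk_id]


# Small assumed vocabulary for illustration only (lowercased keys).
vocab = {"<pad>": 0, "<unk>": 3, "|": 4, "a": 5, "i": 6, "-": 38}

transcription = ["ai", "-"]
token_lists = [utterance_to_token_ids(utt, vocab) for utt in transcription]
print(token_lists)
```

Because every utterance is expressed in the model's own tokens, an utterance such as a lone - still maps to a non-empty token list and is not silently dropped by character filtering.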

1 reaction
doublex commented, Jun 8, 2021

Your code works really well. I wanted to make a joke. Use the example as you like (no need to mention me).
