Force Align text and Audio (dataset)

Hi @BenAAndrew, I am on the step where I align the text and audio of an audiobook. I obtained both the audio and the text from Amazon Audible. I could not assign the help label to this issue; I don't think I have permission for that.

I am using a virtualenv.

To get align.py to run, I had to modify it. The modified part of the file is below. The Screenshots section shows how I am trying to execute the file.

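# Standard library
import os
import sys
import json
import logging
import argparse

# Third-party
from pydub import AudioSegment

# Add the current working directory to the import path so the
# local modules below can be imported.
sys.path.append(".")

# Local modules
from search import FuzzySearch
from audio import DEFAULT_RATE, read_frames_from_file, vad_split
from dataset.transcribe import stt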

Screenshots

Failure Point

[Screenshot: failure point]

Questions

Do you recommend using a virtualenv?
Do I need to reduce the quality of the WAV file or the MP3 file?

Link to the dataset

Audio Dataset

I have book.txt and the MP3 file, and I converted the MP3 to a WAV file before running the alignment. Please let me know if you can try it with my dataset. Thanks in advance for the help.
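Since the question is whether the converted WAV needs a lower quality, one way to narrow it down is to check what format the file actually has. The following is a minimal sketch using only Python's standard `wave` module. It is not part of align.py, and the target values (16 kHz, mono, 16-bit) are an assumption about what a speech-to-text based aligner typically expects; the real value to compare against is whatever `DEFAULT_RATE` is set to in the project's audio module. The function names `describe_wav` and `format_mismatches` are hypothetical helpers introduced for illustration.

```python
import wave

# ASSUMED target format (not confirmed by the project): 16 kHz, mono, 16-bit PCM.
# Compare against DEFAULT_RATE in the project's audio module instead if it differs.
ASSUMED_RATE = 16000
ASSUMED_CHANNELS = 1
ASSUMED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit


def describe_wav(path):
    """Return the basic format parameters of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "rate": rate,
            "channels": wav.getnchannels(),
            "sample_width": wav.getsampwidth(),
            "seconds": frames / rate if rate else 0.0,
        }


def format_mismatches(info, rate=ASSUMED_RATE, channels=ASSUMED_CHANNELS,
                      sample_width=ASSUMED_SAMPLE_WIDTH):
    """List the parameters that differ from the assumed target format."""
    expected = {"rate": rate, "channels": channels, "sample_width": sample_width}
    return [
        f"{key}: got {info[key]}, expected {value}"
        for key, value in expected.items()
        if info[key] != value
    ]
```

Calling `format_mismatches(describe_wav("book.wav"))` on a hypothetical `book.wav` returns an empty list when the file already matches the assumed format, and otherwise one line per differing parameter. A typical audiobook export at 44.1 kHz stereo would report both the rate and the channel count, which would suggest resampling and downmixing during the MP3 to WAV conversion.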

Top GitHub Comments

vinamramunot-tech commented, Mar 23, 2021

@BenAAndrew I am going to close this issue for now. As you said, we can come back to discuss this at a later stage.

vinamramunot-tech commented, Mar 18, 2021

I will create the PR for the imports. Let me know if you want me to test something regarding audio conversion.
