Force Align text and Audio (dataset)

Hi @BenAAndrew, I am at the step where I align the text and audio of an audiobook. I obtained both the audio and the text from Amazon Audible. I was not able to assign the help label to this issue; I don’t think I have permission for that.

  • using virtualenv

To get align.py to run, I had to modify it. The modified part of the file is shown below. The Screenshots section shows how I am executing the file.

import os
import sys
import json
import logging
import argparse
from pydub import AudioSegment

sys.path.append(".")

from search import FuzzySearch
from audio import DEFAULT_RATE, read_frames_from_file, vad_split
from dataset.transcribe import stt
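
Note that `sys.path.append(".")` resolves relative to the directory the script is launched from, so the imports above only work from one working directory. A minimal sketch of a launch-directory-independent alternative follows; the helper name `add_project_root` and its `levels_up` parameter are hypothetical and not part of the project:

```python
import sys
from pathlib import Path


def add_project_root(script_path, levels_up=0):
    """Append a directory derived from the script's location to sys.path.

    Hypothetical helper: `levels_up` is how many folders above the
    script's own folder the project root is assumed to sit.
    """
    root = Path(script_path).resolve().parent
    for _ in range(levels_up):
        root = root.parent
    root_str = str(root)
    # Avoid adding the same entry twice on repeated calls.
    if root_str not in sys.path:
        sys.path.append(root_str)
    return root_str
```

Inside a script this would be called as `add_project_root(__file__, levels_up=1)` before the project imports, assuming the script lives one folder below the project root.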

Screenshots

Failure Point

[Screenshot: failure point (image not included)]

Questions

  1. Do you recommend using a virtualenv?
  2. Do I need to reduce the quality of the WAV file or the MP3 file?

Link to the dataset

Audio Dataset

I have book.txt and the MP3 file, and I converted the MP3 to WAV before running align. Please let me know if you can try my dataset. Thanks in advance for the help.
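
On the conversion question, one way to see what the converted WAV actually contains is to read its header with the standard library. This is a minimal sketch, not the project's own check: the function names are hypothetical, and the 16 kHz / mono / 16-bit target is an assumption (a common format for speech-to-text input) that should be compared against `DEFAULT_RATE` in the project's audio module.

```python
import wave


def describe_wav(path):
    """Return sample rate, channel count, bit depth, and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "rate": rate,
            "channels": wav.getnchannels(),
            "bits": wav.getsampwidth() * 8,
            "seconds": wav.getnframes() / rate,
        }


def matches_target(info, rate=16000, channels=1, bits=16):
    # Assumed target format; replace with whatever the aligner expects.
    return (info["rate"], info["channels"], info["bits"]) == (rate, channels, bits)
```

If the file does not match, pydub's `AudioSegment` (already imported in align.py) can resample with `set_frame_rate` and `set_channels` before `export`.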

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 9 (9 by maintainers)

Top GitHub Comments

vinamramunot-tech commented, Mar 23, 2021

@BenAAndrew I am going to close this issue for now. As you said we can come back to discuss this at a later stage.

vinamramunot-tech commented, Mar 18, 2021

I will create the PR for the imports. Let me know if you want me to test something regarding audio conversion.
