Documentation: Audio + Text Feature Extraction

The usage instructions are missing some information on the feature preprocessing step.

  • It would be helpful to give more exact instructions on how to extract spectrograms and phoneme features. Does the code expect phoneme .lab files from Festival, as indicated in r9y9’s deepvoice code? https://github.com/r9y9/deepvoice3_pytorch/tree/master/vctk_preprocess
  • Is it possible to use the deepvoice code to get the exact features expected by nonparaSeq2SeqVC? If so, which scripts are needed?

I’m going to try out the code now, and I’ll send PRs on documentation when I’m confident I can add something.

Thanks!

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 26 (21 by maintainers)

Top GitHub Comments

2 reactions
jxzhanggg commented, Dec 20, 2019

See commit ab1f8d4, which cleans the text alignments.

2 reactions
jxzhanggg commented, Dec 19, 2019

Alignments are not necessary. Although I extracted the alignments, I didn’t actually use them during training, so all you need to prepare the training data is the phoneme sequences. I used https://github.com/r9y9/deepvoice3_pytorch/blob/master/vctk_preprocess/prepare_vctk_labels.py to get my phoneme labels. Once all the paths and environments are configured, it should produce cut phoneme labels like:

0 750000 p
750000 1050000 l
1050000 3150000 iy
3150000 5600000 z
5600000 5750000 k
5750000 8450000 ao
8450000 8600000 l
8600000 9950000 s
9950000 10249999 t
10249999 12950000 eh
12950000 13200000 l
13200000 14750000 ax
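
Since only the phoneme sequence is needed for training, label files like the one above can be reduced to a plain list of phonemes. The snippet below is a minimal sketch of that step, not the repository's own preprocessing code. The function names are hypothetical, and it assumes each line is "start end phoneme" with times in 100 ns units (the HTK label convention), which the thread does not state explicitly.

```python
from typing import Iterable, List, Tuple

# Assumption: timestamps are in 100 ns units, as in HTK-style label files.
UNITS_PER_SECOND = 10_000_000

Segment = Tuple[float, float, str]


def parse_label_lines(lines: Iterable[str]) -> List[Segment]:
    """Parse 'start end phoneme' lines into (start_sec, end_sec, phoneme)."""
    segments: List[Segment] = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, end, phoneme = parts
        segments.append(
            (int(start) / UNITS_PER_SECOND, int(end) / UNITS_PER_SECOND, phoneme)
        )
    return segments


def phoneme_sequence(segments: Iterable[Segment]) -> List[str]:
    """Drop the timing information and keep only the phoneme symbols."""
    return [phoneme for _, _, phoneme in segments]


# First four lines of the example labels shown above.
example_lines = [
    "0 750000 p",
    "750000 1050000 l",
    "1050000 3150000 iy",
    "3150000 5600000 z",
]

segments = parse_label_lines(example_lines)
phonemes = phoneme_sequence(segments)
print(phonemes)
```

The timing columns are kept in the parsed segments in case alignments are wanted later; for training as described in the comment, only the output of `phoneme_sequence` would be used.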