Documentation: Audio + Text Feature Extraction
The usage instructions are missing some information on the feature preprocessing step.
- It would be helpful to give more exact instructions on how to extract spectrograms and phoneme features. Does the code expect phoneme .lab files from Festival, as indicated in r9y9’s deepvoice code? https://github.com/r9y9/deepvoice3_pytorch/tree/master/vctk_preprocess
- Is it possible to use the deepvoice code to get the exact features expected by nonparaSeq2SeqVC? If so, which scripts are needed?
I’m going to try out the code now, and I’ll send PRs on documentation when I’m confident I can add something.
Thanks!
Issue Analytics
- Created 4 years ago
- Comments: 26 (21 by maintainers)
Top GitHub Comments
See the commit ab1f8d4 for cleaning the text alignments
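For readers unfamiliar with aligned label files, here is a minimal sketch of dropping the alignment timestamps and keeping only the phoneme sequence. It assumes each label line has the form `<start> <end> <phoneme>`. That layout, the helper name `phonemes_from_lab`, and the sample text are illustrative assumptions, not the format confirmed for this repository.

```python
from typing import List


def phonemes_from_lab(lab_text: str) -> List[str]:
    """Return the phoneme column of aligned label text, discarding timings.

    Assumes (hypothetically) that every useful line is
    `<start> <end> <phoneme>`, separated by whitespace.
    """
    phonemes = []
    for line in lab_text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        phonemes.append(parts[2])  # third column is the phoneme label
    return phonemes


# Made-up sample in the assumed layout (not real data from the repository).
sample = """0 500000 sil
500000 1200000 hh
1200000 1900000 ax
"""
print(phonemes_from_lab(sample))
```

If the real label files carry richer context-dependent labels in the third column, the phoneme would have to be parsed out of that field instead.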
Alignments are not necessary. Although I extracted the alignments, I didn’t actually use them during training, so all you need to prepare the training data is the phoneme sequences. I used https://github.com/r9y9/deepvoice3_pytorch/blob/master/vctk_preprocess/prepare_vctk_labels.py to get my phoneme labels. Once all the paths and environments are configured, it should produce the cut phoneme labels as