The problem about last phoneme alignment
Hi, thanks for this great work. I tried integrating it on top of my ASR module. Most of the phonemes were aligned perfectly, except the last one, as can be seen below.
The top figure is the original waveform, and the bottom figure is the alignment result.
I found that the waveform is cut off near the end, while index_duration
appears to be right, because every phoneme except the last is aligned accurately.
How can I solve this problem? Thanks in advance.
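One way to check whether the tail is really being dropped is to compare the total length covered by the durations with the length of the audio. The snippet below is only a rough sketch: it assumes the durations are per-phoneme frame counts (which is a guess about what index_duration holds), and the function name, `hop_length`, `sample_rate` and `num_samples` are hypothetical names, not part of this repository.

```python
import numpy as np

def durations_to_boundaries(durations, hop_length, sample_rate, num_samples):
    """Turn per-phoneme frame counts into start/end times in seconds.

    Assumption: `durations` holds one frame count per phoneme and each
    frame advances the audio by `hop_length` samples.
    """
    durations = np.asarray(durations, dtype=np.int64)
    ends = np.cumsum(durations) * hop_length        # end sample of each phoneme
    starts = np.concatenate(([0], ends[:-1]))       # start sample of each phoneme
    uncovered = num_samples - int(ends[-1])         # audio samples after the last phoneme
    return starts / sample_rate, ends / sample_rate, uncovered / sample_rate

# Toy numbers, not taken from the issue: 3 phonemes, 100-sample hop, 1 kHz audio.
starts, ends, uncovered = durations_to_boundaries([2, 3, 1], 100, 1000, 800)
print(starts, ends, uncovered)
```

If the uncovered time is clearly larger than one frame, the durations stop before the audio does, which would match a last phoneme that looks cut off.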
Issue Analytics
- State:
- Created 2 years ago
- Comments:9 (4 by maintainers)
Top GitHub Comments
My TTS model has the same architecture as FastSpeech, except that it injects a speaker embedding into the input at every timestep.
I think this tool suits me better because the Montreal Forced Aligner is somewhat complicated: it is based on Kaldi, and I want to train all my models in TensorFlow only, without any other framework. This tool solved my problem easily.
Thanks again for your timely response!
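I currently have a project that needs audio alignment, also at the level of initials and finals. Could you share some pointers on this CTC alignment approach? Thanks @taylorlu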
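For the CTC alignment idea asked about above, a minimal generic sketch follows. It is a plain Viterbi forced alignment over frame-level CTC log-probabilities and is not necessarily how this repository implements it. The function name, the `(T, V)` shape of `log_probs`, and using index 0 as the blank are all assumptions made for illustration.

```python
import numpy as np

def ctc_viterbi_align(log_probs, tokens, blank=0):
    """Return the most likely frame-level label sequence for `tokens`.

    log_probs: (T, V) array of per-frame log-probabilities (assumed shape).
    tokens:    target label ids, e.g. initials/finals mapped to integers.
    """
    T = log_probs.shape[0]
    # Interleave blanks: [blank, t1, blank, t2, ..., blank]
    ext = [blank]
    for tok in tokens:
        ext += [tok, blank]
    S = len(ext)

    score = np.full((T, S), -np.inf)
    back = np.zeros((T, S), dtype=np.int64)
    score[0, 0] = log_probs[0, ext[0]]
    if S > 1:
        score[0, 1] = log_probs[0, ext[1]]

    for t in range(1, T):
        for s in range(S):
            # Stay in the same state, or advance by one.
            cands = [(score[t - 1, s], s)]
            if s >= 1:
                cands.append((score[t - 1, s - 1], s - 1))
            # Skip the blank between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append((score[t - 1, s - 2], s - 2))
            best, prev = max(cands, key=lambda c: c[0])
            score[t, s] = best + log_probs[t, ext[s]]
            back[t, s] = prev

    # A valid path ends in the last label or the trailing blank.
    s = S - 1
    if S > 1 and score[T - 1, S - 2] > score[T - 1, S - 1]:
        s = S - 2
    if not np.isfinite(score[T - 1, s]):
        raise ValueError("too few frames to align the given tokens")

    labels = [blank] * T
    for t in range(T - 1, -1, -1):
        labels[t] = ext[s]
        s = back[t, s]
    return labels
```

Counting consecutive frames that carry the same non-blank label gives a per-phoneme duration. How blank frames are assigned to neighbouring phonemes is a separate design choice that this sketch leaves open.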