C# SDK: add RequestWordLevelConfidence and expose word level confidence in WordLevelTimingResult
It’s currently already possible to request word-level timings using
var speechConfig = SpeechConfig.FromSubscription(subscriptionKey, region);
speechConfig.RequestWordLevelTimestamps();
The results are then accessible in each WordLevelTimingResult of the enumerable speechRecognitionResult.Best().FirstOrDefault().Words.
On the same level, I’d appreciate it if a RequestWordLevelConfidence method could be implemented.
I know that it’s currently already possible to use
speechConfig.SetServiceProperty("wordLevelConfidence", "true", ServicePropertyChannel.UriQueryParameter);
and then parse
speechRecognitionResult.Properties.GetProperty(PropertyId.SpeechServiceResponse_JsonResult)
into my own C# entities.
However, as the query parameter is already there and you already parse the JSON result into an entity, what would speak against exposing this field as well? It would save us users valuable time, since we would no longer have to parse the JSON a second time into our own entities.
I think the SDK should take care of it and expose this part of the backend’s API as well.
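For reference, the manual workaround described above could be sketched roughly as follows. This is only an illustration under assumptions: the WordConfidence record and the DetailedResultParser class are hypothetical names that are not part of the SDK, and the JSON shape (an NBest array whose entries carry a Words array with Word, Offset, Duration and Confidence) is assumed from the detailed output format; the sample string stands in for the value returned by speechRecognitionResult.Properties.GetProperty(PropertyId.SpeechServiceResponse_JsonResult).

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical entity for illustration; the SDK does not define this type.
public record WordConfidence(string Word, long Offset, long Duration, double Confidence);

public static class DetailedResultParser
{
    // Reads the words of the first NBest entry from the detailed JSON result.
    // Returns an empty list if the expected properties are missing.
    public static List<WordConfidence> ParseWords(string json)
    {
        var words = new List<WordConfidence>();
        using var doc = JsonDocument.Parse(json);

        if (!doc.RootElement.TryGetProperty("NBest", out var nBest)
            || nBest.ValueKind != JsonValueKind.Array
            || nBest.GetArrayLength() == 0)
        {
            return words;
        }

        if (!nBest[0].TryGetProperty("Words", out var wordArray)
            || wordArray.ValueKind != JsonValueKind.Array)
        {
            return words;
        }

        foreach (var w in wordArray.EnumerateArray())
        {
            words.Add(new WordConfidence(
                w.GetProperty("Word").GetString() ?? string.Empty,
                w.GetProperty("Offset").GetInt64(),
                w.GetProperty("Duration").GetInt64(),
                // Confidence is only present when wordLevelConfidence was requested.
                w.TryGetProperty("Confidence", out var c) ? c.GetDouble() : double.NaN));
        }
        return words;
    }
}

public static class ParserDemo
{
    public static void Main()
    {
        // Sample payload (assumed shape); in real code this comes from
        // PropertyId.SpeechServiceResponse_JsonResult.
        string json = @"{""NBest"":[{""Confidence"":0.93,""Words"":[
            {""Word"":""hello"",""Offset"":1000000,""Duration"":4000000,""Confidence"":0.97}]}]}";

        foreach (var word in DetailedResultParser.ParseWords(json))
        {
            Console.WriteLine($"{word.Word}: {word.Confidence}");
        }
    }
}
```

This is the boilerplate the request is about: every consumer of the SDK has to maintain a parser like this, even though the SDK already deserializes the same JSON internally.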
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (3 by maintainers)
Top GitHub Comments
Update:
WordLevelTimingResult.Confidence is available with Speech SDK 1.22.0 and later (ref. https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.wordleveltimingresult.confidence).

speechConfig.OutputFormat = OutputFormat.Detailed. No need to specifically request word level detail anymore.

I implemented the speech to SRT function using the Recognizing and Recognized methods, and it is word level (Infinity Approach). I didn’t find an open source implementation of the same method as mine, so I’m going to put my method on GitHub in the next few days, and I’ll let you know then.

Update: my implementation: Azure speech to subtitle (word-level timestamp)
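With Speech SDK 1.22.0 or later, reading the confidence through the SDK, as described in the first update above, might look like the following sketch. The subscription key and region are placeholders, and recognition from the default microphone is an assumption made for brevity. The call to RequestWordLevelTimestamps() is kept as a cautious choice; according to the comment above, setting the detailed output format may already be sufficient.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

public static class Program
{
    public static async Task Main()
    {
        // Placeholder credentials; replace with your own values.
        var speechConfig = SpeechConfig.FromSubscription("YOUR_SUBSCRIPTION_KEY", "YOUR_REGION");
        speechConfig.OutputFormat = OutputFormat.Detailed;
        // Possibly redundant with the detailed output format (see the comment above).
        speechConfig.RequestWordLevelTimestamps();

        // Uses the default microphone as audio input (assumption for this sketch).
        using var recognizer = new SpeechRecognizer(speechConfig);
        var result = await recognizer.RecognizeOnceAsync();

        if (result.Reason != ResultReason.RecognizedSpeech)
        {
            Console.WriteLine($"No speech recognized: {result.Reason}");
            return;
        }

        // Best() returns the detailed alternatives; take the first one.
        var best = result.Best().FirstOrDefault();
        var words = best?.Words ?? Enumerable.Empty<WordLevelTimingResult>();

        foreach (var word in words)
        {
            // Offset and Duration are reported in ticks (100 ns units).
            Console.WriteLine(
                $"{word.Word}\toffset={word.Offset}\tduration={word.Duration}\tconfidence={word.Confidence}");
        }
    }
}
```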