C# SDK: add RequestWordLevelConfidence and expose word level confidence in WordLevelTimingResult

It is already possible to request word-level timings using:

var speechConfig = SpeechConfig.FromSubscription(subscriptionKey, region);
speechConfig.RequestWordLevelTimestamps();

The results are then accessible through each WordLevelTimingResult in the enumerable speechRecognitionResult.Best().FirstOrDefault().Words.

In the same way, I’d appreciate it if a RequestWordLevelConfidence method were implemented.

I know it is already possible to use

speechConfig.SetServiceProperty("wordLevelConfidence", "true", ServicePropertyChannel.UriQueryParameter);

and then parse

speechRecognitionResult.Properties.GetProperty(PropertyId.SpeechServiceResponse_JsonResult)

into my own C# entities.
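For context, the manual workaround described above could look roughly like the sketch below. It assumes the detailed JSON result contains an NBest array whose entries carry a Words array with Word, Offset, Duration and Confidence fields; that shape is an assumption here, not something confirmed in this issue. WordConfidence and DetailedResultParser are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical entity; property names mirror the assumed JSON field names.
public sealed record WordConfidence(string Word, long Offset, long Duration, double? Confidence);

public static class DetailedResultParser
{
    // Returns the words of the first NBest entry, or an empty list if the
    // expected arrays are missing. The JSON shape is an assumption.
    public static IReadOnlyList<WordConfidence> ParseBestWords(string json)
    {
        var words = new List<WordConfidence>();
        using var doc = JsonDocument.Parse(json);

        if (!doc.RootElement.TryGetProperty("NBest", out var nBest) ||
            nBest.ValueKind != JsonValueKind.Array ||
            nBest.GetArrayLength() == 0)
        {
            return words;
        }

        var best = nBest[0];
        if (!best.TryGetProperty("Words", out var wordArray) ||
            wordArray.ValueKind != JsonValueKind.Array)
        {
            return words;
        }

        foreach (var w in wordArray.EnumerateArray())
        {
            // Confidence is only present when word-level confidence was requested.
            double? confidence =
                w.TryGetProperty("Confidence", out var c) && c.ValueKind == JsonValueKind.Number
                    ? c.GetDouble()
                    : (double?)null;

            words.Add(new WordConfidence(
                w.GetProperty("Word").GetString() ?? string.Empty,
                w.GetProperty("Offset").GetInt64(),
                w.GetProperty("Duration").GetInt64(),
                confidence));
        }

        return words;
    }
}
```

The JSON string would come from speechRecognitionResult.Properties.GetProperty(PropertyId.SpeechServiceResponse_JsonResult) and be passed to DetailedResultParser.ParseBestWords.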

However, since the query parameter already exists and the SDK already parses the JSON result into an entity, is there any reason not to expose this field as well? It would save users the time of parsing the JSON a second time into their own entities.

I think the SDK should take care of it and expose this part of the backend’s API as well.

Issue Analytics

Created 2 years ago
Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction

pankopon commented, Jun 24, 2022

Update:

WordLevelTimingResult.Confidence is available with Speech SDK 1.22.0 and later (ref. https://docs.microsoft.com/dotnet/api/microsoft.cognitiveservices.speech.wordleveltimingresult.confidence).
Word-level timings and confidence are both enabled whenever detailed recognition results are requested, which only requires speechConfig.OutputFormat = OutputFormat.Detailed. There is no need to request word-level detail separately anymore.
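A minimal sketch of what this update describes, assuming Speech SDK 1.22.0 or later and input from the default microphone. WordConfidenceDemo and PrintWordConfidencesAsync are hypothetical names, and subscriptionKey and region are placeholders supplied by the caller.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

public static class WordConfidenceDemo
{
    // Hypothetical helper: recognizes one utterance and prints per-word details.
    public static async Task PrintWordConfidencesAsync(string subscriptionKey, string region)
    {
        var speechConfig = SpeechConfig.FromSubscription(subscriptionKey, region);

        // Per the comment above, detailed output is enough to get word-level
        // timings and confidence; no separate request call is made here.
        speechConfig.OutputFormat = OutputFormat.Detailed;

        // Without an explicit AudioConfig the recognizer uses the default microphone.
        using var recognizer = new SpeechRecognizer(speechConfig);
        var result = await recognizer.RecognizeOnceAsync();

        if (result.Reason != ResultReason.RecognizedSpeech)
        {
            Console.WriteLine($"No speech recognized: {result.Reason}");
            return;
        }

        // Best() yields the detailed alternatives for this result.
        var best = result.Best().FirstOrDefault();
        if (best?.Words == null)
        {
            return;
        }

        foreach (var word in best.Words)
        {
            Console.WriteLine(
                $"{word.Word}: offset={word.Offset}, duration={word.Duration}, confidence={word.Confidence}");
        }
    }
}
```

Running this requires a valid Speech resource key and region, so it is shown without expected output.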

1 reaction

CodingOctocat commented, Mar 18, 2022

I implemented a speech-to-SRT function using the Recognizing and Recognized events, and it works at word level (Infinity Approach). I didn’t find an open source implementation that uses the same method as mine, so I’m going to put my method on GitHub in the next few days, and I’ll let you know then.

add speech synthesis sample to generate srt subtitle file #1286
[DRAFT] [DO NOT MERGE] Add captioning samples #1435

Update: my implementation: Azure speech to subtitle (word-level timestamp)
