
No punctuation and capitalisation in nbest (words) results


Hi, I’m currently using the batch transcription REST API (2.0), and I noticed that the entries inside Words have no punctuation or capitalisation. Any suggestions on how to receive the NBest results (list of words) with correct capitalisation and punctuation?

Current behaviour:

                    "NBest": [
                        {
                            "Confidence": 0.812473,
                            "Lexical": "hello this is jeff",
                            "ITN": "hello this is jeff",
                            "MaskedITN": "Hello this is Jeff",
                            "Display": "Hello, this is Jeff.",
                            "Sentiment": {
                                "Negative": 0.280599,
                                "Neutral": 0.719372,
                                "Positive": 0.0,
                            },
                            "Words": [
                                {
                                    "Word": "hello",
                                    "Offset": 1000000,
                                    "Duration": 3700000,
                                    "OffsetInSeconds": 0.1,
                                    "DurationInSeconds": 3.7,
                                    "Confidence": 0.882375,
                                },
                                {
                                    "Word": "this",
                                    "Offset": 38000000,
                                    "Duration": 4000000,
                                    "OffsetInSeconds": 3.8,
                                    "DurationInSeconds": 0.4,
                                    "Confidence": 0.944702,
                                },
                                {
                                    "Word": "is",
                                    "Offset": 42000000,
                                    "Duration": 1000000,
                                    "OffsetInSeconds": 4.2,
                                    "DurationInSeconds": 0.1,
                                    "Confidence": 0.900596,
                                },
                                {
                                    "Word": "jeff",
                                    "Offset": 43000000,
                                    "Duration": 4000000,
                                    "OffsetInSeconds": 4.3,
                                    "DurationInSeconds": 0.4,
                                    "Confidence": 0.900095,
                                },
                            ],
                        }
                    ],

What I want:

                    "NBest": [
                        {
                            "Confidence": 0.812473,
                            "Lexical": "hello this is jeff",
                            "ITN": "hello this is jeff",
                            "MaskedITN": "Hello this is Jeff",
                            "Display": "Hello, this is Jeff.",
                            "Sentiment": {
                                "Negative": 0.280599,
                                "Neutral": 0.719372,
                                "Positive": 0.0,
                            },
                            "Words": [
                                {
                                    "Word": "Hello,",
                                    "Offset": 1000000,
                                    "Duration": 3700000,
                                    "OffsetInSeconds": 0.1,
                                    "DurationInSeconds": 3.7,
                                    "Confidence": 0.882375,
                                },
                                {
                                    "Word": "this",
                                    "Offset": 38000000,
                                    "Duration": 4000000,
                                    "OffsetInSeconds": 3.8,
                                    "DurationInSeconds": 0.4,
                                    "Confidence": 0.944702,
                                },
                                {
                                    "Word": "is",
                                    "Offset": 42000000,
                                    "Duration": 1000000,
                                    "OffsetInSeconds": 4.2,
                                    "DurationInSeconds": 0.1,
                                    "Confidence": 0.900596,
                                },
                                {
                                    "Word": "Jeff.",
                                    "Offset": 43000000,
                                    "Duration": 4000000,
                                    "OffsetInSeconds": 4.3,
                                    "DurationInSeconds": 0.4,
                                    "Confidence": 0.900095,
                                },
                            ],
                        }
                    ],

These are the properties that I’m using:

        properties = {
            "ProfanityFilterMode": "None",
            "PunctuationMode": "Automatic",
            "AddWordLevelTimestamps": "True",
            "AddDiarization": "True",
        }

Thank you in advance,

Marina Haack

Created 2 years ago. Comments: 6 (2 by maintainers)

Top GitHub Comments

nk412 commented, Apr 22, 2021

This seems to be the same issue as described here https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/649

Have there been any updates on this, and/or is this available in the 3.0 API? Adding custom code to split up the display text and map it to the list of words and timestamps is possible, but it’s not ideal. In particular, the words available in NBest are split at apostrophes in proper nouns (Putin's -> [Putin, 's]).
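
For readers who need the workaround in the meantime, here is a rough sketch of the "custom code" approach mentioned above. It is not part of the Speech service or its SDK: the function name attach_display_forms and its alignment rule are made up for illustration. It assumes the Display text and the Words list differ only in casing, ASCII punctuation, and apostrophe splits, so it gives up (returns None) as soon as the two diverge, for example when ITN rewrites "two" as "2".

```python
import string


def _normalize(token):
    """Lowercase a token and drop ASCII punctuation for comparison."""
    return "".join(ch for ch in token.lower() if ch not in string.punctuation)


def attach_display_forms(display, words):
    """Map whitespace-separated tokens of `display` onto entries of `words`.

    Hypothetical helper, not an SDK function. Consecutive word entries are
    merged when their concatenation matches one display token
    (e.g. "Putin" + "'s" -> "Putin's"). Returns None if alignment fails.
    """
    result = []
    i = 0
    for token in display.split():
        target = _normalize(token)
        if not target:
            # Token is pure punctuation: glue it onto the previous word.
            if result:
                result[-1]["Word"] += token
            continue
        merged = ""
        start = i
        # Consume lexical words until they cover the display token.
        while i < len(words) and len(merged) < len(target):
            merged += _normalize(words[i]["Word"])
            i += 1
        if merged != target:
            return None
        first, last = words[start], words[i - 1]
        result.append({
            "Word": token,
            "Offset": first["Offset"],
            # Span from the start of the first word to the end of the last.
            "Duration": last["Offset"] + last["Duration"] - first["Offset"],
        })
    # Every lexical word must have been consumed for the alignment to hold.
    return result if i == len(words) else None


if __name__ == "__main__":
    sample_words = [
        {"Word": "hello", "Offset": 1000000, "Duration": 3700000},
        {"Word": "this", "Offset": 38000000, "Duration": 4000000},
        {"Word": "is", "Offset": 42000000, "Duration": 1000000},
        {"Word": "jeff", "Offset": 43000000, "Duration": 4000000},
    ]
    for entry in attach_display_forms("Hello, this is Jeff.", sample_words):
        print(entry)
```

Returning None instead of guessing keeps the timestamps trustworthy: a caller can fall back to the lexical Words list for any utterance that does not align cleanly.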

chlandsi commented, Aug 22, 2022

With the 3.1 version of the API (currently in preview) you can request word-level timestamps on the display form with the displayFormWordLevelTimestampsEnabled property.
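
As an illustration of the comment above, a request body for creating a v3.1 batch transcription might be assembled roughly like this. Only displayFormWordLevelTimestampsEnabled comes from this thread; the other field names (contentUrls, locale, displayName, wordLevelTimestampsEnabled, punctuationMode, profanityFilterMode) are assumptions based on the shape of the v3.x batch API and should be checked against the official reference. The helper name and the audio URL are placeholders.

```python
import json


def build_transcription_request(content_urls, locale="en-US"):
    """Build a JSON-serialisable body for a v3.1 batch transcription.

    Hypothetical helper; apart from displayFormWordLevelTimestampsEnabled,
    the field names are assumptions to verify against the API reference.
    """
    return {
        "contentUrls": list(content_urls),
        "locale": locale,
        "displayName": "Word-level display timestamps",
        "properties": {
            # Assumed v3.x counterpart of AddWordLevelTimestamps.
            "wordLevelTimestampsEnabled": True,
            # Property named in the maintainer's comment.
            "displayFormWordLevelTimestampsEnabled": True,
            "punctuationMode": "Automatic",
            "profanityFilterMode": "None",
        },
    }


if __name__ == "__main__":
    # Placeholder URL; replace with a location the service can read.
    body = build_transcription_request(["https://example.com/audio.wav"])
    print(json.dumps(body, indent=2))
```

The sketch stops at building the body on purpose: the endpoint, region, and authentication header depend on the Speech resource in use.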
