No punctuation and capitalisation in nbest (words) results

Hi, I’m currently using the batch transcription REST API (2.0), and I noticed that the entries inside Words have no punctuation or capitalisation. Is there a way to receive the NBest results (list of words) with correct capitalisation and punctuation?

Current behaviour:

                    "NBest": [
                        {
                            "Confidence": 0.812473,
                            "Lexical": "hello this is jeff",
                            "ITN": "hello this is jeff",
                            "MaskedITN": "Hello this is Jeff",
                            "Display": "Hello, this is Jeff.",
                            "Sentiment": {
                                "Negative": 0.280599,
                                "Neutral": 0.719372,
                                "Positive": 0.0,
                            },
                            "Words": [
                                {
                                    "Word": "hello",
                                    "Offset": 1000000,
                                    "Duration": 3700000,
                                    "OffsetInSeconds": 0.1,
                                    "DurationInSeconds": 3.7,
                                    "Confidence": 0.882375,
                                },
                                {
                                    "Word": "this",
                                    "Offset": 38000000,
                                    "Duration": 4000000,
                                    "OffsetInSeconds": 3.8,
                                    "DurationInSeconds": 0.4,
                                    "Confidence": 0.944702,
                                },
                                {
                                    "Word": "is",
                                    "Offset": 42000000,
                                    "Duration": 1000000,
                                    "OffsetInSeconds": 4.2,
                                    "DurationInSeconds": 0.1,
                                    "Confidence": 0.900596,
                                },
                                {
                                    "Word": "jeff",
                                    "Offset": 43000000,
                                    "Duration": 4000000,
                                    "OffsetInSeconds": 4.3,
                                    "DurationInSeconds": 0.4,
                                    "Confidence": 0.900095,
                                },
                            ],
                        }
                    ],

What I want:

                    "NBest": [
                        {
                            "Confidence": 0.812473,
                            "Lexical": "hello this is jeff",
                            "ITN": "hello this is jeff",
                            "MaskedITN": "Hello this is Jeff",
                            "Display": "Hello, this is Jeff.",
                            "Sentiment": {
                                "Negative": 0.280599,
                                "Neutral": 0.719372,
                                "Positive": 0.0,
                            },
                            "Words": [
                                {
                                    "Word": "Hello,",
                                    "Offset": 1000000,
                                    "Duration": 3700000,
                                    "OffsetInSeconds": 0.1,
                                    "DurationInSeconds": 3.7,
                                    "Confidence": 0.882375,
                                },
                                {
                                    "Word": "this",
                                    "Offset": 38000000,
                                    "Duration": 4000000,
                                    "OffsetInSeconds": 3.8,
                                    "DurationInSeconds": 0.4,
                                    "Confidence": 0.944702,
                                },
                                {
                                    "Word": "is",
                                    "Offset": 42000000,
                                    "Duration": 1000000,
                                    "OffsetInSeconds": 4.2,
                                    "DurationInSeconds": 0.1,
                                    "Confidence": 0.900596,
                                },
                                {
                                    "Word": "Jeff.",
                                    "Offset": 43000000,
                                    "Duration": 4000000,
                                    "OffsetInSeconds": 4.3,
                                    "DurationInSeconds": 0.4,
                                    "Confidence": 0.900095,
                                },
                            ],
                        }
                    ],

These are the properties that I’m using:

        properties = {
            "ProfanityFilterMode": "None",
            "PunctuationMode": "Automatic",
            "AddWordLevelTimestamps": "True",
            "AddDiarization": "True",
        }

Thank you in advance,

Marina Haack

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

nk412 commented, Apr 22, 2021 (1 reaction)

This seems to be the same issue as described here https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/649

Have there been any updates on this, and/or is this available in the 3.0 API? Adding custom code to split up the display text and map it to the list of words and timestamps is possible, but it’s not ideal. In particular, the words available in NBest are split on apostrophes on proper nouns (Putin's -> [Putin, 's]).
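The custom mapping mentioned above could look roughly like the following. This is only a sketch, not something the API provides: the function name `apply_display_form` is hypothetical, and it assumes the Display string differs from the Words list only in casing, punctuation, and apostrophe splits. It raises an error when inverse text normalisation changed the words themselves (for example "twenty" becoming "20").

```python
import re
from typing import Dict, List

TICKS_PER_SECOND = 10_000_000  # Offset and Duration are given in 100-nanosecond ticks


def _normalize(token: str) -> str:
    """Lowercase a token and strip everything except word characters and apostrophes."""
    return re.sub(r"[^\w']", "", token.lower())


def apply_display_form(display: str, words: List[Dict]) -> List[Dict]:
    """Return copies of `words` whose "Word" values come from the Display text.

    Hypothetical helper: lexical words that together form one display token
    (e.g. "putin" + "'s" -> "Putin's") are merged into a single entry.
    """
    result: List[Dict] = []
    i = 0
    for token in display.split():
        target = _normalize(token)
        if not target:
            continue  # stand-alone punctuation has no lexical counterpart
        # Consume lexical words until their concatenation covers the display token.
        j, joined = i, ""
        while j < len(words) and len(joined) < len(target):
            joined += _normalize(words[j]["Word"])
            j += 1
        if joined != target:
            raise ValueError(f"cannot align {token!r} near word index {i}")
        group = words[i:j]
        merged = dict(group[0])  # other fields are copied from the first word
        merged["Word"] = token
        if len(group) > 1:
            # Span from the start of the first word to the end of the last one.
            end = group[-1]["Offset"] + group[-1]["Duration"]
            merged["Duration"] = end - group[0]["Offset"]
            if "DurationInSeconds" in merged:
                merged["DurationInSeconds"] = merged["Duration"] / TICKS_PER_SECOND
        result.append(merged)
        i = j
    if i != len(words):
        raise ValueError("some lexical words were left unmatched")
    return result
```

Calling `apply_display_form(entry["Display"], entry["Words"])` on each NBest entry yields a word list in the shape requested in the question. Confidence values of merged words are not combined here; the first word's value is kept.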

chlandsi commented, Aug 22, 2022 (0 reactions)

With the 3.1 version of the API (currently in preview) you can request word-level timestamps on the display form with the displayFormWordLevelTimestampsEnabled property.
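As a rough illustration, a v3.1 request body with that property enabled might be assembled as below. Only `displayFormWordLevelTimestampsEnabled` comes from the comment above; the helper name `build_transcription_request`, the placeholder audio URL, and the remaining field names (taken to mirror the v3.x equivalents of the properties in the question) are assumptions that should be checked against the API reference before use.

```python
import json


def build_transcription_request(content_url: str, locale: str = "en-US") -> dict:
    """Build the JSON body for creating a batch transcription (hypothetical helper)."""
    return {
        "contentUrls": [content_url],
        "locale": locale,
        "displayName": "Display-form word timestamps",
        "properties": {
            # Property named in the maintainer's comment.
            "displayFormWordLevelTimestampsEnabled": True,
            # Assumed v3.x counterparts of the 2.0 properties in the question.
            "wordLevelTimestampsEnabled": True,
            "punctuationMode": "Automatic",
            "profanityFilterMode": "None",
            "diarizationEnabled": True,
        },
    }


if __name__ == "__main__":
    # Placeholder URL; replace with a real audio location.
    body = build_transcription_request("https://example.com/audio.wav")
    print(json.dumps(body, indent=2))
```

The sketch stops at building the body. Sending it requires the transcriptions endpoint for your region and a subscription key, both of which are left out here.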
