STT for es-MX keeps timing out (westeurope)

I have a webapp that uses 22 languages available on Azure Speech Services.

They all use the same code and all work fine, except for: es-MX. I keep getting timeouts for this language. I don’t know when this started (I suspect this morning.)

Here is as minimal a script as I could come up with:

"""
Run speech-to-text on Azure Speech using Python, from the CLI.

python3.11 minitests/azure_stt.py minitests/minitests_data/hola-que-tal.wav es-ES
"""

import argparse
import requests
import base64
import json
import os

import dotenv
dotenv.load_dotenv()



def assess_pronunciation_via_azure_rest(audio_filepath, language, reference_text=None):
    def get_chunk(audio_source, chunk_size=1024):
        while True:
            chunk = audio_source.read(chunk_size)
            if not chunk:
                break
            yield chunk

    url = get_azure_rest_assessment_endpoint(language)
    headers = get_azure_rest_assessment_headers(reference_text)

    print('language:', language)
    print('url:', url)
    print('headers:', headers)

    # Open in binary read mode ('rb'); the context manager closes the file
    # on every exit path, including the early returns below.
    with open(audio_filepath, 'rb') as audio_file:
        try:
            rest_api_response = requests.post(
                url=url,
                headers=headers,
                data=get_chunk(audio_file),
                timeout=60, # in seconds
            )
        except requests.Timeout:
            return 'timeout'
        except requests.ConnectionError:
            return 'ConnectionError'

    return rest_api_response.json()

def get_azure_rest_assessment_endpoint(language):
    profanity = 'masked' # One of: 'raw', 'masked', 'removed'. But does not do anything. TODO(): report bug.
    region = os.getenv("AZURE_SPEECH_API_REGION")
    endpoint = f"https://{region}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language={language}&profanity={profanity}"
    return endpoint

def get_azure_rest_assessment_headers(reference_text):
    api_key = os.getenv("AZURE_SPEECH_API_KEY")
    return {
        'Accept': 'application/json;text/xml',
        'Connection': 'Keep-Alive',
        'Content-Type': 'audio/wav; codecs=audio/pcm; samplerate=16000',
        'Ocp-Apim-Subscription-Key': api_key,
        'Pronunciation-Assessment': get_azure_rest_assessment_parameters(reference_text),
        'Transfer-Encoding': 'chunked',
        'Expect': '100-continue',
    }

def get_azure_rest_assessment_parameters(text):
    parameters = {
        "Dimension": "Comprehensive",
        "EnableMiscue": True,
        "GradingSystem": "HundredMark",
        "Granularity": "Phoneme",
        "nbestPhonemeCount": 3,
        "PhonemeAlphabet": "IPA",
    }
    if text is not None:
        parameters['ReferenceText'] = text

    pronunciation_assessment_parameters_json = json.dumps(parameters) 
    pronunciation_assessment_parameters_base64 = base64.b64encode(bytes(pronunciation_assessment_parameters_json, 'utf-8'))
    pronunciation_assessment_parameters_utf8 = str(pronunciation_assessment_parameters_base64, "utf-8")
    return pronunciation_assessment_parameters_utf8



def main():
    parser = argparse.ArgumentParser(description="Assess pronunciation using Azure REST API")
    parser.add_argument("audio_filepath", help="Path to the audio file")
    parser.add_argument("language", help="Language of the pronunciation")
    parser.add_argument("--reference_text", help="Reference text (optional)", default=None)

    args = parser.parse_args()

    result = assess_pronunciation_via_azure_rest(args.audio_filepath, args.language, args.reference_text)
    print("Assessment Result:", result)

if __name__ == "__main__":
    main()

I use AZURE_SPEECH_API_REGION=westeurope.

Again, works fine for me for various languages… But keeps timing out for es-MX.

Can someone please have a look?

This is urgent & important as it breaks any STT for es-MX and the app is in production.

Thank you!
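
One way to narrow down a locale-specific timeout like the one described above is to run the same request for several locales and record the outcome of each. The following is a minimal sketch, not part of the original report: `probe_locales`, `failing_locales` and `fake_recognize` are hypothetical names, and `fake_recognize` is a stand-in that simulates the reported behavior so the example runs without network access or credentials.

```python
import time


def probe_locales(locales, recognize):
    """Call recognize(locale) for each locale; record status and elapsed seconds.

    `recognize` is assumed to behave like the script above: it returns a dict
    on success and a short string such as 'timeout' on failure.
    """
    report = {}
    for locale in locales:
        started = time.monotonic()
        result = recognize(locale)
        elapsed = time.monotonic() - started
        # Any string result is treated as a failure label; anything else is success.
        status = result if isinstance(result, str) else "ok"
        report[locale] = (status, round(elapsed, 2))
    return report


def failing_locales(report):
    """Return the locales whose status is anything other than 'ok', sorted."""
    return sorted(locale for locale, (status, _) in report.items() if status != "ok")


def fake_recognize(locale):
    """Stand-in for the real request (assumption): only es-MX times out."""
    if locale == "es-MX":
        return "timeout"
    return {"RecognitionStatus": "Success"}


report = probe_locales(["es-ES", "es-MX", "fr-FR"], fake_recognize)
print(failing_locales(report))
```

With the script above, the stand-in could be swapped for something like `lambda locale: assess_pronunciation_via_azure_rest(path, locale)`, where `path` points at a test recording.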

Issue Analytics

  • State: closed
  • Created a month ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

chschrae commented, Aug 14, 2023 (1 reaction)

I changed the pronunciation assessment code to check for None:

def get_azure_rest_assessment_headers(reference_text):
    api_key = os.getenv("AZURE_SPEECH_API_KEY")
    headers = {
        'Accept': 'application/json;text/xml',
        'Connection': 'Keep-Alive',
        'Content-Type': 'audio/wav; codecs=audio/pcm; samplerate=16000',
        'Ocp-Apim-Subscription-Key': api_key,
        'Transfer-Encoding': 'chunked',
        'Expect': '100-continue',
    }
    # Only send the Pronunciation-Assessment header when a reference text is given.
    if reference_text is not None:
        headers['Pronunciation-Assessment'] = get_azure_rest_assessment_parameters(reference_text)
    return headers

It seems to work now. This could be a bug in the service and I would expect the same behavior for all languages. I will contact the service team and file a bug.

I’m going to close this issue for now since it seems there is a workaround. If this is still an issue for you, feel free to re-open it.
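
The workaround can be checked offline by building the headers both ways and decoding the base64 payload. This is a rough sketch under stated assumptions: `build_headers`, `build_assessment_header` and `decode_assessment_header` are hypothetical names, not functions from the thread or from any Azure SDK, and the key and reference text are dummy values. No request is sent.

```python
import base64
import json


def build_assessment_header(reference_text):
    """Base64-encode the assessment parameters, mirroring the script in the issue."""
    parameters = {
        "Dimension": "Comprehensive",
        "EnableMiscue": True,
        "GradingSystem": "HundredMark",
        "Granularity": "Phoneme",
        "nbestPhonemeCount": 3,
        "PhonemeAlphabet": "IPA",
        "ReferenceText": reference_text,
    }
    encoded = base64.b64encode(json.dumps(parameters).encode("utf-8"))
    return encoded.decode("ascii")


def build_headers(api_key, reference_text=None):
    """Build request headers; add Pronunciation-Assessment only with a reference text."""
    headers = {
        "Accept": "application/json;text/xml",
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Ocp-Apim-Subscription-Key": api_key,
    }
    if reference_text is not None:
        headers["Pronunciation-Assessment"] = build_assessment_header(reference_text)
    return headers


def decode_assessment_header(value):
    """Reverse the encoding so the parameters that would be sent can be inspected."""
    return json.loads(base64.b64decode(value))


plain = build_headers("dummy-key")  # plain speech-to-text, no assessment header
scored = build_headers("dummy-key", "hola, ¿qué tal?")
print("Pronunciation-Assessment" in plain)
print(decode_assessment_header(scored["Pronunciation-Assessment"]))
```

Decoding the header this way also makes it easy to see exactly which parameters accompany a request when comparing a working locale with a failing one.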

wangkenpu commented, Aug 17, 2023 (0 reactions)

@fabswt The issue has been fixed. You can retry your original code.
