STT for es-MX keeps timing out (westeurope)

I have a webapp that uses 22 languages available on Azure Speech Services.

They all use the same code and all work fine, except for es-MX. I keep getting timeouts for this language. I don’t know when this started (I suspect this morning).

Here is as minimal a script as I could come up with:

"""
Run speech-to-text on Azure Speech using Python, from the CLI.

python3.11 minitests/azure_stt.py minitests/minitests_data/hola-que-tal.wav es-ES
"""

import argparse
import requests
import base64
import json
import os

import dotenv
dotenv.load_dotenv()



def assess_pronunciation_via_azure_rest(audio_filepath, language, reference_text=None):
    def get_chunk(audio_source, chunk_size=1024):
        while True:
            chunk = audio_source.read(chunk_size)
            if not chunk:
                break
            yield chunk

    url = get_azure_rest_assessment_endpoint(language)
    headers = get_azure_rest_assessment_headers(reference_text)

    print('language:', language)
    print('url:', url)
    print('headers:', headers)

    # Open in binary read mode; the context manager closes the file even if the request fails.
    with open(audio_filepath, 'rb') as audio_file:
        try:
            rest_api_response = requests.post(
                url=url,
                headers=headers,
                data=get_chunk(audio_file),
                timeout=60,  # in seconds
            )
        except requests.Timeout:
            return 'timeout'
        except requests.ConnectionError:
            return 'ConnectionError'

    return rest_api_response.json()

def get_azure_rest_assessment_endpoint(language):
    profanity = 'masked' # One of: 'raw', 'masked', 'removed'. But does not do anything. TODO(): report bug.
    region = os.getenv("AZURE_SPEECH_API_REGION")
    endpoint = f"https://{region}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language={language}&profanity={profanity}"
    return endpoint

def get_azure_rest_assessment_headers(reference_text):
    api_key = os.getenv("AZURE_SPEECH_API_KEY")
    return {
        'Accept': 'application/json;text/xml',
        'Connection': 'Keep-Alive',
        'Content-Type': 'audio/wav; codecs=audio/pcm; samplerate=16000',
        'Ocp-Apim-Subscription-Key': api_key,
        'Pronunciation-Assessment': get_azure_rest_assessment_parameters(reference_text),
        'Transfer-Encoding': 'chunked',
        'Expect': '100-continue',
    }

def get_azure_rest_assessment_parameters(text):
    parameters = {
        "Dimension": "Comprehensive",
        "EnableMiscue": True,
        "GradingSystem": "HundredMark",
        "Granularity": "Phoneme",
        "nbestPhonemeCount": 3,
        "PhonemeAlphabet": "IPA",
    }
    if text is not None:
        parameters['ReferenceText'] = text

    # The header value is the JSON parameters, base64-encoded and returned as a str.
    parameters_json = json.dumps(parameters)
    parameters_base64 = base64.b64encode(parameters_json.encode('utf-8'))
    return parameters_base64.decode('utf-8')



def main():
    parser = argparse.ArgumentParser(description="Assess pronunciation using Azure REST API")
    parser.add_argument("audio_filepath", help="Path to the audio file")
    parser.add_argument("language", help="Language of the pronunciation")
    parser.add_argument("--reference_text", help="Reference text (optional)", default=None)

    args = parser.parse_args()

    result = assess_pronunciation_via_azure_rest(args.audio_filepath, args.language, args.reference_text)
    print("Assessment Result:", result)

if __name__ == "__main__":
    main()

I use AZURE_SPEECH_API_REGION=westeurope.

Again, works fine for me for various languages… But keeps timing out for es-MX.

Can someone please have a look?

This is urgent & important as it breaks any STT for es-MX and the app is in production.

Thank you!
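
To narrow down which locales are affected, the same call can be repeated per language and the outcomes compared. The following is a minimal sketch, not part of the original script: `sweep_locales` and `failing_locales` are hypothetical helper names, and `recognize` stands for any callable that takes a locale and returns either a parsed response or one of the script's error strings (`'timeout'`, `'ConnectionError'`).

```python
import time


def sweep_locales(locales, recognize):
    """Call recognize(locale) for each locale; record the outcome and elapsed seconds."""
    report = {}
    for locale in locales:
        started = time.monotonic()
        outcome = recognize(locale)
        report[locale] = (outcome, round(time.monotonic() - started, 2))
    return report


def failing_locales(report):
    """Return the locales whose outcome is one of the script's error strings."""
    errors = ("timeout", "ConnectionError")
    return [locale for locale, (outcome, _) in report.items() if outcome in errors]


if __name__ == "__main__":
    # Stand-in for the real request, so the sketch runs without network access.
    def fake_recognize(locale):
        return "timeout" if locale == "es-MX" else {"status": "ok"}

    report = sweep_locales(["es-ES", "es-MX"], fake_recognize)
    print(failing_locales(report))
```

With the script above, `recognize` could be something like `lambda locale: assess_pronunciation_via_azure_rest(path, locale)`, where `path` points at a short WAV file.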

Top GitHub Comments

chschrae commented, Aug 14, 2023

I changed the pronunciation assessment code so that the Pronunciation-Assessment header is only sent when reference_text is not None:

def get_azure_rest_assessment_headers(reference_text):
    api_key = os.getenv("AZURE_SPEECH_API_KEY")
    headers = {
        'Accept': 'application/json;text/xml',
        'Connection': 'Keep-Alive',
        'Content-Type': 'audio/wav; codecs=audio/pcm; samplerate=16000',
        'Ocp-Apim-Subscription-Key': api_key,
        'Transfer-Encoding': 'chunked',
        'Expect': '100-continue',
    }
    # Only request pronunciation assessment when a reference text was supplied.
    if reference_text is not None:
        headers['Pronunciation-Assessment'] = get_azure_rest_assessment_parameters(reference_text)
    return headers

It seems to work now. This could be a bug in the service and I would expect the same behavior for all languages. I will contact the service team and file a bug.

I’m going to close this issue for now since it seems there is a workaround. If this is still an issue for you, feel free to re-open it.
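
When debugging a case like this, it can help to decode the Pronunciation-Assessment header value back to JSON and check what is actually sent. The following is a minimal standard-library sketch; `encode_assessment_params` and `decode_assessment_header` are hypothetical helper names that mirror the encoding used in the original script, not functions from any Azure SDK.

```python
import base64
import json


def encode_assessment_params(reference_text=None):
    """Mirror the issue's encoding: JSON parameters, base64-encoded, returned as a str."""
    parameters = {
        "Dimension": "Comprehensive",
        "EnableMiscue": True,
        "GradingSystem": "HundredMark",
        "Granularity": "Phoneme",
        "nbestPhonemeCount": 3,
        "PhonemeAlphabet": "IPA",
    }
    if reference_text is not None:
        parameters["ReferenceText"] = reference_text
    return base64.b64encode(json.dumps(parameters).encode("utf-8")).decode("ascii")


def decode_assessment_header(header_value):
    """Reverse the encoding so the payload can be inspected while debugging."""
    return json.loads(base64.b64decode(header_value).decode("utf-8"))


if __name__ == "__main__":
    # Without --reference_text, the original script still sent the header,
    # but the decoded payload carries no ReferenceText field.
    payload = decode_assessment_header(encode_assessment_params())
    print("ReferenceText" in payload)
```

Decoding the value this way shows that, without a reference text, the original script sent assessment parameters with no `ReferenceText` field, which is the case the change above avoids by omitting the header.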

wangkenpu commented, Aug 17, 2023

@fabswt The issue has been fixed. You can retry your original code.