STT for es-MX keeps timing out (westeurope)
I have a web app that uses 22 of the languages available on Azure Speech Services.
They all use the same code and all work fine, except for es-MX: I keep getting timeouts for this language. I don’t know when this started (I suspect this morning).
Here is the most minimal script I could come up with:
"""
Run speech-to-text on Azure Speech using Python, from the CLI.

    python3.11 minitests/azure_stt.py minitests/minitests_data/hola-que-tal.wav es-ES
"""
import argparse
import base64
import json
import os

import dotenv
import requests

dotenv.load_dotenv()


def assess_pronunciation_via_azure_rest(audio_filepath, language, reference_text=None):
    def get_chunk(audio_source, chunk_size=1024):
        # Yield the audio in small chunks so that requests streams the upload.
        while True:
            chunk = audio_source.read(chunk_size)
            if not chunk:
                break
            yield chunk

    url = get_azure_rest_assessment_endpoint(language)
    headers = get_azure_rest_assessment_headers(reference_text)
    print('language:', language)
    print('url:', url)
    print('headers:', headers)
    # Open in binary read mode ('rb'); it won't work in 'wb'.
    # The context manager closes the file even when the request fails.
    with open(audio_filepath, 'rb') as audio_file:
        try:
            rest_api_response = requests.post(
                url=url,
                headers=headers,
                data=get_chunk(audio_file),
                timeout=60,  # in seconds
            )
        except requests.Timeout:
            return 'timeout'
        except requests.ConnectionError:
            return 'ConnectionError'
    return rest_api_response.json()


def get_azure_rest_assessment_endpoint(language):
    profanity = 'masked'  # One of: 'raw', 'masked', 'removed'. But does not do anything. TODO(): report bug.
    region = os.getenv("AZURE_SPEECH_API_REGION")
    endpoint = f"https://{region}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language={language}&profanity={profanity}"
    return endpoint


def get_azure_rest_assessment_headers(reference_text):
    api_key = os.getenv("AZURE_SPEECH_API_KEY")
    return {
        'Accept': 'application/json;text/xml',
        'Connection': 'Keep-Alive',
        'Content-Type': 'audio/wav; codecs=audio/pcm; samplerate=16000',
        'Ocp-Apim-Subscription-Key': api_key,
        'Pronunciation-Assessment': get_azure_rest_assessment_parameters(reference_text),
        'Transfer-Encoding': 'chunked',
        'Expect': '100-continue',
    }


def get_azure_rest_assessment_parameters(text):
    parameters = {
        "Dimension": "Comprehensive",
        "EnableMiscue": True,
        "GradingSystem": "HundredMark",
        "Granularity": "Phoneme",
        "nbestPhonemeCount": 3,
        "PhonemeAlphabet": "IPA",
    }
    if text is not None:
        parameters['ReferenceText'] = text
    # The header value is the JSON parameters, base64-encoded and decoded back to a str.
    pronunciation_assessment_parameters_json = json.dumps(parameters)
    pronunciation_assessment_parameters_base64 = base64.b64encode(bytes(pronunciation_assessment_parameters_json, 'utf-8'))
    pronunciation_assessment_parameters_utf8 = str(pronunciation_assessment_parameters_base64, "utf-8")
    return pronunciation_assessment_parameters_utf8


def main():
    parser = argparse.ArgumentParser(description="Assess pronunciation using Azure REST API")
    parser.add_argument("audio_filepath", help="Path to the audio file")
    parser.add_argument("language", help="Language of the pronunciation")
    parser.add_argument("--reference_text", help="Reference text (optional)", default=None)
    args = parser.parse_args()
    result = assess_pronunciation_via_azure_rest(args.audio_filepath, args.language, args.reference_text)
    print("Assessment Result:", result)


if __name__ == "__main__":
    main()
I use AZURE_SPEECH_API_REGION=westeurope.
Again, it works fine for me for various languages, but keeps timing out for es-MX.
Can someone please have a look?
This is urgent & important as it breaks any STT for es-MX and the app is in production.
Thank you!
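To narrow down a report like this one, it can help to send the same audio through several locales and record which ones time out. The following is a minimal sketch, not part of the original script: `compare_locales` and `fake_recognize` are hypothetical names, and the `recognize` callable is assumed to follow the same convention as `assess_pronunciation_via_azure_rest` above, that is, it returns the string 'timeout' or 'ConnectionError' on failure and the parsed JSON otherwise.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


def compare_locales(
    recognize: Callable[[str, str], object],
    audio_filepath: str,
    locales: Iterable[str],
) -> Dict[str, Tuple[str, float]]:
    """Call `recognize` once per locale and record (outcome, elapsed seconds)."""
    results = {}
    for locale in locales:
        start = time.monotonic()
        response = recognize(audio_filepath, locale)
        elapsed = time.monotonic() - start
        # Assumption: failures are signalled by plain strings, as in the script above.
        if response in ("timeout", "ConnectionError"):
            outcome = response
        else:
            outcome = "ok"
        results[locale] = (outcome, elapsed)
    return results


def fake_recognize(audio_filepath: str, language: str) -> object:
    """Hypothetical stand-in for the real request, so the sketch runs offline."""
    if language == "es-MX":
        return "timeout"
    return {"RecognitionStatus": "Success"}


if __name__ == "__main__":
    report = compare_locales(fake_recognize, "hola-que-tal.wav", ["es-ES", "es-MX"])
    for locale, (outcome, elapsed) in report.items():
        print(f"{locale}: {outcome} ({elapsed:.2f}s)")
```

Passing `assess_pronunciation_via_azure_rest` in place of `fake_recognize` would run the comparison against the real service, and the per-locale timings could then be attached to the bug report.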
Issue Analytics
- State:
- Created a month ago
- Comments:8 (5 by maintainers)
Top GitHub Comments
I changed the pronunciation assessment code to check for none:
It seems to work now. This could be a bug in the service and I would expect the same behavior for all languages. I will contact the service team and file a bug.
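The snippet that followed "check for none:" was not preserved on this page. One possible reading, offered purely as an assumption and not as the maintainer's actual change, is that the `Pronunciation-Assessment` header is only sent when a reference text is supplied. A sketch of that idea, using a hypothetical helper named `build_headers`:

```python
import base64
import json
from typing import Dict, Optional


def build_headers(api_key: str, reference_text: Optional[str] = None) -> Dict[str, str]:
    """Build request headers; add pronunciation assessment only when text is given."""
    headers = {
        "Accept": "application/json;text/xml",
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Ocp-Apim-Subscription-Key": api_key,
    }
    # Assumption: without a reference text, plain speech-to-text is wanted,
    # so the Pronunciation-Assessment header is left out entirely.
    if reference_text is not None:
        parameters = {
            "Dimension": "Comprehensive",
            "EnableMiscue": True,
            "GradingSystem": "HundredMark",
            "Granularity": "Phoneme",
            "ReferenceText": reference_text,
        }
        encoded = base64.b64encode(json.dumps(parameters).encode("utf-8"))
        headers["Pronunciation-Assessment"] = encoded.decode("ascii")
    return headers


if __name__ == "__main__":
    print(sorted(build_headers("dummy-key")))
    print(sorted(build_headers("dummy-key", "hola, ¿qué tal?")))
```

Whether this matches the workaround the maintainer tested is unknown; it only illustrates one way to "check for None" before attaching the assessment parameters.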
I’m going to close this issue for now since it seems there is a work around. If this is still an issue for you, feel free to re-open it.
@fabswt The issue has been fixed. You can retry your original code.