
Streaming inference mode


While I was investigating how streaming works, I noticed that the encoder encodes the entire speech before it is fed to the decoder. Based on the paper (https://arxiv.org/abs/2006.14941), I thought each encoded block is fed to the decoder as soon as it is encoded, so that the decoder does not need to wait until the entire segment is encoded. As far as I understand, this is not a streaming inference mode, right? If so, how can one run the model in streaming inference mode?

My observation is from the following lines in espnet/espnet2/bin/asr_inference.py:

# a. To device
batch = to_device(batch, device=self.device)

# b. Forward Encoder
enc, _ = self.asr_model.encode(**batch)
assert len(enc) == 1, len(enc)

# c. Passed the encoder result and the beam search
nbest_hyps = self.beam_search(
    x=enc[0], maxlenratio=self.maxlenratio, minlenratio=self.minlenratio
)
nbest_hyps = nbest_hyps[: self.nbest]
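
For contrast, the paper's block-synchronous idea would look roughly like the sketch below. This is not ESPnet code; encode_block and decoder_step are hypothetical stand-ins used only to illustrate that each block is decoded as soon as it is encoded.

import numpy as np

def encode_block(block: np.ndarray) -> np.ndarray:
    # Hypothetical stand-in for a contextual block encoder.
    return block.mean(axis=0, keepdims=True)

def decoder_step(dec_state: list, enc_block: np.ndarray) -> list:
    # Hypothetical stand-in for one block-synchronous decoding step.
    dec_state.append(enc_block)
    return dec_state

speech = np.random.randn(1600, 80)  # (frames, feature_dim), dummy features
block_size = 160                    # frames per block

dec_state: list = []
for start in range(0, len(speech), block_size):
    enc_block = encode_block(speech[start:start + block_size])
    # The decoder advances as soon as each block is encoded,
    # instead of waiting for the full utterance.
    dec_state = decoder_step(dec_state, enc_block)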

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

3 reactions
eml914 commented, May 6, 2021

Yes, you are correct. If you want to run it in a fully streaming way, you need to modify espnet/espnet2/bin/asr_inference.py or recreate it as asr_inference_streaming.py, and feed the input features chunk by chunk rather than as a whole. For your reference, a friend of mine is now developing the streaming decoder. It is still under development and has not been merged yet: https://github.com/laboroai/espnet/blob/dev/real_streaming_decoder/espnet2/bin/asr_inference_streaming.py
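
A rough sketch of what chunk-by-chunk feeding could look like is given below. The StreamingSpeech2Text class and its process_chunk method are assumptions made for illustration, not the actual interface of the linked asr_inference_streaming.py.

import numpy as np

class StreamingSpeech2Text:
    # Hypothetical streaming front-end; the real interface may differ.
    def __init__(self):
        self.received_samples = 0

    def process_chunk(self, chunk: np.ndarray, is_final: bool = False) -> str:
        # A real implementation would run the encoder on the new frames
        # and advance the block-synchronous beam search here.
        self.received_samples += len(chunk)
        return f"hypothesis after {self.received_samples} samples"

speech = np.random.randn(16000 * 5)  # dummy 5 s of 16 kHz audio
chunk_size = 8000                    # 0.5 s per chunk

s2t = StreamingSpeech2Text()
for start in range(0, len(speech), chunk_size):
    chunk = speech[start:start + chunk_size]
    is_final = start + chunk_size >= len(speech)
    hyp = s2t.process_chunk(chunk, is_final=is_final)
print(hyp)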

2 reactions
eml914 commented, Apr 30, 2021

Thank you for your comment. The paper is based on a system that encodes each chunk and then decodes synchronously. However, to fit the implementation into the current ESPnet2 pipeline, I had to aggregate all the encoded features first and feed them into the decoder. In the decoder, i.e. self.beam_search, the encoded features are split into chunks again, so that it reproduces streaming inference; self.beam_search refers to espnet/nets/batch_beam_search_online_sim.py here. So, basically, it is a simulation of streaming inference. I also used a C++ implementation for the latency evaluation in the paper, which is fully streaming (i.e. chunk-wise), but unfortunately I cannot share that code for now.
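
In other words, the simulation does something along these lines. This is only a sketch; the block size and the expand_hypotheses placeholder are assumptions, and the real block-synchronous logic lives in batch_beam_search_online_sim.py.

import numpy as np

def expand_hypotheses(hyps: list, enc_block: np.ndarray) -> list:
    # Placeholder for one block-synchronous beam-search expansion step.
    return hyps + [enc_block.shape[0]]

enc = np.random.randn(500, 256)  # encoder output for the whole utterance
block_size = 40                  # frames per simulated block

hyps: list = []
for start in range(0, enc.shape[0], block_size):
    enc_block = enc[start:start + block_size]
    # The already-computed encoder output is re-split into blocks, so the
    # decoder consumes it incrementally even though encoding was offline.
    hyps = expand_hypotheses(hyps, enc_block)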
