
I am using a pretrained model from

https://huggingface.co/speechbrain/asr-transformer-transformerlm-librispeech

with EncoderDecoderASR to decode the test-clean and test-other datasets from LibriSpeech on an NVIDIA V100 GPU with a batch size of 2.
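For reference, the setup described above can be sketched roughly as follows with SpeechBrain's EncoderDecoderASR interface. This is a minimal sketch, assuming a CUDA device is available; the actual `sp-main.py` and its hparams may differ.

```python
# Minimal sketch of loading the pretrained model and decoding one batch.
import torch
from speechbrain.pretrained import EncoderDecoderASR

asr_model = EncoderDecoderASR.from_hparams(
    source="speechbrain/asr-transformer-transformerlm-librispeech",
    savedir="pretrained_models/asr-transformer-transformerlm-librispeech",
    run_opts={"device": "cuda"},  # the V100 GPU mentioned above
)

# wavs: padded waveforms of shape (batch, time); wav_lens: relative lengths in [0, 1]
wavs = torch.randn(2, 16000)          # batch size of 2; 1 s of 16 kHz audio as a stand-in
wav_lens = torch.tensor([1.0, 1.0])
words, tokens = asr_model.transcribe_batch(wavs, wav_lens)
```

(The block is illustrative only: it downloads the model on first use and needs a GPU, so it is not meant to reproduce the timings below by itself.)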

There are 1310 batches in total in test-clean and the following shows the timestamps of the decoding process for test-clean:

2021-08-12 12:29:41,598 INFO [sp-main.py:56] Decode test-clean started
2021-08-12 12:29:41,602 INFO [sp-main.py:63] Processing 0/1310
2021-08-12 12:29:46,227 INFO [sp-main.py:63] Processing 10/1310
2021-08-12 12:29:52,860 INFO [sp-main.py:63] Processing 20/1310
2021-08-12 12:30:00,492 INFO [sp-main.py:63] Processing 30/1310
2021-08-12 12:30:08,520 INFO [sp-main.py:63] Processing 40/1310
2021-08-12 12:30:16,347 INFO [sp-main.py:63] Processing 50/1310
2021-08-12 12:30:24,998 INFO [sp-main.py:63] Processing 60/1310
2021-08-12 12:30:33,016 INFO [sp-main.py:63] Processing 70/1310
2021-08-12 12:30:41,081 INFO [sp-main.py:63] Processing 80/1310
2021-08-12 12:30:49,583 INFO [sp-main.py:63] Processing 90/1310
2021-08-12 12:30:58,717 INFO [sp-main.py:63] Processing 100/1310
... ...

2021-08-12 12:50:06,324 INFO [sp-main.py:63] Processing 650/1310
2021-08-12 12:50:44,209 INFO [sp-main.py:63] Processing 660/1310
2021-08-12 12:51:21,527 INFO [sp-main.py:63] Processing 670/1310
2021-08-12 12:51:59,841 INFO [sp-main.py:63] Processing 680/1310
2021-08-12 12:52:40,108 INFO [sp-main.py:63] Processing 690/1310
2021-08-12 12:53:21,158 INFO [sp-main.py:63] Processing 700/1310
2021-08-12 12:54:00,136 INFO [sp-main.py:63] Processing 710/1310
2021-08-12 12:54:41,609 INFO [sp-main.py:63] Processing 720/1310

The waves in the test-clean dataset are sorted by duration in ascending order. You can see that the processing time per 10 batches increases from about 8 seconds to about 40 seconds, and later batches take even longer since they contain longer waves.
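The per-10-batch timings quoted above can be read off directly from the log timestamps, e.g. with a small stdlib-only helper:

```python
from datetime import datetime

def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two log timestamps like '2021-08-12 12:29:41,598'."""
    fmt = "%Y-%m-%d %H:%M:%S,%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()

# Early batches (90 -> 100): roughly 9 s per 10 batches
print(elapsed_seconds("2021-08-12 12:30:49,583", "2021-08-12 12:30:58,717"))  # ~9.1
# Later batches (700 -> 710): roughly 39 s per 10 batches
print(elapsed_seconds("2021-08-12 12:53:21,158", "2021-08-12 12:54:00,136"))  # ~39.0
```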

Could you share the information on the decoding speed of speechbrain on this particular pre-trained model with the test-clean dataset?


[EDITED]: The decoding time for later batches is:

2021-08-12 14:25:35,987 INFO [sp-main.py:63] Processing 1240/1310
2021-08-12 14:29:52,408 INFO [sp-main.py:63] Processing 1250/1310
2021-08-12 14:34:54,443 INFO [sp-main.py:63] Processing 1260/1310
2021-08-12 14:40:36,922 INFO [sp-main.py:63] Processing 1270/1310
2021-08-12 14:46:25,953 INFO [sp-main.py:63] Processing 1280/1310
2021-08-12 14:52:52,305 INFO [sp-main.py:63] Processing 1290/1310
2021-08-12 15:01:14,249 INFO [sp-main.py:63] Processing 1300/1310

You can see that it takes several minutes for 10 batches of long waves.

Its WER is

2021-08-12 15:12:34,329 INFO [utils.py:190] [test-clean] %WER 2.52% [1323 / 52576, 176 ins, 121 del, 1026 sub ]
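As a sanity check, the error counts in that log line add up as expected (insertions + deletions + substitutions = total errors, and WER = errors / reference words):

```python
ins, dels, subs, ref_words = 176, 121, 1026, 52576
errors = ins + dels + subs
wer = 100.0 * errors / ref_words
print(errors, round(wer, 2))  # 1323 2.52, matching the log line
```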

The decoding time for the test-clean dataset is about 2 hours and 42 minutes (from 12:29:41 to 15:12:34).

From the paper http://www.danielpovey.com/files/2015_icassp_librispeech.pdf, the test-clean dataset contains 5.4 hours of data, so the RTF is roughly

2 hours 42 minutes / 5.4 hours = 162 min / (5.4 × 60) min = 162 / 324 = 0.5
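The same RTF arithmetic as a tiny helper (RTF = decoding time divided by audio duration):

```python
def rtf(decode_minutes: float, audio_hours: float) -> float:
    """Real-time factor: decoding time / audio duration, both in minutes."""
    return decode_minutes / (audio_hours * 60.0)

print(rtf(162, 5.4))  # 0.5 for test-clean
```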

The decoding log for test-other is

2021-08-12 15:12:34,539 INFO [sp-main.py:56] Decode test-other started
2021-08-12 15:12:34,545 INFO [sp-main.py:63] Processing 0/1470
2021-08-12 15:12:39,227 INFO [sp-main.py:63] Processing 10/1470
2021-08-12 15:12:45,459 INFO [sp-main.py:63] Processing 20/1470
2021-08-12 15:12:51,369 INFO [sp-main.py:63] Processing 30/1470
2021-08-12 15:12:59,177 INFO [sp-main.py:63] Processing 40/1470
2021-08-12 15:13:06,160 INFO [sp-main.py:63] Processing 50/1470
2021-08-12 15:13:13,831 INFO [sp-main.py:63] Processing 60/1470
...
...

2021-08-12 16:51:48,488 INFO [sp-main.py:63] Processing 1380/1470
2021-08-12 16:54:58,199 INFO [sp-main.py:63] Processing 1390/1470
2021-08-12 16:58:27,433 INFO [sp-main.py:63] Processing 1400/1470
2021-08-12 17:01:48,928 INFO [sp-main.py:63] Processing 1410/1470
2021-08-12 17:05:51,573 INFO [sp-main.py:63] Processing 1420/1470
2021-08-12 17:09:57,810 INFO [sp-main.py:63] Processing 1430/1470
2021-08-12 17:14:56,628 INFO [sp-main.py:63] Processing 1440/1470
2021-08-12 17:20:34,213 INFO [sp-main.py:63] Processing 1450/1470
2021-08-12 17:27:10,278 INFO [sp-main.py:63] Processing 1460/1470

Its WER is

2021-08-12 17:37:47,898 INFO [utils.py:190] [test-other] %WER 5.94% [3107 / 52343, 405 ins, 285 del, 2417 sub ]

Per the same paper, the test-other dataset contains 5.1 hours of data, and the decoding time is about 2 hours 25 minutes (from 15:12:34 to 17:37:47), so the RTF for test-other is roughly

2 hours 25 minutes / 5.1 hours = 145 min / (5.1 × 60) min = 145 / 306 ≈ 0.47

[EDITED AGAIN]

Here is the GPU memory usage with a batch size of 2:

+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE...  On   | 00000000:3E:00.0 Off |                    0 |
| N/A   70C    P0   164W / 250W |  27158MiB / 32510MiB |     97%      Default |
+-------------------------------+----------------------+----------------------+

It causes OOM if I use a batch size of 10.


Top GitHub Comments

csukuangfj commented on Aug 17, 2021:

The code for reproducing these results is available at

https://github.com/csukuangfj/k2_decoding_benchmark/blob/master/librispeech/sp-main.py

(at commit fb12cee562da1f802e4c05ebfb27d4589ff88b64). Run it with:

python3 ./sp-main.py ./sp-params.yaml

TParcollet commented on Jan 8, 2022.
