Try messing with buffer size?


https://github.com/lhotse-speech/lhotse/blob/b05e344bbaf97db316b4c81d7c6d02b068198a59/lhotse/audio.py#L1703

@csukuangfj this is related to the slowness of reading ffmpeg data in https://github.com/k2-fsa/icefall/pull/312. I am trying to debug it by running strace on your ffmpeg commands, like this (as you):

for p in $(ps -u kuangfangjun | grep ffmpeg | tail -n 1 | awk '{print $1}'); do strace -T -p $p; done >& foo

To see which system calls are slow, we can do:

awk '{print $NF, $0}' < foo | sort -r | less
# output is:
<0.014715> write(1, "\355\223H<X2^<\303\n\330<.VH=\353Z\231=\203H\305=\216j\341=\366f\347="..., 1280) = 1280 <0.014715>
<0.007298> write(1, "\332\r)=p9,=>\337\t=\200\335\336<L\331\352<&\222\"=`hh=(\360\233="..., 1280) = 1280 <0.007298>
<0.000146> fstat(3, {st_mode=S_IFREG|0664, st_size=9641836, ...}) = 0 <0.000146>
<0.000099> write(2, "ffmpeg version 3.4.8-0ubuntu0.2", 31) = 31 <0.000099>
<0.000086> openat(AT_FDCWD, "/ceph-fj/fangjun/open-source-2/icefall-pruned-multi-datasets/egs/librispeech/ASR/download/GigaSpeech/audio/podcast/P0036/POD0000003582.opus", O_RDONLY) = 3 <0.000086>
<0.000055> write(1, "lW\20\272\202\372\33\272\236\t\4\272\372H\221\271\36 \313\271\200r\374\271\35#\17\272n\30\371\271"..., 1280) = 1280 <0.000055>
<0.000055> getdents(3, /* 72 entries */, 32768)    = 2208 <0.000055>
<0.000052> write(2, " Copyright (c) 2000-2020 the FFm"..., 46) = 46 <0.000052>
<0.000047> write(1, "*\3314>\313\4\21>\253W\221=\270\350\205\273\244\372\256\275\334 \31\2760\27I\276^\26m\276"..., 1280) = 1280 <0.000047>
<0.000046> write(1, "0\322i:X\t\2639\246\320\220\271/uz\272\n\244\325\272\226\360\344\272p~\264\272\222G\374\271"..., 1280) = 1280 <0.000046>
<0.000045> write(1, "t\203\354=;\250\320= z\334=\337\354\262=V\304\216=\336\222\233=Ia}=q\354o="..., 1280) = 1280 <0.000045>
<0.000044> write(1, "\362\1F:\252\225\3669NL\227:Zz\3;\300\314\332:\270e\3629{dK\271\276p5\272"..., 1280) = 1280 <0.000044>
<0.000043> write(1, "n\371\23\274\232\v\n<^\30\316<R\242\3\274\36\341\344\274`\304\254<,X\263<\324D\300\274"..., 1280) = 1280 <0.000043>
<0.000043> write(1, "\355\336\323\274\27\333s=\370\302\343\273\316p\240\275\350\204\205=\205\257\305\274\326\277C\273\266G\331\274"..., 1280) = 1280 <0.000043>
<0.000042> write(2, "  configuration: --prefix=/usr -"..., 1098) = 1098 <0.000042>
<0.000040> write(1, "\230^<\276\236cV\276\264ys\276\314<\213\276\326I\232\276bM\250\276\7\212\266\276\17\211\302\276"..., 1280) = 1280 <0.000040>
<0.000040> write(1, "\202n\244;\34\370\217\274vF\31\275\246\32\30\275\323\31f\274\234\231\212<\242{\345<\342\270\303<"..., 1280) = 1280 <0.000040>
<0.000039> write(1, "|W\0<\"f6\273\273Tb\274b\2550\274\4\371\364\272\32K\265<\214$\3=\neD="..., 1280) = 1280 <0.000039>
<0.000039> write(1, "z\266\357=O\365\361=\36O\351=2\241\312=+\347\204=.\221\203<\262\3554\275\322\313\266\275"..., 1280) = 1280 <0.000039>
<0.000039> write(1, "\230f\350\275\303\216\17\276\10\312&\276\232\350\26\276R\243\375\275\250\264\5\276\276\2\364\275b\350\10\276"..., 1280) = 1280 <0.000039>
<0.000038> write(2, "  libavcodec     57.107.100 / 57"..., 41) = 41 <0.000038>

… anyway, it seems that writing takes longer than reading, i.e. ffmpeg spends longer waiting to output data than to read data. The “slow writes” generally happen once or twice per ffmpeg invocation, and I expect they correspond to a buffer filling up AND the Python program it’s writing to happening to be busy with something it is hard to wake up from. Now, the bufsize arg to subprocess.run (one of the generic kwargs passed through to Popen, not specifically listed in the docs) defaults to -1, which means io.DEFAULT_BUFFER_SIZE, which seems to be 8192. However, I don’t see any obvious periodicity in how long the write syscalls take that would correspond to that buffer size. This particular ffmpeg call outputs about 500k bytes. One way to make this a little faster might be to just add bufsize=2000000 to the subprocess.run() call in read_opus_ffmpeg(); that would buffer all the output so ffmpeg never has to wait on the Python program that’s calling it.
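For reference, here is a minimal sketch of what the suggested change looks like. This is not the actual lhotse code: the function name read_opus_with_large_buffer, the exact ffmpeg arguments, and the mono/f32le output format are assumptions made for illustration; only the bufsize=2000000 idea comes from the comment above.

import subprocess

import numpy as np


def read_opus_with_large_buffer(path: str, sampling_rate: int = 16000) -> np.ndarray:
    """Decode an opus file with ffmpeg, buffering all of its stdout.

    Passing bufsize=2_000_000 (larger than the ~500 kB this ffmpeg call
    produces) is intended to keep ffmpeg from blocking on a small default
    buffer while the calling Python process is busy elsewhere.
    """
    cmd = [
        "ffmpeg",
        "-i", path,
        "-ar", str(sampling_rate),  # resample to the target rate
        "-ac", "1",                 # downmix to mono
        "-f", "f32le",              # raw 32-bit float samples
        "pipe:1",                   # write decoded audio to stdout
    ]
    proc = subprocess.run(
        cmd,
        check=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        bufsize=2_000_000,  # default is -1, i.e. io.DEFAULT_BUFFER_SIZE (8192)
    )
    return np.frombuffer(proc.stdout, dtype=np.float32)

The only change relative to a plain subprocess.run() call is the bufsize keyword, which run() forwards to Popen.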

Issue Analytics

  • State: open
  • Created: a year ago
  • Reactions: 3
  • Comments: 38 (12 by maintainers)

Top GitHub Comments

1 reaction
csukuangfj commented on Apr 17, 2022

Here is the training log using pre-computed features on a machine with 20 CPUs and 2 dataloader workers.

2022-04-17 16:11:34,444 INFO [train.py:1069] (0/8) Sanity check -- see if any of the batches in epoch 0 would cause OOM.
2022-04-17 16:12:51,051 INFO [train.py:794] (0/8) Epoch 0, batch 50, libri_loss[loss=0.5503, simple_loss=1.101, pruned_loss=6.998, over 7279.00 frames.], tot_loss[loss=1.093, simple_loss=2.185, pruned_loss=7.294, over 321403.98 frames.], libri_tot_loss[loss=0.6873, simple_loss=1.375, pruned_loss=6.659, over 168925.85 frames.], giga_tot_loss[loss=1.523, simple_loss=3.046, pruned_loss=7.905, over 172574.28 frames.], batch size: 18, lr: 3.00e-03
2022-04-17 16:13:27,281 INFO [train.py:794] (0/8) Epoch 0, batch 100, giga_loss[loss=0.5442, simple_loss=1.088, pruned_loss=7.734, over 7368.00 frames.], tot_loss[loss=0.7829, simple_loss=1.566, pruned_loss=7.314, over 573458.97 frames.], libri_tot_loss[loss=0.5878, simple_loss=1.176, pruned_loss=6.812, over 318665.67 frames.], giga_tot_loss[loss=1.023, simple_loss=2.047, pruned_loss=7.798, over 326122.50 frames.], batch size: 44, lr: 3.00e-03
2022-04-17 16:14:07,214 INFO [train.py:794] (0/8) Epoch 0, batch 150, libri_loss[loss=0.4393, simple_loss=0.8785, pruned_loss=7.002, over 7364.00 frames.], tot_loss[loss=0.6557, simple_loss=1.311, pruned_loss=7.312, over 766852.85 frames.], libri_tot_loss[loss=0.5374, simple_loss=1.075, pruned_loss=6.863, over 442659.11 frames.], giga_tot_loss[loss=0.8235, simple_loss=1.647, pruned_loss=7.733, over 466818.36 frames.], batch size: 19, lr: 3.00e-03
2022-04-17 16:14:49,728 INFO [train.py:794] (0/8) Epoch 0, batch 200, giga_loss[loss=0.4717, simple_loss=0.9434, pruned_loss=7.073, over 7393.00 frames.], tot_loss[loss=0.58, simple_loss=1.16, pruned_loss=7.249, over 917740.82 frames.], libri_tot_loss[loss=0.4997, simple_loss=0.9994, pruned_loss=6.922, over 575446.90 frames.], giga_tot_loss[loss=0.7285, simple_loss=1.457, pruned_loss=7.608, over 567927.20 frames.], batch size: 146, lr: 3.00e-03
2022-04-17 16:16:29,609 INFO [train.py:794] (0/8) Epoch 0, batch 250, libri_loss[loss=0.4674, simple_loss=0.9347, pruned_loss=7.239, over 7227.00 frames.], tot_loss[loss=0.5359, simple_loss=1.072, pruned_loss=7.217, over 1038589.06 frames.], libri_tot_loss[loss=0.4823, simple_loss=0.9647, pruned_loss=6.972, over 669858.64 frames.], giga_tot_loss[loss=0.6521, simple_loss=1.304, pruned_loss=7.486, over 683353.00 frames.], batch size: 21, lr: 3.00e-03
2022-04-17 16:17:10,028 INFO [train.py:794] (0/8) Epoch 0, batch 300, giga_loss[loss=0.5612, simple_loss=1.122, pruned_loss=7.264, over 7471.00 frames.], tot_loss[loss=0.5087, simple_loss=1.017, pruned_loss=7.184, over 1129025.56 frames.], libri_tot_loss[loss=0.4679, simple_loss=0.9358, pruned_loss=6.988, over 757952.98 frames.], giga_tot_loss[loss=0.6097, simple_loss=1.219, pruned_loss=7.412, over 776759.88 frames.], batch size: 69, lr: 3.00e-03
2022-04-17 16:17:45,481 INFO [train.py:794] (0/8) Epoch 0, batch 350, libri_loss[loss=0.3962, simple_loss=0.7924, pruned_loss=7.02, over 7434.00 frames.], tot_loss[loss=0.4873, simple_loss=0.9746, pruned_loss=7.146, over 1204571.91 frames.], libri_tot_loss[loss=0.4553, simple_loss=0.9107, pruned_loss=6.988, over 827137.93 frames.], giga_tot_loss[loss=0.5739, simple_loss=1.148, pruned_loss=7.336, over 872441.52 frames.], batch size: 20, lr: 3.00e-03

Note that it takes only about 5 minutes to process 350 batches, which is very close to the training time of the reworked model from Dan.

1 reaction
danpovey commented on Apr 16, 2022

Great! And such a simple fix!
