Strange issue with subprocess from within loky
(Apologies if this has been answered before, but searching existing issues didn’t yield any obvious references.)
I’m using loky to run parallel analysis of a video clip. So far, the library has been absolutely amazing for working with OpenCV. Recently, I added an additional job to perform some specialized audio analysis of the clip in parallel alongside the video analysis. Unfortunately, and bizarrely, the audio analysis portion seems to fail when run under loky for certain files, but operates normally when run directly in the initial process.
The audio analysis invokes ffmpeg via a subprocess.Popen call and ingests the audio data found via a PIPE. When run under loky (and only on a subset of video files), the pipe formed from stdout from ffmpeg appears to return no data. At first glance, I thought this might be an issue with ffmpeg, but the stderr output appears to be the same, and the empty stdout pipe leads me to believe something is being mis-wired when loky starts the new process.
I’ve distilled the issue down in the attached file (from the audio lib I’m using): When the function testreading is passed the path of a sufficiently large video file, the following is printed when run under loky:
<_io.BufferedReader name=0>
<_io.BufferedReader name=5>
<_io.BufferedReader name=5>: 1024
<_io.BufferedReader name=5>: 1024
<_io.BufferedReader name=5>: 1024
<_io.BufferedReader name=0>: 0
<_io.BufferedReader name=5>: 82
<_io.BufferedReader name=5>: 0
As you can see, the buffered reader for stdout (fd 0) immediately returns with 0 bytes found, while stderr works as expected.
When not run under loky, both buffers return the expected data.
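Because the symptom only shows up inside loky workers, one quick check is to compare how standard input looks in the initial process and in a worker. The helper below is a minimal sketch using only the standard library; the name describe_stdin is hypothetical and is not part of loky or of the attached reproduction file.

```python
import os
import sys


def describe_stdin():
    """Return a small report on how standard input looks in this process."""
    report = {"sys_stdin_is_none": sys.stdin is None}
    try:
        # os.fstat raises OSError if file descriptor 0 is closed.
        mode = os.fstat(0).st_mode
        report["fd0_open"] = True
        report["fd0_mode"] = oct(mode)
    except OSError:
        report["fd0_open"] = False
        report["fd0_mode"] = None
    return report


if __name__ == "__main__":
    print(describe_stdin())
```

Calling it once directly and once from the function submitted to the executor, then comparing the two reports, should show whether the worker's stdin differs from the parent's.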
Code to reproduce the issue (given .txt due to GitHub limitations):
lokyissue.txt
OS: Ubuntu 18.04
Loky version: 2.6.0
Python version: 3.6.9

In case anyone is interested, it turns out that when loky spawns a new process, the stdin is not assigned anything (or is assigned None). As a result, when launching ffmpeg via subprocess, ffmpeg is fed garbage data on stdin. Setting the stdin to PIPE appears to have solved the problem.

Actually this causes a lot of struggle because it’s not clear where the problem is coming from. I was parsing audio data using joblib and some audio files (just a fraction) came back with 0 length. No such problem with other backends.
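For reference, the stdin fix described in this thread could look roughly like the sketch below. It is not the code from the attached file: the helper name run_with_explicit_stdin is hypothetical, and a Python one-liner stands in for ffmpeg so the example runs anywhere. The only point is that stdin is passed explicitly to subprocess.Popen instead of being inherited from the worker.

```python
import subprocess
import sys


def run_with_explicit_stdin(cmd, input_bytes=b""):
    """Run cmd with stdin, stdout and stderr all connected to fresh pipes."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,   # do not inherit whatever the worker has on fd 0
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() sends input_bytes, closes stdin, and drains both
    # output pipes, which avoids deadlocks on large outputs.
    out, err = proc.communicate(input_bytes)
    return proc.returncode, out, err


if __name__ == "__main__":
    # Stand-in child process; an ffmpeg command line would go here instead.
    code, out, err = run_with_explicit_stdin(
        [sys.executable, "-c", "print('hello')"]
    )
    print(code, len(out), len(err))
```

If the child never needs input, subprocess.DEVNULL is another standard value for stdin, though that variant was not tested in this thread.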