Memory file -> numpy -> memory file
Thanks for the good library. As mentioned in the title, I want to decode raw video bytes to numpy, process the frames, and then encode them back into raw video bytes. I succeeded in going from bytes to mp4 with the code below:
```python
import subprocess
import ffmpeg

args = (
    ffmpeg
    .input('pipe:')
    .output('pipe:')
    .get_args()
)
p = subprocess.Popen(['ffmpeg'] + args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
# input_data holds the source video bytes; output_data receives the result.
output_data = p.communicate(input=input_data)[0]
```
https://github.com/kkroening/ffmpeg-python/issues/49#issuecomment-355677082
That is fine when I don't need to process anything, but I want to process each frame like your tensorflow stream example. So I tried with two processes, but it doesn't work. The test below goes video (bytes) => numpy => video (file):
```python
def start_encode_process():
    # process1: decodes the in-memory video bytes (stdin) into raw RGB frames (stdout)
    logger.info('Starting ffmpeg process1')
    args = (
        ffmpeg
        .input('pipe:')
        .output('pipe:', format='rawvideo', pix_fmt='rgb24')
        .compile()
    )
    return subprocess.Popen(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)


def start_decode_process(out_filename, width, height):
    # process2: encodes raw RGB frames (stdin) into the output file
    logger.info('Starting ffmpeg process2')
    args = (
        ffmpeg
        .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height))
        .output(out_filename, pix_fmt='yuv420p')
        .overwrite_output()
        .compile()
    )
    return subprocess.Popen(args, stdin=subprocess.PIPE)
```
```python
process1.stdin.write(video_byte)

while True:
    in_frame = read_frame(process1, width, height)
    out_frame = process_frame_simple(in_frame)
    write_frame(process2, out_frame)
```
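Here `read_frame` and `write_frame` aren't defined above; they follow the helpers from the tensorflow stream example, roughly along these lines (a sketch, assuming rgb24 frames and `import numpy as np`):

```python
def read_frame(process1, width, height):
    # rgb24 = 3 bytes per pixel
    frame_size = width * height * 3
    in_bytes = process1.stdout.read(frame_size)
    if len(in_bytes) == 0:
        return None  # process1 has finished
    assert len(in_bytes) == frame_size
    return np.frombuffer(in_bytes, np.uint8).reshape([height, width, 3])


def write_frame(process2, frame):
    process2.stdin.write(frame.astype(np.uint8).tobytes())
```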
I think the first problem is stdin. The raw-byte example above uses the communicate method for stdin, but that method doesn't suit this case because I need to process frame by frame. Do you have any idea for this?
Thanks for reading.
Top GitHub Comments
Because you're reading and writing individual pipe file descriptors with a single python process, you're encountering a deadlock: the `process1.stdin.write` line blocks and waits until process1 is finished reading all the data that's written. Process1 reads a bit of data, does some work, then writes to its output stream. However, because nothing is reading from process1's output stream, process1 blocks and waits. So then both your python process and process1 are blocked, and no progress is made.

The reason it's a problem in this example but fine in the tensorflow example is that each of the ffmpeg processes in the tensorflow example only uses one pipe, whereas process1 here has both an input and an output pipe (and only a single python thread). This is a common issue when working with blocking pipes, regardless of choice of language.
Here are a few options I can think of off the top of my head:
Option 1: Use threads (or python multiprocessing, gevent, etc) so that one thread is responsible for pumping data into process1, and another thread that’s responsible for pumping data out of process1 and into process2. Both threads must be okay with being blocked.
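A minimal sketch of option 1, reusing the question's `start_encode_process`, `start_decode_process`, `read_frame`, `write_frame`, `process_frame_simple`, `video_byte`, `out_filename`, `width`, and `height` (all assumed defined as above, with `read_frame` returning `None` at EOF):

```python
import threading

def feed_input(process1, video_byte):
    # Runs in its own thread so the main thread is free to drain
    # process1's stdout; otherwise both sides block and deadlock.
    process1.stdin.write(video_byte)
    process1.stdin.close()  # EOF lets process1 finish decoding

process1 = start_encode_process()
process2 = start_decode_process(out_filename, width, height)

feeder = threading.Thread(target=feed_input, args=(process1, video_byte))
feeder.start()

while True:
    in_frame = read_frame(process1, width, height)
    if in_frame is None:
        break
    out_frame = process_frame_simple(in_frame)
    write_frame(process2, out_frame)

feeder.join()
process2.stdin.close()
process1.wait()
process2.wait()
```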
Option 2: Don’t use a stdin pipe for process1. If you don’t need to feed data to process1 from the same python process/thread that’s doing the in-memory numpy processing and can avoid doing so, then it gets a lot simpler because you avoid this deadlock scenario.
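A sketch of option 2: if the input bytes can be dumped to a temporary file first, process1 ends up with only one pipe (its stdout), so the single-threaded frame loop works unchanged. Hypothetical, assuming `video_byte` as above:

```python
import tempfile

# Write the in-memory bytes to a temp file so process1 reads its
# input from disk instead of a stdin pipe.
tmp = tempfile.NamedTemporaryFile(suffix='.mp4', delete=False)
tmp.write(video_byte)
tmp.close()

args = (
    ffmpeg
    .input(tmp.name)
    .output('pipe:', format='rawvideo', pix_fmt='rgb24')
    .compile()
)
process1 = subprocess.Popen(args, stdout=subprocess.PIPE)  # no stdin pipe
# (remember to remove tmp.name when done)
```

The `process1.stdin.write(video_byte)` line then goes away entirely.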
Option 3: Use non-blocking IO. Basically the `process1.stdin.write` gets replaced with a call that doesn't block, so that you avoid the deadlock. (Error handling and system-specific quirks can be really annoying here though, so YMMV; from my experience this ends up being the most complex/error-prone solution unless you have something like gevent do it for you, but in that case see option 2.)

Option 4: Run only one ffmpeg process at a time and use `subprocess.communicate`. The subprocess `communicate` method gets around deadlock issues with running a child process with both stdin+stdout pipes, but you'll need to have all the input data available beforehand, and then you'll have to process the output of process1 all at once, meaning the entire thing has to fit in memory. The processed data is then fed into process2. If you only have a few seconds of video and don't need it to run in realtime then this might be feasible (and pretty simple; see the sketch below), otherwise it's completely impractical.

Some related, potentially useful search terms / research topics:
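A rough sketch of option 4 (an illustration only; assumes the whole clip fits in memory and reuses `process_frame_simple`, `video_byte`, `width`, `height`, and `out_filename` from the question, plus `import numpy as np`):

```python
# Stage 1: decode everything in one shot; subprocess.run uses
# communicate() internally, so both pipes drain without deadlock.
decode_args = ['ffmpeg'] + (
    ffmpeg
    .input('pipe:')
    .output('pipe:', format='rawvideo', pix_fmt='rgb24')
    .get_args()
)
raw = subprocess.run(decode_args, input=video_byte, stdout=subprocess.PIPE).stdout
frames = np.frombuffer(raw, np.uint8).reshape([-1, height, width, 3])

# Process all frames in memory.
out = np.stack([process_frame_simple(f) for f in frames]).astype(np.uint8)

# Stage 2: encode the processed frames into the output file.
encode_args = ['ffmpeg'] + (
    ffmpeg
    .input('pipe:', format='rawvideo', pix_fmt='rgb24', s='{}x{}'.format(width, height))
    .output(out_filename, pix_fmt='yuv420p')
    .overwrite_output()
    .get_args()
)
subprocess.run(encode_args, input=out.tobytes())
```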
Sorry, the frame rate issue was caused by my own mistake. I was reading stdout twice; one read was just for a test and I forgot to delete it. Anyway, the frames and framerate don't have a problem!