Resource leak in multiprocessing

Hi, this is a follow-up to https://github.com/libvips/pyvips/issues/73

I implemented my patch extraction based on a process pool, as discussed there. The processes each create a VIPS image from the path (to a large file) and then extract patches from it.

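import numpy as np
import pyvips

def process_job(args):
    # imap_unordered passes each item as a single argument, so unpack the tuple
    i, large_file_path = args
    vips_img = pyvips.Image.new_from_file(large_file_path)
    # extract stuff from it (x, y, width, height are chosen per job)
    area = vips_img.extract_area(x, y, width, height)
    # write_to_memory() returns raw pixels; write_to_buffer() needs a format string
    a = np.ndarray(buffer=area.write_to_memory(), dtype=..., shape=...)
    # ...

iterable = ((i, 'large_image.tif') for i in range(1000))
pool.imap_unordered(process_job, iterable)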

This function is called many times for each image, because I’m using the pool to submit jobs to the worker processes, and each job only extracts a few patches at a time.

After running this for about 25 different images, the extraction silently fails: the array is filled with zeros or random values, probably because the underlying buffer is somehow not allocated. I can reproduce the problem, but it takes about an hour, and I haven’t found a way to shorten that. Just extracting from many images works fine; it seems to be an issue only when the job function is also called a very large number of times.

This makes me think that it is a synchronization problem, but I can’t figure out why, and the fact that I can reproduce it so reliably is strange as well (it fails at almost the same point every time).

So maybe it’s an issue with certain resources not being freed as they should be?

I’ve looked into the debug logs (with Python logging set to debug). They are very long, of course, but nothing strange seems to happen at the point where the extraction silently fails. The only sign of the failure is that, from that point on, a warning is issued every time the job is called (no resolution info for TIFF image …), whereas for the previous images this warning was only issued once per process start. I think this warning has been removed in recent versions. So something must be different in the underlying access to the images, but I can’t figure out what.
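Since the failure described above is silent, one way to catch it closer to the source is to check the raw pixel buffer before wrapping it in an array. The following is only a minimal sketch: `check_patch_buffer` and its parameters are hypothetical names, and it assumes the buffer is a bytes-like object holding `height * width * bands` samples. An all-zero patch can be legitimate (for example, a black background), so the return value is a hint, not proof of a failure.

```python
def check_patch_buffer(buf, height, width, bands, bytes_per_sample=1):
    """Sanity-check a raw pixel buffer before building an array from it.

    Raises ValueError if the size does not match the expected patch size.
    Returns False if every byte is zero (suspicious, but not always wrong),
    True otherwise.
    """
    data = bytes(buf)  # copy into bytes so len() and count() are available
    expected = height * width * bands * bytes_per_sample
    if len(data) != expected:
        raise ValueError(
            f"buffer has {len(data)} bytes, expected {expected}"
        )
    # A buffer made up only of zero bytes is worth logging.
    return data.count(0) != len(data)
```

Logging the job index and process ID whenever this returns False would show whether the bad patches start in one worker or in all of them at once.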

If you have a suggestion for how to improve my program (for example, sharing the VIPS images as a global inherited by the processes instead of sending the path and creating the image every time), I would be grateful for that too, of course!

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

MatthiasKohl commented, Nov 27, 2018 (1 reaction)

Thank you for the advice. Everything works as expected when starting vips in the child processes only, so I will close this issue.

jcupitt commented, Nov 17, 2018 (0 reactions)

Oop, accidental close.
