Copy Memory-mapped file into CuPy array


Would it be possible to copy the output of mmap.mmap into a CuPy array of type void? I want to read a binary file straight into GPU memory.

What I can do now:

  1. I can use the examples here and then use cp.asarray. That’s not optimal, so I’d like to read the binary straight into GPU memory.
  2. I’m able to do this with C++ here, but that would require Cython bindings and memory-management handling for the rest of our CuPy functions.

Would it be possible to implement something like #2 with CuPy?

Example 1. Copy data from file to mm as raw data. Then copy to CuPy array and cast?

import mmap

import cupy as cp

# filename and num_bytes are assumed defined; open in binary mode since
# the mapping is read-only.
with open(filename, "rb") as f:
    mm = mmap.mmap(
        f.fileno(),
        num_bytes,
        flags=mmap.MAP_PRIVATE,
        prot=mmap.PROT_READ,
    )

d_binary = cp.asarray(mm, dtype=cp.complex64)

Example 2. Copy data from file to mm as raw data. Then copy binary data to CuPy array? Once on the GPU, I can launch a kernel to reinterpret the data.

with open(filename, "rb") as f:
    mm = mmap.mmap(
        f.fileno(),
        num_bytes,
        flags=mmap.MAP_PRIVATE,
        prot=mmap.PROT_READ,
    )

d_binary = cp.asarray(mm)

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 17 (16 by maintainers)

Top GitHub Comments

jakirkham commented, Jun 23, 2020 (4 reactions)

It might be worth exploring different mmap flags as well, @mnicely.

In particular there are some MAP_HUGE* flags, which use larger page sizes, allowing the GPU to copy more data over to the device at once and perform fewer copies for the same total amount of data. NumPy does something similar for memory it allocates, which winds up being pretty useful.

Another interesting option is MAP_LOCKED. This would allow one to page lock all of the memory, which effectively is like making the page size the entire block of memory and not allowing the system to unpage it. Though the man page suggests using mlock if one really wants to avoid page faults (which we would). I don’t see a Python implementation of this, but it should be accessible through ctypes or Cython.

I would pick either hugepages or page locking. I don’t think these would make sense together (though anyone please feel free to correct me if I’m wrong here).

Also make sure to unmap files once done with them (by using Python context managers or explicitly calling .close()). Otherwise large amounts of page cache will remain occupied and degrade overall program and/or system performance.

Python may not include all of these flags. So it is possible one would need to look in the corresponding Linux header and figure out the values for these flags. Don’t forget to bitwise-or flags since these are just being passed straight to C, which would interpret them that way.

leofang commented, Jun 12, 2020 (2 reactions)

That’s exactly right. Zero-copy is the most common reason to use mmap. Another way to wrap a mmap with a NumPy array is to do this:

mm = mmap.mmap(...)
arr = np.ndarray(..., buffer=mm, ...)

Top Results From Across the Web

  • numpy - How to use CUDA pinned "zero-copy" memory for a ...
  • cupy.load — CuPy 11.4.0 documentation
  • The mmap() copy-on-write trick: reducing memory usage of ...
  • How to use CUDA pinned "zero-copy" memory for a ... - Reddit
  • chainer/develop-cupy - Gitter
