Support for `memmap` in `Quantity`

Description

If possible, I would like the Quantity object to support memmap arrays, meaning that a Quantity constructed from a memmap array remains memory mapped.

Additional context

It is possible that Quantity supports this, but it remains unclear. Right now if I run:

import numpy as np
import astropy.units as u

a = np.memmap('test', mode="w+", shape=(3,4))
q1 = u.Quantity(a, u.kg)
q2 = u.Quantity(a, u.kg, copy=False)

print(f'{isinstance(a, np.memmap)=}')
print(f'{isinstance(q1, np.memmap)=}')
print(f'{isinstance(q1.value, np.memmap)=}')
print(f'{isinstance(q2, np.memmap)=}')
print(f'{isinstance(q2.value, np.memmap)=}')

It returns

isinstance(a, np.memmap)=True
isinstance(q1, np.memmap)=False
isinstance(q1.value, np.memmap)=False
isinstance(q2, np.memmap)=False
isinstance(q2.value, np.memmap)=False

From the memmap documentation, this indicates to me that the data is no longer a memmap.

Basically, I would like a way to pass a memmap array to a Quantity and have that Quantity use the array data through the memmap (i.e. the data on disk, mapped via the memmap to the array, gets updated when I update the Quantity). This includes a way to interrogate the Quantity to confirm whether or not it is using a memmap.
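As a rough illustration of the kind of interrogation asked for here, the sketch below checks memory mapping by following an array's .base chain instead of relying on isinstance. The helper name is_memmap_backed is hypothetical (it is not part of NumPy or astropy), and a plain ndarray view stands in for an ndarray subclass built without a copy; the sketch does not check how Quantity itself sets .base.

```python
import mmap
import os
import tempfile

import numpy as np


def is_memmap_backed(arr):
    """Return True if arr, or any object in its .base chain, is memory mapped.

    Hypothetical helper for illustration only.
    """
    obj = arr
    while obj is not None:
        if isinstance(obj, (np.memmap, mmap.mmap)):
            return True
        # Objects without a .base attribute end the walk.
        obj = getattr(obj, "base", None)
    return False


path = os.path.join(tempfile.mkdtemp(), "test")
a = np.memmap(path, mode="w+", shape=(3, 4), dtype="f8")

# Stand-in for an array subclass that wraps `a` without copying.
v = a.view(np.ndarray)

print(f"{isinstance(v, np.memmap)=}")   # the check used in the question
print(f"{is_memmap_backed(v)=}")        # follows v.base back to the memmap
print(f"{np.shares_memory(v, a)=}")     # direct test for a shared buffer
print(f"{is_memmap_backed(np.zeros((3, 4)))=}")
```

Since Quantity is an ndarray subclass, the same two checks (the .base walk and np.shares_memory) could in principle be applied to q2 from the example above, but that is an assumption that would need to be confirmed against the installed astropy version.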

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
mhvk commented, Oct 20, 2022

Indeed, not immediately. I briefly tried whether it would work to define a new class QuantityMemmap(u.Quantity, np.memmap), but that in itself did not seem good enough. I have not investigated why, though I think in principle it should be possible to get it to work; it might need the kind of automatic class construction that was used for Distribution and Masked and which @nstarman has been wanting to generalize. With that, Quantity might also be able to wrap dask arrays, etc.

1 reaction
mhvk commented, Oct 20, 2022

In principle, it should be possible to make this work, since np.memmap is an ndarray subclass, but right now our use of ndarray is perhaps a bit too much built in. The suggestion you got from the documentation does work:

import numpy as np
import astropy.units as u

a = np.memmap('test', mode="w+", shape=(3,4), dtype='f8')
# Avoid myself making a mmap (should be quite easy, but just testing)
mm = np.ndarray(buffer=a.base, shape=(3, 4), dtype='f8')
# Build the Quantity on the plain ndarray that wraps the mmap buffer
q = u.Quantity(mm, u.m, copy=False)

# Check state of a
a
# memmap([[0., 0., 0., 0.],
#         [0., 0., 0., 0.],
#         [0., 0., 0., 0.]])
# Set an element in q
q[-1, -1] = 1*u.cm
# Check whether this propagates to a.
a
# memmap([[0.  , 0.  , 0.  , 0.  ],
#         [0.  , 0.  , 0.  , 0.  ],
#         [0.  , 0.  , 0.  , 0.01]])
# Check it is written to disk by exiting and logging in again
a = np.memmap('test', mode="r", shape=(3,4), dtype='f8')
# result is recovered
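The comment above sidesteps creating the mmap by hand. A minimal sketch of that route, using only the standard-library mmap module and NumPy, might look like the following. The temporary path, the variable names, and the choice to pre-size the file with zeros are assumptions made for illustration; the resulting arr is the kind of plain ndarray that could then be passed to u.Quantity(..., copy=False).

```python
import mmap
import os
import tempfile

import numpy as np

shape, dtype = (3, 4), np.dtype("f8")
nbytes = int(np.prod(shape)) * dtype.itemsize
path = os.path.join(tempfile.mkdtemp(), "test")

# mmap cannot map an empty file, so create a zero-filled file of the right size.
with open(path, "wb") as f:
    f.write(bytes(nbytes))

with open(path, "r+b") as f:
    buf = mmap.mmap(f.fileno(), nbytes)  # read/write shared mapping by default
    # Plain ndarray whose data lives in the mapped file.
    arr = np.ndarray(shape=shape, dtype=dtype, buffer=buf)
    arr[-1, -1] = 0.01
    buf.flush()   # push the change to disk
    del arr       # drop the array before closing the mapping
    buf.close()

# Read the file back independently of the mapping.
restored = np.fromfile(path, dtype=dtype).reshape(shape)
print(restored[-1, -1])
```

The array is deleted before the mapping is closed because an ndarray built this way does not keep the mapping usable on its own; accessing it after close() would be unsafe.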

