Use masked arrays while preserving int

See original GitHub issue

A great beauty of NumPy's masked arrays is that they work with any dtype, since they do not rely on NaN. Unfortunately, when I try to put my data into an xarray.Dataset, the ints are converted to floats, as shown below:

In [136]: import xarray as xr
     ...: from numpy import arange, ma

In [137]: x = arange(30, dtype="i1").reshape(3, 10)

In [138]: xr.Dataset({"count": (["x", "y"], ma.masked_where(x%5>3, x))}, coords={"x": range(3), "y":
     ...: range(10)})
Out[138]:
<xarray.Dataset>
Dimensions:  (x: 3, y: 10)
Coordinates:
  * y        (y) int64 0 1 2 3 4 5 6 7 8 9
  * x        (x) int64 0 1 2
Data variables:
    count    (x, y) float64 0.0 1.0 2.0 3.0 nan 5.0 6.0 7.0 8.0 nan 10.0 ...

This happens in the function _maybe_promote.

Such type “promotion” is unaffordable for me; the memory consumption of my multi-gigabyte arrays would explode by a factor of 4. Secondly, many of my integer-dtype fields are bit arrays, for which a floating-point representation is not desirable.
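To make the memory cost concrete, here is a small NumPy-only sketch using the same toy array as the example above. It assumes the masked array stores one byte of data plus one byte of boolean mask per element, which is one way to arrive at a factor of 4; real arrays with other integer widths will give a different ratio. Converting to float64 with NaN in the masked slots stands in for the promotion and is not xarray's actual code path.

```python
import numpy as np
from numpy import ma

# Same toy data as in the issue: int8 values, masked where x % 5 == 4.
x = np.arange(30, dtype="i1").reshape(3, 10)
masked = ma.masked_where(x % 5 > 3, x)

# Storage used by the masked array: the int8 data plus a full boolean mask.
masked_bytes = ma.getdata(masked).nbytes + ma.getmaskarray(masked).nbytes

# Stand-in for the promotion: cast to float64 and write NaN into masked slots.
promoted = masked.astype("f8").filled(np.nan)

ratio = promoted.nbytes / masked_bytes
print(masked.dtype, masked_bytes, promoted.dtype, promoted.nbytes, ratio)
# prints: int8 60 float64 240 4.0
```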

It would greatly benefit xarray if it could use masking while preserving the dtype of input data.

(See also: Stack Overflow question)

Issue Analytics

  • State: open
  • Created: 7 years ago
  • Reactions: 2
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

gerritholl commented, Jan 24, 2019 (4 reactions)

@max-sixty Interesting! I wonder what it would take to make use of this “nullable integer data type” in xarray. Converting it to a standard numpy array (da.values) while retaining the dtype wouldn’t work, but one could add a new .to_maskedarray() method that returns a numpy masked array; that would probably be easier than adding full support for masked arrays.
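Until something along those lines exists, one possible workaround is to store the values and the mask as two separate variables, so the integer dtype never passes through NaN. The sketch below is only an illustration under that assumption: the helper names `to_dataset_with_mask` and `to_masked_array` and the `count_mask` variable name are hypothetical and not part of xarray.

```python
import numpy as np
import xarray as xr
from numpy import ma


def to_dataset_with_mask(masked, dims, name):
    """Hypothetical helper: store the raw values and the mask side by side."""
    return xr.Dataset(
        {
            name: (dims, ma.getdata(masked)),  # keeps the original int dtype
            f"{name}_mask": (dims, ma.getmaskarray(masked)),  # boolean mask
        }
    )


def to_masked_array(ds, name):
    """Hypothetical helper: rebuild a numpy masked array from the two variables."""
    return ma.MaskedArray(ds[name].values, mask=ds[f"{name}_mask"].values)


x = np.arange(30, dtype="i1").reshape(3, 10)
masked = ma.masked_where(x % 5 > 3, x)

ds = to_dataset_with_mask(masked, ["x", "y"], "count")
restored = to_masked_array(ds, "count")
print(ds["count"].dtype, restored.dtype, int(restored.mask.sum()))
```

The trade-off is that xarray operations on `count` do not know about the mask, so reductions and arithmetic have to apply it explicitly, for example by rebuilding the masked array first.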
