question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

expand_dims() modifies numpy.ndarray.flags to write only, upon manually reverting this flag back, attempting to set a single inner value using .loc will instead set all of the inner array values

See original GitHub issue

I am using the newly updated expand_dims API that was recently updated with this PR https://github.com/pydata/xarray/pull/2757. However the flag setting behaviour can also be observed using the old API syntax.

>>> expanded_da = xr.DataArray(np.random.rand(3,3), coords={'x': np.arange(3), 'y': np.arange(3)}, dims=('x', 'y')) # Create a 2D DataArray
>>> expanded_da
<xarray.DataArray (x: 3, y: 3)>
array([[0.148579, 0.463005, 0.224993],
       [0.633511, 0.056746, 0.28119 ],
       [0.390596, 0.298519, 0.286853]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2

>>> expanded_da.data.flags # Check current state of numpy flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

>>> expanded_da.loc[0, 0] = 2.22 # Set a single value before expanding
>>> expanded_da # It works, the single value is set
<xarray.DataArray (x: 3, y: 3)>
array([[2.22    , 0.463005, 0.224993],
       [0.633511, 0.056746, 0.28119 ],
       [0.390596, 0.298519, 0.286853]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2

>>> expanded_da = expanded_da.expand_dims({'z': 3}, -1) # Add a new dimension 'z'
>>> expanded_da
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[2.22    , 2.22    , 2.22    ],
        [0.463005, 0.463005, 0.463005],
        [0.224993, 0.224993, 0.224993]],

       [[0.633511, 0.633511, 0.633511],
        [0.056746, 0.056746, 0.056746],
        [0.28119 , 0.28119 , 0.28119 ]],

       [[0.390596, 0.390596, 0.390596],
        [0.298519, 0.298519, 0.298519],
        [0.286853, 0.286853, 0.286853]]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2
Dimensions without coordinates: z

>>> expanded_da['z'] = np.arange(3) # Add new coordinates to the new dimension 'z'
>>> expanded_da
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[2.22    , 2.22    , 2.22    ],
        [0.463005, 0.463005, 0.463005],
        [0.224993, 0.224993, 0.224993]],

       [[0.633511, 0.633511, 0.633511],
        [0.056746, 0.056746, 0.056746],
        [0.28119 , 0.28119 , 0.28119 ]],

       [[0.390596, 0.390596, 0.390596],
        [0.298519, 0.298519, 0.298519],
        [0.286853, 0.286853, 0.286853]]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2
  * z        (z) int64 0 1 2

>>> expanded_da.loc[0, 0, 0] = 9.99 # Attempt to set a single value, get 'read-only' error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dhemming/.ve/unidata_notebooks/lib/python3.6/site-packages/xarray/core/dataarray.py", line 113, in __setitem__
    self.data_array[pos_indexers] = value
  File "/Users/dhemming/.ve/unidata_notebooks/lib/python3.6/site-packages/xarray/core/dataarray.py", line 494, in __setitem__
    self.variable[key] = value
  File "/Users/dhemming/.ve/unidata_notebooks/lib/python3.6/site-packages/xarray/core/variable.py", line 714, in __setitem__
    indexable[index_tuple] = value
  File "/Users/dhemming/.ve/unidata_notebooks/lib/python3.6/site-packages/xarray/core/indexing.py", line 1174, in __setitem__
    array[key] = value
ValueError: assignment destination is read-only

>>> expanded_da.data.flags # Check flags on the DataArray, notice they have changed
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : False
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

>>> expanded_da.data.setflags(write = 1) # Make array writeable again
>>> expanded_da.data.flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

>>> expanded_da.loc[0, 0, 0] # Check the value I want to overwrite
<xarray.DataArray ()>
array(2.22)
Coordinates:
    x        int64 0
    y        int64 0
    z        int64 0

>>> expanded_da.loc[0, 0, 0] = 9.99 # Attempt to overwrite single value, instead it overwrites all values in the array located at [0, 0]
>>> expanded_da
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[9.99    , 9.99    , 9.99    ],
        [0.463005, 0.463005, 0.463005],
        [0.224993, 0.224993, 0.224993]],

       [[0.633511, 0.633511, 0.633511],
        [0.056746, 0.056746, 0.056746],
        [0.28119 , 0.28119 , 0.28119 ]],

       [[0.390596, 0.390596, 0.390596],
        [0.298519, 0.298519, 0.298519],
        [0.286853, 0.286853, 0.286853]]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2
  * z        (z) int64 0 1 2

Problem description

When applying the operation ‘expand_dims({‘z’: 3}, -1)’ on a DataArray the underlying Numpy array flags are changed. ‘C_CONTIGUOUS’ is set to False, and ‘WRITEABLE’ is set to False, and ‘OWNDATA’ is set to False. Upon changing ‘WRITEABLE’ back to True, when I try to set a single value in the DataArray using the ‘.loc’ operator it will instead set all the values in that selected inner array.

I am new to Xarray so I can’t be entirely sure if this expected behaviour. Regardless I would expect that adding a new dimension to the array would not make that array ‘read-only’. I would also not expect the ‘.loc’ method to work differently to how it would otherwise.

It’s also not congruent with the Numpy ‘expand_dims’ operation. Because when I call the operation ‘np.expand_dims(np_arr, axis=-1)’ the 'C_CONTIGUOUS ’ and 'WRITEABLE ’ flags will not be modified.

Expected Output

Here is a similar flow of operations that demonstrates the behaviour I would expect from the DataArray after applying ‘expand_dims’:

>>> non_expanded_da = xr.DataArray(np.random.rand(3,3,3), coords={'x': np.arange(3), 'y': np.arange(3)}, dims=('x', 'y', 'z')) # Create the new DataArray to be in the same state as I would expect it to be in after applying the operation 'expand_dims({'z': 3}, -1)'
>>> non_expanded_da
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[0.017221, 0.374267, 0.231979],
        [0.678884, 0.512903, 0.737573],
        [0.985872, 0.1373  , 0.4603  ]],

       [[0.764227, 0.825059, 0.847694],
        [0.482841, 0.708206, 0.486576],
        [0.726265, 0.860627, 0.435101]],

       [[0.117904, 0.40569 , 0.274288],
        [0.079321, 0.647562, 0.847459],
        [0.57494 , 0.578745, 0.125309]]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2
Dimensions without coordinates: z

>>> non_expanded_da.data.flags # Check flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

>>> non_expanded_da['z'] = np.arange(3) # Set coordinate for dimension 'z'
>>> non_expanded_da
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[0.017221, 0.374267, 0.231979],
        [0.678884, 0.512903, 0.737573],
        [0.985872, 0.1373  , 0.4603  ]],

       [[0.764227, 0.825059, 0.847694],
        [0.482841, 0.708206, 0.486576],
        [0.726265, 0.860627, 0.435101]],

       [[0.117904, 0.40569 , 0.274288],
        [0.079321, 0.647562, 0.847459],
        [0.57494 , 0.578745, 0.125309]]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2
  * z        (z) int64 0 1 2

>>> non_expanded_da.loc[0, 0, 0] = 2.22 # Set value using .loc method
>>> non_expanded_da # The single value referenced is set which is what I expect to happen
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[2.22    , 0.374267, 0.231979],
        [0.678884, 0.512903, 0.737573],
        [0.985872, 0.1373  , 0.4603  ]],

       [[0.764227, 0.825059, 0.847694],
        [0.482841, 0.708206, 0.486576],
        [0.726265, 0.860627, 0.435101]],

       [[0.117904, 0.40569 , 0.274288],
        [0.079321, 0.647562, 0.847459],
        [0.57494 , 0.578745, 0.125309]]])
Coordinates:
  * x        (x) int64 0 1 2
  * y        (y) int64 0 1 2
  * z        (z) int64 0 1 2

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.7 (default, Dec 29 2018, 12:05:36) [GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)] python-bits: 64 OS: Darwin OS-release: 18.2.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_AU.UTF-8 LOCALE: en_AU.UTF-8 libhdf5: 1.10.2 libnetcdf: 4.4.1.1

xarray: 0.12.1 pandas: 0.24.2 numpy: 1.16.2 scipy: 1.2.1 netCDF4: 1.5.0 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.0.3.4 nc_time_axis: None PseudonetCDF: None rasterio: None cfgrib: 0.9.6.1.post1 iris: None bottleneck: None dask: None distributed: None matplotlib: 3.0.3 cartopy: 0.17.0 seaborn: None setuptools: 39.0.1 pip: 10.0.1 conda: None pytest: None IPython: None sphinx: None

Issue Analytics

  • State:closed
  • Created 4 years ago
  • Comments:6 (2 by maintainers)

github_iconTop GitHub Comments

2reactions
shoyercommented, Apr 13, 2019

As you’ve noticed, these arrays are “read only” because otherwise indexing can modify more than the original values, e.g., consider:

original = xr.DataArray(np.zeros(3), dims='x')
result = original.expand_dims(y=2)
result.data.flags.writeable = True
result[0, 0] = 1
print(result)

Both “y” values were set to 1!

<xarray.DataArray (y: 2, x: 3)>
array([[1., 0., 0.],
       [1., 0., 0.]])
Dimensions without coordinates: y, x

The work around is to call .copy() on the array after calling expand_dims(), e.g.,

original = xr.DataArray(np.zeros(3), dims='x')
result = original.expand_dims(y=2).copy()
result[0, 0] = 1
print(result)

Now the correct result is printed:

<xarray.DataArray (y: 2, x: 3)>
array([[1., 0., 0.],
       [0., 0., 0.]])
Dimensions without coordinates: y, x

This is indeed intended behavior: by making the result read-only, we can expand dimensions without copying the original data, and without needing to worry about indexing modifying the wrong values.

That said, we could certainly improve the usability of this feature in xarray. Some options:

  • Mention the work-around of using .copy() in the error message xarray prints when an array is read-only.
  • Add a copy argument to expand_dims, so users can write copy=True if they want a writeable result.
  • Consider changing the behavior when a dimension is inserted with size 1 to make the result writeable – in this case, individual elements of result can be modified (which will also modify the input array). But maybe it would be surprising to users if the result of expand_dims() is sometimes but not always writeable?
1reaction
floogitcommented, Apr 17, 2019

I have just realized that .copy(deep=True) is a possible fix for datasets.

Read more comments on GitHub >

github_iconTop Results From Across the Web

numpy.ndarray.flags — NumPy v1.24 Manual
The array flags cannot be set arbitrarily: WRITEBACKIFCOPY can only be set False . ALIGNED can only be set True if the data ......
Read more >
Add new dimensions to ndarray (np.newaxis, np.expand_dims)
You can add new dimensions to a NumPy array ndarray (= unsqueeze a NumPy array) with np.newaxis , np.expand_dims() and np.reshape() (or ...
Read more >
Debugging Numpy VisibleDeprecationWarning (ndarray from ...
I'd like to figure out which call resulted in VisibleDeprecationWarning in my own code or a call from Pandas. How would I be...
Read more >
jax.numpy package - JAX documentation - Read the Docs
JAX versions of such functions will return copies instead, although such are often optimized away by ... Assemble an nd-array from nested lists...
Read more >
Browse Python Technical Problem Clusters - Code Grepper
Get code examples instantly right from your google search results with the Grepper Chrome Extension.
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found