Sort DataArray by data values along one dim

See original GitHub issue

.sortby() only supports sorting a DataArray by coordinate values. I’m trying to sort one DataArray (cld) by its data values along one dim, and then sort another DataArray (pair) in the same order.

MCVE Code Sample

import xarray as xr
import numpy as np

x = 4
y = 2
z = 4
data = np.arange(x*y*z).reshape(z, y, x)

# 3d array with coords
cld_1 = xr.DataArray(data, dims=['z', 'y', 'x'], coords={'z': np.arange(z)})

# 2d array without coords
cld_2 = xr.DataArray(np.arange(x*y).reshape(y, x)*1.5+1, dims=['y', 'x'])

# expand 2d to 3d
cld_2 = cld_2.expand_dims(z=[4])

# concat
cld = xr.concat([cld_1, cld_2], dim='z')

# paired array
pair = cld.copy(data=np.arange(x*y*(z+1)).reshape(z+1, y, x))

print(cld)
print(pair)

Output

<xarray.DataArray (z: 5, y: 2, x: 4)>
array([[[ 0. ,  1. ,  2. ,  3. ],
        [ 4. ,  5. ,  6. ,  7. ]],

       [[ 8. ,  9. , 10. , 11. ],
        [12. , 13. , 14. , 15. ]],

       [[16. , 17. , 18. , 19. ],
        [20. , 21. , 22. , 23. ]],

       [[24. , 25. , 26. , 27. ],
        [28. , 29. , 30. , 31. ]],

       [[ 1. ,  2.5,  4. ,  5.5],
        [ 7. ,  8.5, 10. , 11.5]]])
Coordinates:
  * z        (z) int64 0 1 2 3 4
Dimensions without coordinates: y, x

<xarray.DataArray (z: 5, y: 2, x: 4)>
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23]],

       [[24, 25, 26, 27],
        [28, 29, 30, 31]],

       [[32, 33, 34, 35],
        [36, 37, 38, 39]]])
Coordinates:
  * z        (z) int64 0 1 2 3 4
Dimensions without coordinates: y, x

Problem Description

I’ve tried argsort(): cld.argsort(axis=0), but the result is wrong:

<xarray.DataArray (z: 5, y: 2, x: 4)>
array([[[0, 0, 0, 0],
        [0, 0, 0, 0]],

       [[4, 4, 4, 4],
        [4, 4, 4, 4]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[2, 2, 2, 2],
        [2, 2, 2, 2]],

       [[3, 3, 3, 3],
        [3, 3, 3, 3]]], dtype=int64)
Coordinates:
  * z        (z) int64 0 1 2 3 4
Dimensions without coordinates: y, x
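
For context, argsort returns, for every (y, x) position, the indices along z that would sort the values, not the sorted values themselves. The sketch below uses plain NumPy arrays holding the same values as cld (the variable names are illustrative and not from the issue) and shows one way to turn those indices into sorted data with np.take_along_axis.

```python
import numpy as np

# Same values as `cld` above, built as a plain ndarray (illustrative names).
base = np.arange(32, dtype=float).reshape(4, 2, 4)
extra = (np.arange(8).reshape(2, 4) * 1.5 + 1)[np.newaxis]
cld_values = np.concatenate([base, extra], axis=0)  # shape (5, 2, 4)

# Indices along axis 0 (z) that would sort each (y, x) column.
order = np.argsort(cld_values, axis=0)

# Applying those indices along the same axis yields the sorted values.
sorted_cld = np.take_along_axis(cld_values, order, axis=0)

print(order[:, 0, 0])       # order along z for the column at y=0, x=0
print(sorted_cld[:, 0, 0])  # that column's values in ascending order
```

In this data every column happens to share the same order, which is why each z-slice of the argsort result is filled with a single index.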

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 10

Top GitHub Comments

1 reaction
zxdawn commented, Oct 13, 2020

@JavierRuano I found a simpler solution in a similar question on Stack Overflow.

sort_pair = np.take_along_axis(pair.values, cld.argsort(axis=0), axis=0)

Complete example

import xarray as xr
import numpy as np

x = 4
y = 2
z = 4

data = np.arange(x*y*z).reshape(z, y, x)

# 3d array with coords
cld_1 = xr.DataArray(data, dims=['z', 'y', 'x'], coords={'z': np.arange(z)})

# 2d array without coords
cld_2 = xr.DataArray(np.arange(x*y).reshape(y, x)*1.5+1, dims=['y', 'x'])

# expand 2d to 3d
cld_2 = cld_2.expand_dims(z=[4])

# concat
cld = xr.concat([cld_1, cld_2], dim='z')

# paired array
pair = cld.copy(data=np.arange(x*y*(z+1)).reshape(z+1, y, x))


sort_pair = np.take_along_axis(pair.values, cld.argsort(axis=0), axis=0)

print(cld)
print(pair)
print(sort_pair)

Output:

<xarray.DataArray (z: 5, y: 2, x: 4)>
array([[[ 0. ,  1. ,  2. ,  3. ],
        [ 4. ,  5. ,  6. ,  7. ]],

       [[ 8. ,  9. , 10. , 11. ],
        [12. , 13. , 14. , 15. ]],

       [[16. , 17. , 18. , 19. ],
        [20. , 21. , 22. , 23. ]],

       [[24. , 25. , 26. , 27. ],
        [28. , 29. , 30. , 31. ]],

       [[ 1. ,  2.5,  4. ,  5.5],
        [ 7. ,  8.5, 10. , 11.5]]])
Coordinates:
  * z        (z) int64 0 1 2 3 4
Dimensions without coordinates: y, x
<xarray.DataArray (z: 5, y: 2, x: 4)>
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23]],

       [[24, 25, 26, 27],
        [28, 29, 30, 31]],

       [[32, 33, 34, 35],
        [36, 37, 38, 39]]])
Coordinates:
  * z        (z) int64 0 1 2 3 4
Dimensions without coordinates: y, x
[[[ 0  1  2  3]
  [ 4  5  6  7]]

 [[32 33 34 35]
  [36 37 38 39]]

 [[ 8  9 10 11]
  [12 13 14 15]]

 [[16 17 18 19]
  [20 21 22 23]]

 [[24 25 26 27]
  [28 29 30 31]]]

Note that I have to use pair.values instead of pair in the last sorting step; otherwise, I get this error:

IndexError: Unlabeled multi-dimensional array cannot be used for indexing: y
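
Because np.take_along_axis returns a bare ndarray here, the dimension names are lost. One possible way to keep them is to wrap the result in a new DataArray, as sketched below. The sketch assumes the original z labels should be dropped, since after a per-column sort a single z label no longer describes one slice. The small arrays are hypothetical stand-ins for cld and pair, not the data from the issue.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-ins for `cld` and `pair` (smaller, same dims).
cld = xr.DataArray(
    np.array([[[3.0, 1.0]], [[1.0, 2.0]], [[2.0, 3.0]]]),
    dims=["z", "y", "x"],
    coords={"z": np.arange(3)},
)
pair = cld.copy(data=np.arange(6).reshape(3, 1, 2))

# Use plain ndarrays; passing the DataArray itself raised the IndexError above.
order = np.argsort(cld.values, axis=0)
sorted_values = np.take_along_axis(pair.values, order, axis=0)

# Re-attach the dimension names; the z coordinate is intentionally omitted.
sort_pair = xr.DataArray(sorted_values, dims=pair.dims)
print(sort_pair)
```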
1 reaction
JavierRuano commented, Apr 9, 2020

You could access the data directly as an ndarray, or convert the DataArray into a pandas DataFrame; pandas has sort_values. You asked about sorting values along z, and that order is shown in the z index.

With several DataArrays, you could read about the Dataset concept…

But I don’t develop xarray; I am only a user of that module, so perhaps you are looking for a different kind of answer.

http://xarray.pydata.org/en/stable/generated/xarray.Dataset.sortby.html according to values of 1-D dataarrays that share dimension with calling object.

On Thu, Apr 9, 2020 at 4:22, Xin Zhang notifications@github.com wrote:

@JavierRuano https://github.com/JavierRuano Thank you very much. This example is a special case. If the order of z is different for each x and y, do we need to create a temporary DataArray to save the result of looping over x and y?
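
On the question quoted above: with the take_along_axis approach, no temporary array or explicit loop over x and y appears to be needed, since argsort along z computes a separate order for every (y, x) column. The sketch below uses small made-up arrays (names and values are illustrative only) and compares the vectorized result with an explicit loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: the order along z (axis 0) differs from column to column.
keys = rng.random((5, 2, 4))                     # plays the role of `cld`
paired = np.arange(5 * 2 * 4).reshape(5, 2, 4)   # plays the role of `pair`

# Vectorized: one order per (y, x) column, applied in a single call.
order = np.argsort(keys, axis=0)
vectorized = np.take_along_axis(paired, order, axis=0)

# Reference: the explicit loop over y and x that the question describes.
looped = np.empty_like(paired)
for j in range(keys.shape[1]):
    for i in range(keys.shape[2]):
        looped[:, j, i] = paired[np.argsort(keys[:, j, i]), j, i]

# Compare the two results element by element.
print(np.array_equal(vectorized, looped))
```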

