open_mfdataset fails on variable attributes with 'list' type
Using open_mfdataset on a series of netCDF files whose variable attributes are of type list
fails with the following exception when these attributes have different values from one file to another:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
    ncf = xarray.open_mfdataset(files)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/backends/api.py", line 658, in open_mfdataset
    ids=ids)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 553, in _auto_combine
    data_vars=data_vars, coords=coords)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 474, in _combine_nd
    compat=compat)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 492, in _auto_combine_all_along_first_dim
    data_vars, coords)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 510, in _auto_combine_1d
    for id, ds_group in grouped_by_vars]
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 368, in _auto_concat
    return concat(datasets, dim=dim, data_vars=data_vars, coords=coords)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 122, in concat
    return f(objs, dim, data_vars, coords, compat, positions)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 307, in _dataset_concat
    combined = concat_vars(vars, dim, positions)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/variable.py", line 1982, in concat
    return Variable.concat(variables, dim, positions, shortcut)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/variable.py", line 1433, in concat
    utils.remove_incompatible_items(attrs, var.attrs)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/utils.py", line 184, in remove_incompatible_items
    not compat(first_dict[k], second_dict[k]))):
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/utils.py", line 133, in equivalent
    (pd.isnull(first) and pd.isnull(second)))
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
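The last frame of this traceback can likely be reproduced without any netCDF file. The sketch below is only an illustration: the helper name scalar_branch and the two file-name lists are hypothetical, and the expression is copied from the final frame (utils.py, in equivalent). It relies on pd.isnull returning an element-wise boolean array when given a list.

```python
import pandas as pd

# Hypothetical attribute values, standing in for two files of a series.
first = ["ice_conc_nh_a.nc", "ice_conc_sh_a.nc"]
second = ["ice_conc_nh_b.nc", "ice_conc_sh_b.nc"]


def scalar_branch(first, second):
    """Same expression as the last traceback frame (hypothetical wrapper)."""
    return ((first is second) or
            (first == second) or
            (pd.isnull(first) and pd.isnull(second)))


try:
    # The lists differ, so evaluation reaches pd.isnull(first), which returns
    # a two-element boolean array; `and` then asks for its truth value.
    scalar_branch(first, second)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```

When the two lists are equal, first == second short-circuits the expression and no error is raised, which matches the observation that the failure only appears when the attribute differs between files.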
An example of such a variable is provided below:
double sea_ice_fraction(time) ;
    sea_ice_fraction:least_significant_digit = 2LL ;
    sea_ice_fraction:_FillValue = 1.e+20 ;
    sea_ice_fraction:long_name = "sea ice fraction" ;
    sea_ice_fraction:standard_name = "sea_ice_fraction" ;
    sea_ice_fraction:authority = "CF 1.7" ;
    sea_ice_fraction:units = "1" ;
    sea_ice_fraction:coverage_content_type = "auxiliaryInformation" ;
    sea_ice_fraction:coordinates = "time lon lat" ;
    sea_ice_fraction:source = "CCI Sea Ice" ;
    sea_ice_fraction:institution = "ESA" ;
    string sea_ice_fraction:source_files = "ice_conc_nh_ease2-250_cdr-v2p0_199912011200.nc", "ice_conc_sh_ease2-250_cdr-v2p0_199912011200.nc" ;
The exception occurs when the source_files
attribute has different values across the files of the time series I am trying to concatenate. To avoid it, I had to use the preprocess
argument to remove this attribute first.
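A minimal sketch of that workaround, assuming the offending attribute is named source_files as in the example above. The helper names drop_source_files and open_series are hypothetical; preprocess is the open_mfdataset argument mentioned above, which is called on each dataset before the files are combined.

```python
def drop_source_files(ds):
    """Remove the list-valued attribute from every variable of one dataset.

    Hypothetical helper: the attribute name is taken from the example above.
    """
    for var in ds.variables.values():
        # pop with a default, so variables without the attribute are ignored
        var.attrs.pop("source_files", None)
    return ds


def open_series(files):
    """Open the file series with the attribute stripped from each file."""
    import xarray  # imported lazily; only needed when files are actually opened

    return xarray.open_mfdataset(files, preprocess=drop_source_files)
```

The cost of this approach is that the source_files information is lost in the combined dataset, so it is a stopgap rather than a fix.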
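This is caused by the equivalent
function in xarray/core/utils.py, which does not account for this case. When two list values differ, first == second is False and evaluation falls through to pd.isnull(first), which returns an element-wise boolean array for a list; the truth value of that array is ambiguous: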
def equivalent(first, second):
    """Compare two objects for equivalence (identity or equality), using
    array_equiv if either object is an ndarray
    """
    # TODO: refactor to avoid circular import
    from . import duck_array_ops
    if isinstance(first, np.ndarray) or isinstance(second, np.ndarray):
        return duck_array_ops.array_equiv(first, second)
    else:
        return ((first is second) or
                (first == second) or
                (pd.isnull(first) and pd.isnull(second)))
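One possible way to handle this case is to compare lists item by item before reaching the pd.isnull branch. The following is only a sketch under stated assumptions, not necessarily the change adopted in xarray: list_equiv is a hypothetical helper, and np.array_equal stands in for duck_array_ops.array_equiv so that the example runs outside the xarray package (the two may treat NaN differently).

```python
import numpy as np
import pandas as pd


def list_equiv(first, second):
    """Compare two lists element by element (hypothetical helper)."""
    if len(first) != len(second):
        return False
    return all(equivalent(f, s) for f, s in zip(first, second))


def equivalent(first, second):
    """List-aware variant of the function quoted above (a sketch)."""
    if isinstance(first, np.ndarray) or isinstance(second, np.ndarray):
        # Stand-in for duck_array_ops.array_equiv (assumption, see lead-in).
        return np.array_equal(np.asarray(first), np.asarray(second))
    if isinstance(first, list) or isinstance(second, list):
        # A list never equals a non-list; two lists are compared item by item,
        # so pd.isnull is only ever called on the individual items.
        if not (isinstance(first, list) and isinstance(second, list)):
            return False
        return list_equiv(first, second)
    return bool((first is second) or
                (first == second) or
                (pd.isnull(first) and pd.isnull(second)))
```

With this variant, two differing source_files lists compare as not equivalent instead of raising, so remove_incompatible_items could drop the attribute from the concatenated variable.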
Top GitHub Comments
@HasanAhmadQ7 You can have attributes of type list. See for instance this code:
Having files with different values for the test_attr attribute will cause open_mfdataset to fail.
Thanks for addressing this!
Thanks @HasanAhmadQ7, that would be fantastic!