open_mfdataset fails on variable attributes with 'list' type

Using open_mfdataset on a series of netCDF files whose variables have attributes of type list fails with the following exception when these attributes have different values from one file to another:

    ncf = xarray.open_mfdataset(files)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/backends/api.py", line 658, in open_mfdataset
    ids=ids)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 553, in _auto_combine
    data_vars=data_vars, coords=coords)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 474, in _combine_nd
    compat=compat)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 492, in _auto_combine_all_along_first_dim
    data_vars, coords)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 510, in _auto_combine_1d
    for id, ds_group in grouped_by_vars]
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 368, in _auto_concat
    return concat(datasets, dim=dim, data_vars=data_vars, coords=coords)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 122, in concat
    return f(objs, dim, data_vars, coords, compat, positions)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/combine.py", line 307, in _dataset_concat
    combined = concat_vars(vars, dim, positions)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/variable.py", line 1982, in concat
    return Variable.concat(variables, dim, positions, shortcut)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/variable.py", line 1433, in concat
    utils.remove_incompatible_items(attrs, var.attrs)
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/utils.py", line 184, in remove_incompatible_items
    not compat(first_dict[k], second_dict[k]))):
  File "/home/ananda/jfpiolle/miniconda2/envs/cerbere/lib/python2.7/site-packages/xarray/core/utils.py", line 133, in equivalent
    (pd.isnull(first) and pd.isnull(second)))
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

An example of such a variable is provided below:

	double sea_ice_fraction(time) ;
		sea_ice_fraction:least_significant_digit = 2LL ;
		sea_ice_fraction:_FillValue = 1.e+20 ;
		sea_ice_fraction:long_name = "sea ice fraction" ;
		sea_ice_fraction:standard_name = "sea_ice_fraction" ;
		sea_ice_fraction:authority = "CF 1.7" ;
		sea_ice_fraction:units = "1" ;
		sea_ice_fraction:coverage_content_type = "auxiliaryInformation" ;
		sea_ice_fraction:coordinates = "time lon lat" ;
		sea_ice_fraction:source = "CCI Sea Ice" ;
		sea_ice_fraction:institution = "ESA" ;
		string sea_ice_fraction:source_files = "ice_conc_nh_ease2-250_cdr-v2p0_199912011200.nc", "ice_conc_sh_ease2-250_cdr-v2p0_199912011200.nc" ;

The exception occurs when the source_files attribute has different values across the files of the time series I am trying to concatenate. To avoid it, I had to use the preprocess argument to remove this attribute first.
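A minimal sketch of that workaround, assuming the attribute to drop is called source_files as in the dump above. The helper names drop_variable_attr and open_without_attr are hypothetical and not part of xarray; only open_mfdataset and its preprocess argument come from the library. xarray is imported inside the second function so that the attribute-stripping helper can be tried on its own.

```python
from functools import partial


def drop_variable_attr(ds, attr_name):
    """Remove `attr_name` from every variable of `ds` (in place) and return `ds`.

    Hypothetical helper: it only relies on `ds.variables` being a mapping
    whose values expose a mutable `attrs` dict, as xarray datasets do.
    """
    for var in ds.variables.values():
        # pop with a default so variables lacking the attribute are skipped
        var.attrs.pop(attr_name, None)
    return ds


def open_without_attr(files, attr_name="source_files"):
    """Open `files` with open_mfdataset, stripping `attr_name` from each file first."""
    import xarray as xr  # local import: only needed when files are actually opened

    # preprocess is called on each dataset before they are combined
    return xr.open_mfdataset(
        files, preprocess=partial(drop_variable_attr, attr_name=attr_name)
    )
```

Calling open_without_attr(files) in place of xarray.open_mfdataset(files) should then skip the comparison of the list-valued attribute, at the cost of losing that attribute in the combined dataset.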

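This is caused by the equivalent function in xarray/core/utils.py, which does not account for this case. When two list attributes differ, first == second is False and evaluation reaches pd.isnull(first) and pd.isnull(second); pd.isnull returns an element-wise boolean array for a list, and using a multi-element array as a truth value raises the ValueError: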

def equivalent(first, second):
    """Compare two objects for equivalence (identity or equality), using
    array_equiv if either object is an ndarray
    """
    # TODO: refactor to avoid circular import
    from . import duck_array_ops
    if isinstance(first, np.ndarray) or isinstance(second, np.ndarray):
        return duck_array_ops.array_equiv(first, second)
    else:
        return ((first is second) or
                (first == second) or
                (pd.isnull(first) and pd.isnull(second)))
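One possible direction for a fix is to compare list-valued attributes element by element before falling back to the scalar comparison. The sketch below is a dependency-free illustration of that idea, not xarray's actual code: the ndarray branch is omitted, _is_null is a simplified stand-in for pd.isnull on scalars, and list_equiv is a hypothetical helper name.

```python
import math


def _is_null(value):
    """Simplified stand-in for pd.isnull on scalars: None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def list_equiv(first, second):
    """Return True if both sequences have equal length and equivalent items."""
    if len(first) != len(second):
        return False
    return all(equivalent(f, s) for f, s in zip(first, second))


def equivalent(first, second):
    """Compare two attribute values, treating lists and tuples element-wise."""
    first_is_seq = isinstance(first, (list, tuple))
    second_is_seq = isinstance(second, (list, tuple))
    if first_is_seq or second_is_seq:
        # a sequence is never equivalent to a non-sequence
        if not (first_is_seq and second_is_seq):
            return False
        return list_equiv(first, second)
    # scalar case: identity, equality, or both null
    return bool(
        (first is second)
        or (first == second)
        or (_is_null(first) and _is_null(second))
    )
```

With this shape, two differing lists such as ["a.nc", "b.nc"] and ["a.nc", "c.nc"] compare as not equivalent instead of raising, so the incompatible attribute can simply be dropped during concatenation.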

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments:6 (5 by maintainers)

Top GitHub Comments

jfpiolle commented, Jul 31, 2019 (1 reaction)

@HasanAhmadQ7 You can have attributes of type list. See for instance this code:

from netCDF4 import Dataset
import numpy

# create a small netCDF file with one dimension and one variable
f = Dataset("example.nc", "w")
x = f.createDimension("x", 3)
# dimensions are passed as a tuple, hence the trailing comma
vlvar = f.createVariable("test_var", numpy.int32, ("x",))

# here create an attribute as a list
vlvar.test_attr = ["string a", "string b"]

vlvar[:] = numpy.arange(3)
f.close()

Having files with different values for the test_attr attribute will cause open_mfdataset to fail.

Thanks for addressing this!

shoyer commented, Jul 23, 2019 (1 reaction)

Thanks @HasanAhmadQ7, that would be fantastic!
