open_mfdataset cannot open multiple netcdf files written by NASA/GEOS MAPL v1.0.0 that contain data on a cubed-sphere grid
MCVE Code Sample
First download these files:
- http://ftp.as.harvard.edu/gcgrid/geos-chem/1mo_benchmarks/GC_12/12.5.0/GCHP/diagnostics/GCHP.AerosolMass.20160716_1200z.nc4
- http://ftp.as.harvard.edu/gcgrid/geos-chem/1mo_benchmarks/GC_12/12.5.0/GCHP/diagnostics/GCHP.SpeciesConc.20160716_1200z.nc4
Then run this code:
import xarray as xr
filelist = ['GCHP.SpeciesConc.20160716_1200z.nc4', 'GCHP.AerosolMass.20160716_1200z.nc4']
ds = xr.open_mfdataset(filelist)
print(ds)
Expected Output
This should load data from both files into a single xarray Dataset object and print its contents.
Problem Description
Instead, this error occurs:
File "./run_1mo_benchmark.py", line 479, in <module>
ds = xr.open_mfdataset(filelist)
File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/backends/api.py", line 719, in open_mfdataset
ids=ids)
File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/combine.py", line 553, in _auto_combine
data_vars=data_vars, coords=coords)
File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/combine.py", line 475, in _combine_nd
compat=compat)
File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/combine.py", line 493, in _auto_combine_all_along_first_dim
data_vars, coords)
File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/combine.py", line 514, in _auto_combine_1d
merged = merge(concatenated, compat=compat)
File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/merge.py", line 532, in merge
variables, coord_names, dims = merge_core(dict_like_objects, compat, join)
File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/merge.py", line 451, in merge_core
variables = merge_variables(expanded, priority_vars, compat=compat)
File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/merge.py", line 170, in merge_variables
merged[name] = unique_variable(name, var_list, compat)
File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/merge.py", line 90, in unique_variable
% (name, out, var))
xarray.core.merge.MergeError: conflicting values for variable 'anchor' on objects to be combined:
first value: <xarray.Variable (nf: 6, ncontact: 4)>
dask.array<shape=(6, 4, 4), dtype=int32, chunksize=(6, 4, 4)>
Attributes:
long_name: anchor point
second value: <xarray.Variable (nf: 6, ncontact: 4)>
dask.array<shape=(6, 4, 4), dtype=int32, chunksize=(6, 4, 4)>
Attributes:
long_name: anchor point
The merge appears to fail on the “anchor” variable, which is present in both files. As a workaround, if I drop the “anchor” variable from both datasets before combining them with xr.open_mfdataset, the merge works properly.
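For anyone hitting the same error, here is a minimal sketch of that workaround. It does not read the GCHP files; it builds two small in-memory datasets instead, and it assumes the conflict comes from differing “anchor” values (I have not confirmed the actual cause). The names drop_anchor, make_fake, SpeciesConc_O3 and PM25 are hypothetical and only used for illustration.

```python
import numpy as np
import xarray as xr


def drop_anchor(ds):
    """Return ds without the 'anchor' variable, if it is present."""
    if "anchor" not in ds.variables:
        return ds
    # Newer xarray releases provide Dataset.drop_vars; older ones
    # (such as 0.12.x) only have Dataset.drop.
    if hasattr(ds, "drop_vars"):
        return ds.drop_vars("anchor")
    return ds.drop("anchor")


def make_fake(name, anchor_fill):
    """Build a tiny stand-in dataset (hypothetical variable names and sizes)."""
    return xr.Dataset({
        name: (("nf", "Ydim", "Xdim"), np.zeros((6, 2, 2))),
        # ASSUMPTION: the two files disagree on the values stored in 'anchor'.
        "anchor": (("nf", "ncontact"),
                   np.full((6, 4), anchor_fill, dtype="int32")),
    })


species = make_fake("SpeciesConc_O3", 1)
aerosol = make_fake("PM25", 2)

# Merging as-is fails because 'anchor' differs between the two datasets.
try:
    xr.merge([species, aerosol], compat="no_conflicts")
    merge_failed = False
except xr.MergeError:
    merge_failed = True

# Dropping 'anchor' from both datasets first lets the merge go through.
merged = xr.merge([drop_anchor(species), drop_anchor(aerosol)],
                  compat="no_conflicts")
print(merge_failed, sorted(merged.data_vars))
```

With the real files, the same helper can be passed through the preprocess argument, e.g. xr.open_mfdataset(filelist, preprocess=drop_anchor), so that “anchor” is removed from each file before the datasets are combined. Passing drop_variables=['anchor'] should have a similar effect, since open_mfdataset forwards extra keyword arguments to open_dataset.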
Output of xr.show_versions()
xarray: 0.12.1
pandas: 0.25.1
numpy: 1.16.4
scipy: 1.3.1
netCDF4: 1.4.2
pydap: None
h5netcdf: 0.6.2
h5py: 2.8.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.3.0
distributed: 2.3.2
matplotlib: 3.1.1
cartopy: 0.17.0
seaborn: 0.9.0
setuptools: 41.2.0
pip: 19.2.2
conda: None
pytest: 4.2.0
IPython: 7.7.0
sphinx: 2.1.2
Issue Analytics
- Created 4 years ago
- Comments: 9 (6 by maintainers)
Top GitHub Comments
@jhamman Hahaha.
Closing as a duplicate of #1378