open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2
I noticed a large speed discrepancy between xarray versions 0.8.2 and 0.9.1 when using open_mfdataset() on a dataset of about 1.2 GB, consisting of 3 files, with netcdf4 as the engine. 0.8.2 was run first, so this is probably not a disk caching issue.
Test
import xarray as xr
import time
start_time = time.time()
ds0 = xr.open_mfdataset('./*.nc')
print("--- %s seconds ---" % (time.time() - start_time))
Result
xarray==0.8.2, dask==0.11.1, netcdf4==1.2.4
--- 0.736030101776 seconds ---
xarray==0.9.1, dask==0.13.0, netcdf4==1.2.4
--- 52.2800869942 seconds ---
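To help separate a real regression from disk-cache effects, the same call can be timed several times in one session. The helper below is only a sketch: `time_call` and its `repeats` parameter are hypothetical names, not part of xarray, and it uses nothing beyond the standard library.

```python
import time


def time_call(func, *args, repeats=3, **kwargs):
    """Call func(*args, **kwargs) `repeats` times.

    Returns the result of the last call and a list of elapsed
    wall-clock seconds, one entry per call.
    """
    timings = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        timings.append(time.perf_counter() - start)
    return result, timings


# Usage with the call from the report (needs xarray and the .nc files):
#   import xarray as xr
#   ds0, timings = time_call(xr.open_mfdataset, './*.nc')
#   print(xr.__version__, ["%.3f" % t for t in timings])
```

If the first timing is much larger than the later ones, caching is likely involved. If every repeat is slow on one version and fast on the other, the difference is more plausibly in the library.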
Issue Analytics
- State:
- Created 7 years ago
- Comments:17 (11 by maintainers)

Looks like it has been resolved! Tested with the latest pre-release, v0.10.0rc2, on the dataset linked by najascutellatus above: https://marine.rutgers.edu/~michaesm/netcdf/data/
xarray==0.10.0rc2-1-g8267fdb dask==0.15.4
xarray==0.9.1 dask==0.13.0
@friedrichknuth, any chance you can take a look at this with the latest v0.10 release candidate?