Option to prevent automatic rechunking?

In xhistogram #57 I’m trying to test a blockwise-based algorithm for various chunk shapes, and I’m finding that in my test suite dask changes my tests by automatically rechunking the inputs and issuing a PerformanceWarning:

  /home/tegn500/Documents/Work/Code/xhistogram/xhistogram/core.py:334: 
  PerformanceWarning: Increasing number of chunks by factor of 100
    bin_counts = dsa.blockwise(

I would prefer that dask not override me like this. In a test suite I’m much more concerned that the tests run exactly the way I specify than I am about performance.

Is there a global option to prevent this? My dask.config.config dictionary looks like this:

{'version': 1,
 'temporary-directory': None,
 'dataframe': {'shuffle-compression': None},
 'array': {'svg': {'size': 120}, 'slicing': {'split-large-chunks': None}},
 'optimization': {'fuse': {'active': None,
   'ave-width': 1,
   'max-width': None,
   'max-height': inf,
   'max-depth-new-edges': None,
   'subgraphs': None,
   'rename-keys': True}}}

but I’m not sure if any of the options in the configuration reference will affect this.

It’s hard for me to know whether my tests are failing because of this or not. Some of my tests are failing, and when dask automatically changes the test as it runs I don’t really know how to debug them. blockwise dispatches to code we wrote, so it’s plausible that the automatic rechunking causes my test failures by switching from a chunking pattern which passes to one which fails.
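One way to find out whether a given test triggers the rechunking is to escalate the warning to an error with the standard `warnings` module, so the test fails at the `blockwise` call instead of silently running on different chunks. The following is a minimal sketch, not a confirmed fix: `add`, `x`, and `y` are hypothetical stand-ins for the real xhistogram kernel and inputs, and it assumes `PerformanceWarning` is importable from `dask.array.core`.

```python
import warnings

import dask.array as da
from dask.array.core import PerformanceWarning


def add(a, b):
    # Hypothetical block function standing in for the real kernel.
    return a + b


# Two arrays of the same shape, chunked along different axes,
# so their chunk boundaries do not line up.
x = da.ones((20, 20), chunks=(1, 20))  # 20 row-shaped chunks
y = da.ones((20, 20), chunks=(20, 1))  # 20 column-shaped chunks

rechunk_detected = False
with warnings.catch_warnings():
    # Inside this block only, raise PerformanceWarning as an exception.
    warnings.simplefilter("error", category=PerformanceWarning)
    try:
        da.blockwise(add, "ij", x, "ij", y, "ij", dtype=x.dtype)
    except PerformanceWarning as err:
        rechunk_detected = True
        print(f"blockwise wanted to rechunk: {err}")
```

In a pytest suite the same effect should be achievable with a `filterwarnings` entry such as `error::dask.array.core.PerformanceWarning`. As far as I can tell, the warning is only issued when the chunk count grows by a large factor, so comparing the output's `.chunks` against the input's is a stricter check. `da.blockwise` also takes an `align_arrays` argument; its documentation says that with `align_arrays=False` an error is thrown instead of rechunking when the inputs do not have the same number of blocks in each dimension.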

The only issue I’ve seen that seems related is #4763.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

2 reactions
quasiben commented, May 26, 2021

@gjoseph92 your comment is fantastic! I’m not sure where this should go immediately, but I think we should find a place in the docs to capture those clarifying thoughts long term.

0 reactions
TomNicholas commented, May 26, 2021

Thank you for that clarification @gjoseph92, that’s extremely helpful.

It does make me wonder how my test input ended up with unaligned chunks in the first place, but that’s something to be discussed in https://github.com/xgcm/xhistogram/pull/57 rather than here, I guess.
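To make "unaligned chunks" concrete, here is a rough pure-Python sketch of the idea. It is not dask's implementation, and `is_aligned` and `merged_chunks` are hypothetical helper names. It assumes that aligning two chunkings of one dimension means splitting at every boundary that appears in either of them, which is why the chunk count can grow.

```python
from itertools import accumulate


def is_aligned(*chunkings):
    """True if every chunking splits the dimension at the same places."""
    return len({tuple(c) for c in chunkings}) == 1


def merged_chunks(*chunkings):
    """Chunk sizes after splitting at every boundary found in any input."""
    if len({sum(c) for c in chunkings}) != 1:
        raise ValueError("chunkings must cover the same dimension length")
    # Collect the end position of every chunk across all inputs.
    boundaries = sorted({end for c in chunkings for end in accumulate(c)})
    starts = [0] + boundaries[:-1]
    return tuple(stop - start for start, stop in zip(starts, boundaries))


a = (10,) * 10  # a length-100 dimension in 10 chunks of 10
b = (25,) * 4   # the same dimension in 4 chunks of 25

print(is_aligned(a, b))     # False
print(merged_chunks(a, b))  # (10, 10, 5, 5, 10, 10, 10, 10, 5, 5, 10, 10)
```

Here the merged dimension has 12 chunks where the inputs had 10 and 4. When the mismatch occurs along more than one dimension the per-dimension counts multiply, which is one plausible way to arrive at a large factor like the one in the warning.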
