dask_image.imread.imread regression
It appears that recent upgrades to dask-image's imread have broken the use case of reading multiple 3D TIFFs. (The problem is probably more general than that, but this is my test case.) To reproduce:
import os
import tempfile

import numpy as np
import tifffile
from dask_image.imread import imread
from skimage import data

blobs = data.binary_blobs(64, n_dim=3)
with tempfile.TemporaryDirectory() as tmpdir:
    for i in range(5):
        tifffile.imsave(os.path.join(tmpdir, f'{i}.tiff'), blobs)
    image = imread(os.path.join(tmpdir, '*.tiff'))
    print(image)
    timepoint = np.asarray(image[0])
Produces the following print output:
dask.array<_map_read_frame, shape=(5, 64, 64, 64), dtype=bool, chunksize=(1, 64, 64, 64), chunktype=numpy.ndarray>
And this traceback:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-7-7f255eff488b> in <module>
4 image = imread(os.path.join(tmpdir, '*.tiff'))
5 print(image)
----> 6 timepoint = np.asarray(image[0])
7
~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order, like)
100 return _asarray_with_like(a, dtype=dtype, order=order, like=like)
101
--> 102 return array(a, dtype, copy=False, order=order)
103
104
~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/array/core.py in __array__(self, dtype, **kwargs)
1474
1475 def __array__(self, dtype=None, **kwargs):
-> 1476 x = self.compute()
1477 if dtype and x.dtype != dtype:
1478 x = x.astype(dtype)
~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/base.py in compute(self, **kwargs)
283 dask.base.compute
284 """
--> 285 (result,) = compute(self, traverse=False, **kwargs)
286 return result
287
~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/base.py in compute(*args, **kwargs)
565 postcomputes.append(x.__dask_postcompute__())
566
--> 567 results = schedule(dsk, keys, **kwargs)
568 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
569
~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/threaded.py in get(dsk, result, cache, num_workers, pool, **kwargs)
77 pool = MultiprocessingPoolExecutor(pool)
78
---> 79 results = get_async(
80 pool.submit,
81 pool._max_workers,
~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/local.py in get_async(submit, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, chunksize, **kwargs)
512 _execute_task(task, data) # Re-execute locally
513 else:
--> 514 raise_exception(exc, tb)
515 res, worker_id = loads(res_info)
516 state["cache"][key] = res
~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/local.py in reraise(exc, tb)
323 if exc.__traceback__ is not tb:
324 raise exc.with_traceback(tb)
--> 325 raise exc
326
327
~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
221 try:
222 task, data = loads(task_info)
--> 223 result = _execute_task(task, data)
224 id = get_id()
225 result = dumps((result, id))
~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/core.py in _execute_task(arg, cache, dsk)
119 # temporaries by their reference count and can execute certain
120 # operations in-place.
--> 121 return func(*(_execute_task(a, cache) for a in args))
122 elif not ishashable(arg):
123 return arg
IndexError: too many indices for array: array is 3-dimensional, but 4 were indexed
This works fine with dask-image 0.5.0.
Environment:
- Dask version: 2021.04.1
- Python version: 3.9.4
- Operating System: Linux
- Install method (conda, pip, source): pip
CC @DragaDoncila, who first discovered the bug.
Issue Analytics
- Created 2 years ago
- Reactions: 1
- Comments: 10 (5 by maintainers)
Top GitHub Comments
@VolkerH to get over the performance issue with da.stack, here's my example for loading many TIFFs using map_blocks: https://github.com/jni/napari-demos/blob/6860b1fe86a51b30874c34150ae216f4c39b2dd6/rootomics.py#L20-L54
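The linked file isn't reproduced here, but the general map_blocks pattern it refers to can be sketched as follows. This is an assumption-laden illustration, not the linked code: the helper name load_block is made up, and np.save/np.load stand in for TIFF I/O so the sketch only needs dask and numpy.

```python
# Hedged sketch of loading one file per chunk with da.map_blocks.
# np.load stands in for tifffile.imread; load_block is a hypothetical name.
import glob
import os
import tempfile

import dask.array as da
import numpy as np

def load_block(block, paths=None, block_info=None):
    # The chunk's index along axis 0 tells us which file to read.
    i = block_info[None]["chunk-location"][0]
    return np.load(paths[i])[None, ...]  # re-add the leading (time) axis

tmpdir = tempfile.mkdtemp()
vol = np.arange(64, dtype=np.uint8).reshape(4, 4, 4)
for i in range(5):
    np.save(os.path.join(tmpdir, f"{i}.npy"), vol + i)
paths = sorted(glob.glob(os.path.join(tmpdir, "*.npy")))

# Template array with one chunk per file along the new leading axis;
# map_blocks replaces each (all-zero) chunk with the file's contents.
template = da.zeros((len(paths),) + vol.shape, chunks=(1,) + vol.shape,
                    dtype=vol.dtype)
stack = da.map_blocks(load_block, template, paths=paths, dtype=vol.dtype)
timepoint = np.asarray(stack[3])  # graph culling: only this chunk's file is read
```

Because each file maps to exactly one chunk, indexing along the leading axis only ever reads the files it needs, which is the behaviour the broken example above expects.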
I believe the current workarounds were:

from dask.array.image import imread

It's on my to-do list to benchmark all these variations and fix the regression. (No timeframe on that, though.)
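For completeness, a minimal sketch of that workaround, under the assumption that np.save/np.load stand in for TIFF I/O (dask.array.image.imread accepts a custom reader via its imread= keyword; by default it uses skimage.io.imread):

```python
# Hedged sketch of the dask.array.image.imread workaround: it globs the
# pattern, sorts the matches, and stacks one file per chunk along a new
# leading axis. np.load stands in for a TIFF reader here.
import os
import tempfile

import numpy as np
from dask.array.image import imread

tmpdir = tempfile.mkdtemp()
vol = np.ones((4, 4, 4), dtype=bool)
for i in range(5):
    np.save(os.path.join(tmpdir, f"{i}.npy"), vol)

image = imread(os.path.join(tmpdir, "*.npy"), imread=np.load)
print(image.shape)                # (5, 4, 4, 4)
timepoint = np.asarray(image[0])  # indexing a single volume works here
```

This gives the same (n_files, z, y, x) layout as the failing dask_image example, with one chunk per file.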