dask_image.imread.imread regression

It appears that the more recent upgrades to dask-image’s imread have broken the use case of reading multiple 3D tiffs. (Probably the problem is more general than that, but this is my test case.) To reproduce:

import os
import tempfile

import numpy as np
import tifffile
from dask_image.imread import imread
from skimage import data


blobs = data.binary_blobs(64, n_dim=3)
with tempfile.TemporaryDirectory() as tmpdir:
    for i in range(5):
        tifffile.imsave(os.path.join(tmpdir, f'{i}.tiff'), blobs)
    image = imread(os.path.join(tmpdir, '*.tiff'))
    print(image)
    timepoint = np.asarray(image[0])

Produces the following print output:

dask.array<_map_read_frame, shape=(5, 64, 64, 64), dtype=bool, chunksize=(1, 64, 64, 64), chunktype=numpy.ndarray>

And this traceback:

IndexError: too many indices for array: array is 3-dimensional, but 4 were indexed
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-7-7f255eff488b> in <module>
      4     image = imread(os.path.join(tmpdir, '*.tiff'))
      5     print(image)
----> 6     timepoint = np.asarray(image[0])
      7 

~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order, like)
    100         return _asarray_with_like(a, dtype=dtype, order=order, like=like)
    101 
--> 102     return array(a, dtype, copy=False, order=order)
    103 
    104 

~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/array/core.py in __array__(self, dtype, **kwargs)
   1474 
   1475     def __array__(self, dtype=None, **kwargs):
-> 1476         x = self.compute()
   1477         if dtype and x.dtype != dtype:
   1478             x = x.astype(dtype)

~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/base.py in compute(self, **kwargs)
    283         dask.base.compute
    284         """
--> 285         (result,) = compute(self, traverse=False, **kwargs)
    286         return result
    287 

~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/base.py in compute(*args, **kwargs)
    565         postcomputes.append(x.__dask_postcompute__())
    566 
--> 567     results = schedule(dsk, keys, **kwargs)
    568     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    569 

~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/threaded.py in get(dsk, result, cache, num_workers, pool, **kwargs)
     77             pool = MultiprocessingPoolExecutor(pool)
     78 
---> 79     results = get_async(
     80         pool.submit,
     81         pool._max_workers,

~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/local.py in get_async(submit, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, chunksize, **kwargs)
    512                             _execute_task(task, data)  # Re-execute locally
    513                         else:
--> 514                             raise_exception(exc, tb)
    515                     res, worker_id = loads(res_info)
    516                     state["cache"][key] = res

~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/local.py in reraise(exc, tb)
    323     if exc.__traceback__ is not tb:
    324         raise exc.with_traceback(tb)
--> 325     raise exc
    326 
    327 

~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
    221     try:
    222         task, data = loads(task_info)
--> 223         result = _execute_task(task, data)
    224         id = get_id()
    225         result = dumps((result, id))

~/miniconda3/envs/new-dask-image/lib/python3.9/site-packages/dask/core.py in _execute_task(arg, cache, dsk)
    119         # temporaries by their reference count and can execute certain
    120         # operations in-place.
--> 121         return func(*(_execute_task(a, cache) for a in args))
    122     elif not ishashable(arg):
    123         return arg

IndexError: too many indices for array: array is 3-dimensional, but 4 were indexed

This works fine with dask-image 0.5.0.

Environment:

  • Dask version: 2021.04.1
  • Python version: 3.9.4
  • Operating System: Linux
  • Install method (conda, pip, source): pip

CC @DragaDoncila, who first discovered the bug.

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

1 reaction
jni commented, Nov 16, 2021

@VolkerH to get over the performance issue with da.stack, here’s my example for loading many tiffs using map_blocks:

https://github.com/jni/napari-demos/blob/6860b1fe86a51b30874c34150ae216f4c39b2dd6/rootomics.py#L20-L54
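
For readers who do not want to follow the link, the general idea can be sketched roughly as below. This is not the code from the linked file, only a minimal illustration that assumes every file has the same shape and dtype; `lazy_tiff_stack` and `_read_one` are hypothetical names, and the file names and array sizes are made up. It relies on the documented `block_info` keyword of `dask.array.map_blocks` to tell each block which file to read, so no `da.stack` call is needed.

```python
import glob
import os
import tempfile

import dask.array as da
import numpy as np
import tifffile


def _read_one(filenames, block_info=None):
    # block_info[None] describes the output block being built; its
    # "chunk-location" is the block's index along each output axis.
    # Axis 0 is the file axis, so element 0 selects the file to read.
    file_index = block_info[None]["chunk-location"][0]
    image = tifffile.imread(filenames[file_index])
    # Add a leading length-1 axis so the block matches its chunk shape.
    return image[np.newaxis]


def lazy_tiff_stack(pattern):
    """Lazily stack same-shaped TIFF files along a new leading axis."""
    filenames = sorted(glob.glob(pattern))
    if not filenames:
        raise ValueError(f"no files match {pattern!r}")
    # Read one file eagerly to learn the per-file shape and dtype
    # (assumption: all files agree with this sample).
    sample = tifffile.imread(filenames[0])
    # One chunk per file along axis 0, one whole-file chunk elsewhere.
    chunks = ((1,) * len(filenames),) + tuple((n,) for n in sample.shape)
    return da.map_blocks(
        _read_one,
        filenames=filenames,
        chunks=chunks,
        dtype=sample.dtype,
    )


# Small demonstration with three synthetic 3D volumes.
volume = np.arange(4 * 5 * 6, dtype=np.uint16).reshape(4, 5, 6)
with tempfile.TemporaryDirectory() as tmpdir:
    for i in range(3):
        tifffile.imwrite(os.path.join(tmpdir, f"{i}.tiff"), volume + i)
    stack = lazy_tiff_stack(os.path.join(tmpdir, "*.tiff"))
    # Materialize one timepoint while the files still exist.
    timepoint = np.asarray(stack[2])
```

Note that `sorted` orders names lexicographically, so `10.tiff` would come before `2.tiff`; zero-padded file names avoid that surprise.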

1 reaction
GenevieveBuckley commented, Nov 8, 2021

I believe the current workarounds were:

  1. Revert to an earlier version of dask-image, or
  2. Use the reader that ships with dask instead: from dask.array.image import imread

It’s on my to-do list to benchmark all these variations and fix the regression (no timeframe on that, though).
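
As a rough illustration of the second workaround, the snippet below reads a set of 3D tiffs with `dask.array.image.imread` and then materializes the first one, which is the step that fails in the report above. Passing `imread=tifffile.imread` is a choice made for this sketch (the default reader comes from scikit-image), and the file names and array sizes are made up for the example.

```python
import os
import tempfile

import numpy as np
import tifffile
from dask.array.image import imread  # dask's reader, not dask_image's

# A small synthetic 3D volume standing in for real data.
volume = np.arange(8 * 8 * 8, dtype=np.uint16).reshape(8, 8, 8)

with tempfile.TemporaryDirectory() as tmpdir:
    for i in range(3):
        tifffile.imwrite(os.path.join(tmpdir, f"{i}.tiff"), volume)

    # The glob is expanded and sorted by dask; each file becomes one
    # chunk along a new leading axis.
    stack = imread(os.path.join(tmpdir, "*.tiff"), imread=tifffile.imread)

    # Compute a single timepoint while the files still exist.
    first = np.asarray(stack[0])
```

For the first workaround, the original report states that dask-image 0.5.0 works, so pinning to that version is the obvious candidate.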
