One `imread` to rule them all
A lot of people have put a lot of effort into imread
lately. This is great, and it’s really helped. However, we’ve still got a way to go.
These are the four major areas where I see problems pop up:
1. Read image data into Dask arrays accurately. We need more simple test cases here. Bug report: https://github.com/dask/dask-image/issues/220
2. Reduce confusion. Currently, there are multiple implementations of a dask `imread` function. The two most easily confused are `dask_image.imread.imread()` and `dask.array.image.imread()`. We need to figure out which is best, and only use that one.
3. Read data in fast. For that, we’ll need to have some proper benchmarks, and run them routinely as part of the CI. This will help us decide (2) above. Previous discussion:
   - Imread performance issue https://github.com/dask/dask-image/issues/181
   - Getting movie files into Dask efficiently https://github.com/dask/dask-image/issues/134
4. Process the image data fast, too. For that to happen, we need smart default choices for how we chunk image data in dask arrays. Jackson Maxfield Brown describes the problem well in this short video.
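The routine benchmarks proposed in the "Read data in fast" item could start from something as small as the sketch below. It uses only the standard library; `time_reader` and `fake_reader` are hypothetical names made up for illustration, and the stand-in reader just allocates a buffer. A real benchmark would, under the same structure, register wrappers that call `dask_image.imread.imread` or `dask.array.image.imread` on test files and then `.compute()` the result.

```python
import time


def time_reader(reader, *args, repeats=3):
    """Return the best wall-clock time in seconds of reader(*args) over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        reader(*args)
        best = min(best, time.perf_counter() - start)
    return best


def fake_reader(n_bytes):
    """Hypothetical stand-in for an imread call: allocates a buffer of n_bytes."""
    return bytearray(n_bytes)


# Map a label to each reader under test; real imread wrappers would go here.
readers = {"fake_reader": fake_reader}

results = {name: time_reader(fn, 1_000_000) for name, fn in readers.items()}
for name, seconds in results.items():
    print(f"{name}: {seconds:.6f} s")
```

Taking the best of several repeats reduces noise from other processes, which matters if the numbers are compared between CI runs.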
Top GitHub Comments
Yeah, this comes up with large multipage TIFFs. They can be kind of movie-like.
I wonder if we should just make the move to using ImageIO here, now that PR ( https://github.com/imageio/imageio/pull/739 ) is in? It’s hard supporting all of the different file formats/use cases out there. Maybe a better separation of concerns would improve the user experience.
Edit: Also broadly related ( https://github.com/dask/dask/issues/9049 )
One big disadvantage of `dask.array.image.imread` is poor chunking behaviour. It looks like it makes a single chunk for every filename on disk. This is not great for movie files or multislice TIFFs, etc., where you probably don’t want to load the whole movie file into RAM. See https://github.com/dask/dask-image/issues/262#issuecomment-1125063820
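To put rough numbers on the memory cost, here is a back-of-the-envelope sketch in plain Python (no Dask required). The movie dimensions and dtype are hypothetical, and `chunk_nbytes` is an illustrative helper, not part of either library. It compares the size of one chunk when a whole file is a single chunk against one chunk per frame.

```python
from math import prod


def chunk_nbytes(chunk_shape, itemsize):
    """Bytes needed to hold one chunk of the given shape in memory."""
    return prod(chunk_shape) * itemsize


# Hypothetical movie file: 2000 frames of 2048 x 2048 uint16 pixels.
n_frames, height, width = 2000, 2048, 2048
itemsize = 2  # bytes per uint16 pixel

# One chunk per file: the whole movie has to be loaded to touch any frame.
per_file = chunk_nbytes((n_frames, height, width), itemsize)

# One chunk per frame: only the frames actually used are loaded.
per_frame = chunk_nbytes((1, height, width), itemsize)

print(f"one chunk per file:  {per_file / 2**30:.1f} GiB")
print(f"one chunk per frame: {per_frame / 2**20:.1f} MiB")
```

With these assumed dimensions a per-file chunk is 2000 times larger than a per-frame chunk, so any task that reads a single frame still pays for the entire file.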