One `imread` to rule them all

A lot of people have put a lot of effort into imread lately. This is great, and it’s really helped. However, we’ve still got a way to go.

These are the four major areas where I see problems pop up:

  1. Read image data into Dask arrays accurately. We need more simple test cases here. Bug report: https://github.com/dask/dask-image/issues/220

  2. Reduce confusion. Currently, there are multiple implementations of a dask imread function. The two most easily confused are dask_image.imread.imread() and dask.array.image.imread(). We need to figure out which is best, and only use that one.

  3. Read data in fast. For that, we’ll need to have some proper benchmarks, and run them routinely as part of the CI. This will help us decide (2) above. Previous discussion:

  4. Process the image data fast, too. For that to happen, we need smart default choices for how we chunk image data in dask arrays. Jackson Maxfield Brown describes the problem well in this short video here
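For point (3), a minimal sketch of what a benchmark harness could look like, using only the standard library. The name benchmark_readers and the stand-in readers are hypothetical, not part of dask or dask-image; in a real benchmark each reader would wrap one of the imread implementations and force the computation.

```python
import statistics
import timeit
from typing import Callable, Dict


def benchmark_readers(
    readers: Dict[str, Callable[[str], object]],
    pattern: str,
    repeat: int = 5,
) -> Dict[str, float]:
    """Return the median wall-clock seconds each reader takes on `pattern`."""
    results = {}
    for name, read in readers.items():
        # One call per timing, so every run pays the full read cost.
        timings = timeit.repeat(lambda: read(pattern), number=1, repeat=repeat)
        results[name] = statistics.median(timings)
    return results


# Stand-in readers (hypothetical) so the sketch runs without image files.
# A real reader would load and compute the array, for example:
#   lambda p: dask_image.imread.imread(p).compute()
#   lambda p: dask.array.image.imread(p).compute()
readers = {
    "reader_a": lambda pattern: pattern.upper(),
    "reader_b": lambda pattern: list(pattern),
}
for name, seconds in benchmark_readers(readers, "images/*.tif").items():
    print(f"{name}: {seconds:.6f} s")
```

Reporting the median rather than the mean keeps one slow run (a cold disk cache, say) from dominating the comparison. Running the same harness in CI on a fixed set of files would give the routine numbers needed to decide (2).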

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Reactions: 3
  • Comments: 9 (1 by maintainers)

Top GitHub Comments

1 reaction
jakirkham commented, May 13, 2022

Yeah this comes up with large multipage TIFFs. They can be kind of movie-like

Wonder if we should just make the move to using ImageIO here with PR ( https://github.com/imageio/imageio/pull/739 ) in? It’s hard supporting all of the different file formats/use cases out there. Maybe a better separation of concerns would improve the user experience.

Edit: Also broadly related ( https://github.com/dask/dask/issues/9049 )

0 reactions
GenevieveBuckley commented, May 13, 2022

One big disadvantage of dask.array.image.imread is its poor chunking behaviour. It looks like it makes a single chunk for every filename on disk. This is not great for movie files, multislice TIFFs, etc., where you probably don’t want to load the whole file into RAM.

See https://github.com/dask/dask-image/issues/262#issuecomment-1125063820
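To put rough numbers on the one-chunk-per-file problem, here is a small sketch of the arithmetic, standard library only. The helper names, the 128 MiB target, and the example movie dimensions are all assumptions chosen for illustration; none of them come from dask or dask-image.

```python
import math


def frames_per_chunk(frame_shape, itemsize, target_bytes=128 * 2**20):
    """Whole frames that fit in `target_bytes` (always at least one)."""
    frame_bytes = math.prod(frame_shape) * itemsize
    return max(1, target_bytes // frame_bytes)


def plan_chunks(n_frames, frame_shape, itemsize, target_bytes=128 * 2**20):
    """Return (frames per chunk, number of chunks) for one multi-frame file."""
    per_chunk = min(n_frames, frames_per_chunk(frame_shape, itemsize, target_bytes))
    return per_chunk, math.ceil(n_frames / per_chunk)


# Hypothetical movie: 10,000 frames of 2048 x 2048 uint16 (2 bytes per pixel).
n_frames, frame_shape, itemsize = 10_000, (2048, 2048), 2

whole_file_gib = n_frames * math.prod(frame_shape) * itemsize / 2**30
print(f"one chunk per file: {whole_file_gib:.3f} GiB in a single chunk")

per_chunk, n_chunks = plan_chunks(n_frames, frame_shape, itemsize)
print(f"frame-based chunks: {per_chunk} frames each, {n_chunks} chunks")
```

With these assumed numbers, a single chunk would hold the entire movie, while frame-based chunks stay near the target size. A value like per_chunk could be tried as the nframes argument of dask_image.imread.imread(), assuming that parameter controls the number of frames per chunk as its documentation describes.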
