[feature request] Allow passing in image reading/loading function to old-style datasets constructors
🚀 The feature
This makes more sense now that torchvision is compiled with its own image reading functions (read_image), so an easier way to test pipelines without PIL would be nice.
In addition, this would make it easy to stub out image loading when wanted, and would solve https://github.com/pytorch/vision/issues/4975
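To make the request concrete, here is a minimal sketch of what a constructor-level loading hook could look like. It is similar in spirit to the loader argument that DatasetFolder and ImageFolder already accept. TinyImageDataset, default_loader and stub_loader are hypothetical names made up for this illustration, not torchvision APIs, and the sketch uses only the standard library so it runs without PIL or torch.

```python
from pathlib import Path
from typing import Any, Callable, Iterable, Tuple


def default_loader(path: str) -> bytes:
    # Stand-in for a PIL-based loader: simply returns the raw file bytes.
    return Path(path).read_bytes()


class TinyImageDataset:
    """Hypothetical old-style dataset that accepts a `loader` callable."""

    def __init__(
        self,
        samples: Iterable[Tuple[str, int]],
        loader: Callable[[str], Any] = default_loader,
    ) -> None:
        self.samples = list(samples)
        self.loader = loader

    def __len__(self) -> int:
        return len(self.samples)

    def __getitem__(self, index: int) -> Tuple[Any, int]:
        path, target = self.samples[index]
        # All image reading goes through the injected callable.
        return self.loader(path), target


def stub_loader(path: str) -> str:
    # Stub that never touches the disk, useful for testing a pipeline.
    return f"stub:{path}"


dataset = TinyImageDataset([("a.jpg", 0), ("b.jpg", 1)], loader=stub_loader)
first_image, first_target = dataset[0]
```

With a hook like this, swapping PIL for read_image (or for a stub) would be a matter of passing a different callable, without touching the dataset class.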
Motivation, pitch
N/A
Alternatives
No response
Additional context
No response
cc @pmeier
Issue Analytics
- State:
- Created 2 years ago
- Comments:14 (1 by maintainers)
Top Results From Across the Web
Move image loading logic to _load_image in VOC datasets ...
[feature request] Allow passing in image reading/loading function to old-style datasets constructors #4991.
Top GitHub Comments
Unfortunately, I don’t think the change is simple. The functionality you propose is implemented in the new-style datasets, but we have already run into trouble with it and I see more to come. I’ve opened #5075 to discuss how we want to handle decoding in the future. While we will find a solution for the new-style datasets, I’m not eager to maintain the same functionality in the old-style ones as well.
From your request I get that you only want to disable decoding, not use custom decoding. If you don’t care about the actual data and only use it for testing, can’t you simply patch it out?
With the patch, iterating over the complete dataset takes ~1.5 seconds on my machine.
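The snippet from the original comment is not shown here, so the following is only a sketch of the general idea of patching out decoding in a test. TinyVOCLikeDataset and its _load_image method are hypothetical stand-ins, and unittest.mock is one assumed way to do the patching; the actual method to patch depends on the dataset class.

```python
from unittest import mock


class TinyVOCLikeDataset:
    """Hypothetical dataset whose decoding lives in a `_load_image` method."""

    def __init__(self, paths):
        self.paths = list(paths)

    def _load_image(self, path):
        # A real dataset would open and decode the file here.
        raise RuntimeError("decoding should have been patched out")

    def __getitem__(self, index):
        return self._load_image(self.paths[index])

    def __len__(self):
        return len(self.paths)


dataset = TinyVOCLikeDataset(["a.jpg", "b.jpg", "c.jpg"])

# Replace the decoding method for the duration of the block only.
with mock.patch.object(TinyVOCLikeDataset, "_load_image", return_value=None):
    decoded = [dataset[i] for i in range(len(dataset))]
```

Inside the with block every sample comes back as None without any file being read; once the block exits, the original method is restored.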
Yes, the new datasets will use the decoder parameter to do just that. You can pass any callable to datasets.load(..., decoder=) that takes an open file handle and returns a tensor. torchvision.io.read_image currently takes a path, so we are using PIL by default now.
There are still some design choices to be made about what a decoder is and what it should return. For example, how do we handle the case where more than one type of file needs to be decoded? Furthermore, how do we handle multiple return values, for example the audio and video tensors after decoding a video?
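To illustrate the two open questions, here is one possible shape for a decoder, sketched with the standard library only. It is not the design chosen for the new-style datasets: DECODERS, decode, decode_image and decode_video are hypothetical names, raw bytes stand in for tensors, and splitting the payload in half is a placeholder for real video decoding. The sketch dispatches on the file extension and always returns a dict, so a decoder can yield one value or several.

```python
import io
from pathlib import PurePath
from typing import Any, BinaryIO, Callable, Dict


def decode_image(handle: BinaryIO) -> Dict[str, Any]:
    # Stand-in for real image decoding: keep the raw bytes.
    return {"image": handle.read()}


def decode_video(handle: BinaryIO) -> Dict[str, Any]:
    # Stand-in for real video decoding: pretend the first half of the
    # payload is video and the second half is audio.
    data = handle.read()
    half = len(data) // 2
    return {"video": data[:half], "audio": data[half:]}


# Hypothetical registry mapping file extensions to decoders.
DECODERS: Dict[str, Callable[[BinaryIO], Dict[str, Any]]] = {
    ".jpg": decode_image,
    ".mp4": decode_video,
}


def decode(path: str, handle: BinaryIO) -> Dict[str, Any]:
    suffix = PurePath(path).suffix.lower()
    if suffix not in DECODERS:
        raise ValueError(f"no decoder registered for {suffix!r}")
    return DECODERS[suffix](handle)


image_sample = decode("cat.jpg", io.BytesIO(b"abc"))
video_sample = decode("clip.mp4", io.BytesIO(b"abcd"))
```

Returning a dict sidesteps the question of tuple ordering for multi-output decoders, at the cost of every consumer needing to know the keys.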