Add multi-threads support to samples crop
Is your feature request related to a problem? Please describe.
Currently, there are 4 crop transforms that can generate a list of samples, and they crop the images in a for loop.
After some testing, I found that executing the crops in multiple threads can be much faster.
So it would be useful to add num_workers support to these transforms, similar to the CacheDataset.
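As a rough illustration of the request, here is a minimal sketch of a one-to-many crop that either runs the existing for loop or hands the crops to a thread pool. It is not the library's code: crop_patch, random_crops and the num_workers argument shown here are hypothetical names, and a nested list stands in for an image array. Pure Python slicing holds the GIL, so this only shows the structure; any real speedup would depend on the crop being done in compiled code that releases the GIL.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def crop_patch(image, top, left, size):
    """Return a size x size patch of a 2D list-of-lists image (hypothetical helper)."""
    return [row[left:left + size] for row in image[top:top + size]]


def random_crops(image, size, num_samples, num_workers=0, seed=None):
    """Generate num_samples random patches, optionally in a thread pool.

    num_workers is a hypothetical parameter: 0 keeps the plain for loop.
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    # Draw all crop origins in the calling thread so the result does not
    # depend on thread scheduling.
    origins = [
        (rng.randint(0, height - size), rng.randint(0, width - size))
        for _ in range(num_samples)
    ]
    if num_workers <= 0:
        return [crop_patch(image, top, left, size) for top, left in origins]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Executor.map() returns results in the order of the inputs.
        return list(
            pool.map(lambda origin: crop_patch(image, origin[0], origin[1], size), origins)
        )


image = [[r * 8 + c for c in range(8)] for r in range(8)]
serial = random_crops(image, size=4, num_samples=4, seed=0)
threaded = random_crops(image, size=4, num_samples=4, num_workers=2, seed=0)
```

Drawing the random origins before dispatching the work keeps the serial and threaded paths reproducible with the same seed, which is one possible way to avoid sharing a random state across threads.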
Issue Analytics
- State:
- Created 2 years ago
- Reactions:1
- Comments:31 (31 by maintainers)
Top GitHub Comments
Thanks for the update, looks great, I think we can have a separate ticket to enhance the thread based loader.
I’m a bit late to the party, but here are a few observations. Multi-threading has a number of problems relating to the GIL, as we know, but often we can route around that by using compiled functions in NumPy, SciPy, PyTorch, etc. Mixing threads and processes will be inefficient regardless, because we typically create as many processes as we have CPU cores (virtual and physical). If we have multiple threads in these subprocesses running a transform pipeline, there will be more threads than CPU cores, and that will lose efficiency through contention. I don’t think the advantages of accessing memory efficiently in threads would overcome that. I would generally suggest using either threads or processes for parallelism, and not mixing them.
The problem here with RandCropByPosNegLabeld is that it is a one-to-many transform, where generating the many with multiple threads may be faster. If it is used on its own this might be the case, but if it is used with multiple processes you could create too many threads. I would think it would be faster to change the transform to be one-to-one, so that you get one cropped image for each input, but give each transform the same input, i.e. as if you had a batch of duplicate images. This might work for particular use cases that would expect one-to-many, but I’m personally not sure how this class is used now, so I can’t say for sure this makes sense.
ThreadDataLoader typically doesn’t benefit from having a buffer size larger than 1, unless the sizes of the buffered objects vary wildly and so take varying amounts of time to generate. I left the option to change the buffer size in the original implementation to allow experimentation. The idea of ThreadDataLoader is to permit a separate thread to read from a data loader it has exclusive access to, so that thread-safety isn’t a problem.
One idea I’ve been meaning to try is a DataLoader using threads instead of processes, which would lack the interprocess communication overhead the current implementation must have, but would rely on using a lot of compiled functions to not get bogged down by the GIL, and on the thread-safety of the source Dataset. This might be entirely unrelated to the problem at hand with the transforms being one-to-many.
I think @Nic-Ma has written something on results that make all this less meaningful but it’s here for us to consider for later.
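For reference, the buffering idea behind ThreadDataLoader described above can be sketched with the standard library alone. This is not the actual implementation: buffered and its buffer_size argument are hypothetical names, and a plain iterator stands in for the data loader. A single background thread is the only reader of the source, so the source does not need to be thread-safe, and a bounded queue plays the role of the buffer.

```python
import queue
import threading


def buffered(source, buffer_size=1):
    """Yield items from source while one background thread reads ahead."""
    buffer = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the source

    def fill():
        # Only this thread touches `source`. Errors raised by the source
        # are not forwarded to the consumer in this simplified sketch.
        for item in source:
            buffer.put(item)  # blocks while the buffer is full
        buffer.put(done)

    threading.Thread(target=fill, daemon=True).start()
    while True:
        item = buffer.get()
        if item is done:
            return
        yield item


batches = list(buffered(iter(range(5)), buffer_size=1))
```

With buffer_size=1 the reader stays at most one item ahead of the consumer, which matches the observation that a larger buffer mostly helps only when item generation times vary a lot.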