composing dataloaders
Hi,
I have a datafetcher where I use two dataloaders in sequence: the first translates one ID into another, and the second fetches the data corresponding to that second ID.
loader1.load(id1).thenCompose(id2 -> loader2.load(id2))
This hangs because dispatchAll() is not called again after loader1 completes. I can work around that by adding that call inside the thenCompose() lambda but then it is called for every id2 which is ugly at the very least.
Is there a better way of doing this?
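To make the hang concrete, here is a minimal, self-contained sketch of the situation. It does not use the real java-dataloader API: `MiniLoader`, its `dispatch()` method, and `backend1` are hypothetical stand-ins invented for illustration, and calling `dispatch()` on both loaders once plays the role of a single `dispatchAll()` round. The assumption is that the first loader's backend answers asynchronously, after that round has already finished.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class ComposedLoadersDemo {
    // Returns {done after one dispatch round, done after a second round}.
    static boolean[] run() {
        // Simulated slow backend for loader1: completed by hand further down.
        CompletableFuture<List<String>> backend1 = new CompletableFuture<>();
        MiniLoader<String, String> loader1 = new MiniLoader<>(keys -> backend1);
        MiniLoader<String, String> loader2 = new MiniLoader<>(keys -> {
            List<String> out = new ArrayList<>();
            for (String k : keys) out.add("data for " + k);
            return CompletableFuture.completedFuture(out);
        });

        // Same shape as the snippet in the question.
        CompletableFuture<String> result =
                loader1.load("id1").thenCompose(id2 -> loader2.load(id2));

        // One round over all loaders; loader2 has nothing queued yet.
        loader1.dispatch();
        loader2.dispatch();

        // loader1's backend answers only now, so the lambda queues id2 on loader2.
        backend1.complete(List.of("id2"));
        boolean doneAfterOneRound = result.isDone();

        // Only another dispatch of loader2 lets the composed future finish.
        loader2.dispatch();
        boolean doneAfterSecondRound = result.isDone();
        return new boolean[] {doneAfterOneRound, doneAfterSecondRound};
    }

    public static void main(String[] args) {
        boolean[] r = run();
        // prints "after one round: false, after second round: true"
        System.out.println("after one round: " + r[0] + ", after second round: " + r[1]);
    }
}

// Hypothetical stand-in for a batching loader; NOT the java-dataloader API.
class MiniLoader<K, V> {
    private final Function<List<K>, CompletableFuture<List<V>>> batchFn;
    private final Map<K, CompletableFuture<V>> pending = new LinkedHashMap<>();

    MiniLoader(Function<List<K>, CompletableFuture<List<V>>> batchFn) {
        this.batchFn = batchFn;
    }

    // Queue a key; its future completes only after a later dispatch().
    CompletableFuture<V> load(K key) {
        return pending.computeIfAbsent(key, k -> new CompletableFuture<>());
    }

    // Send everything queued so far to the batch function; returns the batch size.
    int dispatch() {
        if (pending.isEmpty()) return 0;
        List<K> keys = new ArrayList<>(pending.keySet());
        List<CompletableFuture<V>> futures = new ArrayList<>(pending.values());
        pending.clear();
        batchFn.apply(keys).thenAccept(values -> {
            for (int i = 0; i < futures.size(); i++) {
                futures.get(i).complete(values.get(i));
            }
        });
        return keys.size();
    }
}
```

In this sketch the second `loader2.dispatch()` is what the workaround inside the `thenCompose()` lambda amounts to: something has to trigger another dispatch after the first loader's result arrives.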
Issue Analytics
- State:
- Created 4 years ago
- Reactions:1
- Comments:17 (4 by maintainers)
This is pretty much how the JS tick works for JavaScript data loaders. They can dispatch too early as well, but they never miss composed loaders because eventually control is passed back and the tick will happen.

One thing I will say about the above is that, since DataLoaders are per request, your scheduler queue will grow to the number of concurrent requests * the number of dataloaders per request. It's good that you have a removeRegistry, because otherwise this would get unwieldy quickly with enough load.

I've run into the same limitation, and ScheduledDataLoaderRegistry didn't work for me. I think ScheduledDataLoaderRegistry serves a different use case: to make the overall dispatching strategy less eager by pushing dispatch attempts into the future. This still relies on dispatchAll being called first, which doesn't happen in the scenario with nested loaders.

So my current approach (an ugly hack, really) is to have a separate scheduled task periodically check all in-flight data loaders and forcefully dispatch them if they haven't been dispatched within a preset time window (e.g. 500ms). This works to unstick nested data loaders, at the cost of naively triggering dispatches too early for long-running data loading tasks. I am not sure what the implications of those are, but my API has been working fine so far so I'm happy 😎
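For anyone wanting to try the same idea, a rough sketch of such a periodic check might look like the following. This is not the commenter's actual code and not part of java-dataloader: `Dispatchable`, `DispatchWatchdog`, and `FakeLoader` are hypothetical names, and the sketch assumes each loader can report how many keys are queued and when the oldest one was queued. It also assumes that calling `dispatch()` from the scheduler thread is safe for your loaders, which you would need to verify yourself.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class WatchdogDemo {
    public static void main(String[] args) {
        DispatchWatchdog watchdog = new DispatchWatchdog(500); // 500ms window
        FakeLoader loader = new FakeLoader();
        watchdog.register(loader);
        loader.enqueue(0);                       // a key queued at t=0
        System.out.println(watchdog.sweep(100)); // 0: still inside the window
        System.out.println(watchdog.sweep(600)); // 1: waited too long, forced
        System.out.println(watchdog.sweep(700)); // 0: nothing left to dispatch
    }
}

// Hypothetical view of a loader: only what the watchdog needs to know.
interface Dispatchable {
    int pendingCount();          // keys queued but not yet dispatched
    long oldestPendingMillis();  // when the oldest queued key was added
    void dispatch();
}

class DispatchWatchdog {
    private final List<Dispatchable> loaders = new CopyOnWriteArrayList<>();
    private final long maxWaitMillis;

    DispatchWatchdog(long maxWaitMillis) {
        this.maxWaitMillis = maxWaitMillis;
    }

    void register(Dispatchable loader) { loaders.add(loader); }

    // Unregister when the request ends, or the list grows with every request.
    void unregister(Dispatchable loader) { loaders.remove(loader); }

    // Force-dispatch loaders whose oldest key has waited at least maxWaitMillis.
    int sweep(long nowMillis) {
        int forced = 0;
        for (Dispatchable loader : loaders) {
            if (loader.pendingCount() > 0
                    && nowMillis - loader.oldestPendingMillis() >= maxWaitMillis) {
                loader.dispatch();
                forced++;
            }
        }
        return forced;
    }

    // Run sweep() periodically; the caller shuts the returned executor down.
    ScheduledExecutorService start(long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> sweep(System.currentTimeMillis()),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}

// Fake loader so the sweep logic can be exercised without a real backend.
class FakeLoader implements Dispatchable {
    private int pending = 0;
    private long oldest = 0;
    int dispatchCalls = 0;

    void enqueue(long nowMillis) {
        if (pending++ == 0) oldest = nowMillis;
    }

    public int pendingCount() { return pending; }
    public long oldestPendingMillis() { return oldest; }
    public void dispatch() { pending = 0; dispatchCalls++; }
}
```

Passing the current time into `sweep()` keeps the time-window logic testable without sleeping. The `unregister()` call is the counterpart of the removeRegistry point above: with per-request loaders, anything registered and never removed accumulates under load.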