For how long does the meta batch data loader iterate?
Usually, Python iterators stop when a `StopIteration` exception is raised.
But I saw that the length of the data loader is a strange number, where I expected something effectively unbounded, since it is usually up to the user how many episodes they want to run (each episode usually corresponding to a sampled batch of tasks).
So when does the data loader stop?
The code I am referencing is from the example: https://github.com/tristandeleu/pytorch-meta/blob/master/examples/maml-higher/train.py
```python
# Training loop
with tqdm(dataloader, total=args.num_batches) as pbar:
    for batch_idx, batch in enumerate(pbar):
        model.zero_grad()
```
Is `args.num_batches` the same as the number of episodes?
The weird size I mentioned:

```python
print(len(dataloader))  # prints 446400
```
Top GitHub Comments
Like any vanilla PyTorch dataloader, the dataloader has size `len(dataset) // batch_size`, where `len(dataset)` is the total number of tasks (`C(4112, 5)` for 5-way Omniglot). That `446400` is indeed surprising, because when I tried `print(len(dataloader))` in the maml-higher example, I got `610069224856650`, which looks reasonable.
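For intuition about that magnitude, here is a back-of-the-envelope check. The batch size of 16 is an assumption on my part (it is not stated in this thread), so the result matches the quoted figure in order of magnitude only:

```python
from math import comb

# Rough check of the dataloader length for 5-way Omniglot, assuming
# (hypothetically) 4112 character classes and a batch size of 16.
num_classes = 4112
num_ways = 5
batch_size = 16

len_dataset = comb(num_classes, num_ways)   # each 5-class combination is one task
len_dataloader = len_dataset // batch_size  # vanilla PyTorch: len(dataset) // batch_size

print(f"{len_dataset:,}")     # ~9.8e15 tasks
print(f"{len_dataloader:,}")  # ~6.1e14 batches, same order as 610069224856650
```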
However, since the dataloader is combinatorially large, it is not recommended to loop over the whole of it until the `StopIteration` exception you mention is raised. That is why the example has the `args.num_batches` argument, which loops over `args.num_batches` batches only (it breaks here; see the sketch below).

I am closing this issue because the example is working as intended. Feel free to re-open it if you still get `len(dataloader) == 446400`.
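For reference, here is roughly what that bounded loop looks like, continuing the snippet quoted in the question. This is a sketch using the example's variable names (`dataloader`, `args.num_batches`, `model`), not a verbatim copy of `train.py`; see the linked file for the actual implementation:

```python
# Sketch of the bounded training loop. The names dataloader, args,
# and model come from the example's context and are assumed defined.
with tqdm(dataloader, total=args.num_batches) as pbar:
    for batch_idx, batch in enumerate(pbar):
        # Stop after exactly num_batches batches instead of exhausting
        # the combinatorially large dataloader.
        if batch_idx >= args.num_batches:
            break

        model.zero_grad()
        # ... inner-/outer-loop MAML updates on this batch of tasks ...
```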
The random hash trick I was talking about would be at the level of the `Task` (see also here for the parent class), so it would require rewriting `Sinusoid`/`SinusoidTask`. I don't think there is an easy way to monkeypatch the existing dataset to add the random hash.

If having `seed=None` does the trick for you, that's good! I originally thought that what you wanted to do was orthogonal to not having a fixed seed, which would have required the random hash trick.
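For concreteness, here is a minimal sketch of what the random hash trick could look like at the task level. The class name and attributes below are hypothetical, not torchmeta's API; the point is only that the hash incorporates a per-instance random salt, so anything keyed on `hash(task)` (e.g. seeding a split) treats every sampled task as new:

```python
import random

class SinusoidTaskSketch:
    """Hypothetical stand-in for a rewritten SinusoidTask; not the
    torchmeta class. Each instance carries a random salt so that
    hash(task) differs even for identical (amplitude, phase)."""

    def __init__(self, amplitude: float, phase: float):
        self.amplitude = amplitude
        self.phase = phase
        self._salt = random.getrandbits(64)  # the "random hash" part

    def __hash__(self) -> int:
        return hash((self.amplitude, self.phase, self._salt))

# Two tasks with identical parameters now hash differently, so any
# caching or seeding keyed on hash(task) regenerates them each time.
a = SinusoidTaskSketch(1.0, 0.0)
b = SinusoidTaskSketch(1.0, 0.0)
assert hash(a) != hash(b)  # true with overwhelming probability
```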