
For how long does the meta batch data loader iterate?

See original GitHub issue

Usually in Python, iterators stop when the StopIteration exception is raised.
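
For example, a plain Python iterator built from a generator ends exactly when next() raises StopIteration:

    # Minimal illustration: a for-loop stops when StopIteration is raised.
    def two_items():
        yield 'a'
        yield 'b'

    it = two_items()
    print(next(it))  # 'a'
    print(next(it))  # 'b'
    # next(it) here would raise StopIteration; a for-loop catches it and ends.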

But I saw that the length of the data loader is a strange number. I expected it to be effectively infinite, since it is usually up to the user how many episodes to run, where each episode corresponds to a batch of sampled tasks.

So when does the data loader stop?

Code I am referencing is from the example: https://github.com/tristandeleu/pytorch-meta/blob/master/examples/maml-higher/train.py

    # Training loop
    with tqdm(dataloader, total=args.num_batches) as pbar:
        for batch_idx, batch in enumerate(pbar):
            model.zero_grad()

Is args.num_batches the same as the number of episodes?

The weird size I mentioned:

        print(len(dataloader)) # prints 446400

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 5
  • Comments: 28 (12 by maintainers)

Top GitHub Comments

4 reactions
tristandeleu commented, Jul 7, 2020

Like any vanilla PyTorch dataloader, the dataloader has size len(dataset) // batch_size, where len(dataset) is the total number of tasks (C(4112, 5) for 5-way Omniglot). That 446400 is indeed surprising, because when I tried print(len(dataloader)) in the maml-higher example I got 610069224856650, which looks reasonable.
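
To make the expected size concrete, here is a quick back-of-the-envelope check (the batch size of 16 below is an assumed value for illustration, not necessarily the example's default):

    import math

    # Total number of distinct 5-way tasks over 4112 Omniglot classes
    # (1028 meta-train characters x 4 rotations in the augmented split).
    num_tasks = math.comb(4112, 5)    # 9772996309770512 (needs Python 3.8+)

    batch_size = 16                   # assumption for illustration
    print(num_tasks // batch_size)    # ~6.1e14, same order as 610069224856650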

However, since the dataloader is combinatorially large, it is not recommended to loop over the whole dataloader (and reach the StopIteration exception you mention). That’s why the example has the args.num_batches argument, which loops over only args.num_batches batches (it breaks here).
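
The gist of that pattern, as a minimal sketch (dataloader, train_step, and num_batches are hypothetical stand-ins for the objects in train.py):

    num_batches = 100  # stand-in for args.num_batches

    for batch_idx, batch in enumerate(dataloader):
        train_step(batch)  # stand-in for the MAML inner/outer updates
        if batch_idx >= num_batches:
            break  # stop early; the loop never reaches StopIteration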

I am closing this issue because the example is working as intended. Feel free to re-open it if you still get len(dataloader) == 446400.

1 reaction
tristandeleu commented, Nov 18, 2020

The random hash trick I was talking about would be at the level of the Task (see also here for the parent class), so this would require rewriting the Sinusoid/SinusoidTask. I don’t think there is an easy way to monkeypatch the existing dataset to add the random hash.

If having seed=None does the trick for you, that’s good! I originally thought what you wanted to do was orthogonal to not having a fixed seed, which would have required the random hash trick.
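
For readers following along, here is a library-agnostic sketch of the random hash idea (the class and attribute names are hypothetical, not Torchmeta’s actual API): each regenerated task carries a fresh random identity, so two tasks with identical parameters still compare as distinct:

    import random

    class RandomHashTask:
        # Hypothetical task: its identity comes from a fresh random value,
        # so regenerating a task with the same parameters yields a distinct
        # object as far as hashing and equality are concerned.
        def __init__(self, amplitude, phase):
            self.amplitude = amplitude
            self.phase = phase
            self._id = random.getrandbits(64)  # the "random hash"

        def __hash__(self):
            return self._id

        def __eq__(self, other):
            return isinstance(other, RandomHashTask) and self._id == other._id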

