
Why is epoch_length required by iterable dataset?

See original GitHub issue

I noticed that in version 0.4.0, epoch_length is mandatory for an iterable dataset. I am curious about the rationale behind this, since very often we don't know the length of an iterable dataset beforehand, which is exactly why we use one.
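To make the question concrete, here is a minimal sketch of the situation (not taken from the issue), assuming an Ignite 0.4-style Engine and a DataLoader over an IterableDataset that has no __len__:

```python
import torch
from torch.utils.data import DataLoader, IterableDataset
from ignite.engine import Engine


class Stream(IterableDataset):
    """A stream whose length is not known up front (no __len__)."""

    def __iter__(self):
        # e.g. records read from files, sockets, a generator, ...
        for i in range(100):  # the loader cannot see this number
            yield torch.tensor([float(i)]), torch.tensor([0.0])


loader = DataLoader(Stream(), batch_size=8)  # len(loader) raises TypeError


def train_step(engine, batch):
    x, y = batch
    return 0.0  # placeholder update step


trainer = Engine(train_step)

# Because len(loader) is unavailable, the epoch length cannot be inferred,
# so it has to be passed explicitly (100 samples / batch size 8 -> 13 batches):
trainer.run(loader, max_epochs=2, epoch_length=13)
```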

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 9

Top GitHub Comments

2 reactions
Yura52 commented, Mar 31, 2020

The concept of an "epoch" usually applies only to the training phase. And literally one hour ago I ran into a situation where my validation loader was a finite iterator and I wanted to run validation over this loader simply until it was exhausted. In fact I had to calculate the exact number of batches (and I can think of situations where I couldn't, and what would I do then?) just to make the run method happy. This was inconvenient. However, I don't have a good suggestion for how to fix this; perhaps something like epoch_length='inf' could be supported.
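A rough sketch of the workaround described above; the names (val_loader, evaluator) are illustrative, not from the issue, and a plain list stands in for the finite loader so the snippet runs:

```python
from ignite.engine import Engine

# Stand-in for a finite validation loader whose length is not known up front.
# The real case assumes a loader that can be iterated more than once;
# a one-shot iterator would be consumed by the counting pass below.
val_loader = [([1.0], [0.0]) for _ in range(7)]

evaluator = Engine(lambda engine, batch: None)  # placeholder validation step

# Count the batches once, only to satisfy run()'s epoch_length requirement.
num_batches = sum(1 for _ in val_loader)

evaluator.run(val_loader, max_epochs=1, epoch_length=num_batches)
```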

1 reaction
snie2012 commented, Mar 31, 2020

Essentially, the current implementation loses information about, and control over, how many times the data is actually iterated. IMO, as @vfdev-5 mentioned, determining epoch_length automatically on the first StopIteration would be an ideal solution: epoch_length is helpful for purposes like visualization, but obtaining it without iterating through the data is challenging. I hope ignite can get that feature!
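One way to picture the suggestion of capturing epoch_length on the first StopIteration is a small wrapper like this (purely illustrative, not Ignite's actual implementation):

```python
class CountingLoader:
    """Wraps a re-iterable loader and records its length after the first full pass."""

    def __init__(self, loader):
        self.loader = loader
        self.epoch_length = None  # unknown until one full pass has finished

    def __iter__(self):
        count = 0
        for batch in self.loader:
            count += 1
            yield batch
        # The underlying iterator has just been exhausted (StopIteration),
        # so from here on the epoch length is known and can be reused,
        # e.g. for progress bars or visualization.
        self.epoch_length = count
```

After the first pass, the wrapper's epoch_length attribute holds the number of batches and could be fed into subsequent runs or logging.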

Read more comments on GitHub >

Top Results From Across the Web

  • Concepts — PyTorch-Ignite v0.4.10 Documentation
    By default, epoch length is defined by len(data). However, a user can also manually define the epoch length as a number of... (see the sketch after this list)
  • webdataset PyTorch Model - Model Zoo
    The recommended way of using IterableDataset with DataLoader is to do the batching explicitly in the Dataset. In addition, you need to...
  • torchgeo.samplers - Read the Docs
    For GeoDataset, dataset objects require a bounding box for indexing. ... This data loader will return 256x256 px images, and has an...
  • Iterable dataset exhausts after a single epoch - Stack Overflow
    If I'll go with an iterable-style dataset, I need to create the Dataloader object at every epoch. So after each epoch the new...
  • webdataset 0.1.37 - PyPI
    WebDataset is a PyTorch Dataset (IterableDataset) implementation ... adopt because it does not actually require any kind of data conversion: ...
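The Concepts entry above describes the default and manual epoch length; here is a minimal sketch of that behaviour, assuming a map-style dataset whose len() is known:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from ignite.engine import Engine

dataset = TensorDataset(torch.randn(100, 3), torch.randn(100, 1))
loader = DataLoader(dataset, batch_size=10)  # len(loader) == 10

engine = Engine(lambda engine, batch: None)  # placeholder step

# By default the epoch length would be taken from len(loader), i.e. 10
# iterations per epoch; passing epoch_length overrides it so that 4 batches
# count as one "epoch" here.
engine.run(loader, max_epochs=1, epoch_length=4)
```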
