Why is epoch_length required by iterable dataset?
See original GitHub issue

I noticed that in version 0.4.0, epoch_length is mandatory for iterable datasets. I am curious about the rationale behind it, since very often we don’t know the length of an iterable dataset beforehand, and that’s why we use them.
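For concreteness, a minimal sketch of the situation being asked about (the dataset, step function, and numbers are hypothetical; the Engine API is PyTorch-Ignite v0.4.x):

    import torch
    from torch.utils.data import DataLoader, IterableDataset
    from ignite.engine import Engine

    class Stream(IterableDataset):
        # A stream whose length is unknown ahead of time: no __len__ defined.
        def __iter__(self):
            for _ in range(100):  # in practice this bound may be unknowable
                yield torch.randn(4)

    loader = DataLoader(Stream(), batch_size=None)  # len(loader) raises TypeError

    def train_step(engine, batch):
        return batch.mean().item()

    trainer = Engine(train_step)

    # In v0.4.0, epoch_length had to be given explicitly here, because
    # len(loader) is undefined for an iterable dataset.
    trainer.run(loader, max_epochs=2, epoch_length=100)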
Issue Analytics
- State:
- Created 3 years ago
- Comments: 9
Top Results From Across the Web

Concepts — PyTorch-Ignite v0.4.10 Documentation
By default, epoch length is defined by len(data). However, a user can also manually define the epoch length as a number of ...

webdataset PyTorch Model - Model Zoo
The recommended way of using IterableDataset with DataLoader is to do the batching explicitly in the Dataset. In addition, you need to ... (a generic version of this batching pattern is sketched after this list)

torchgeo.samplers - Read the Docs
For GeoDataset, dataset objects require a bounding box for indexing. ... This data loader will return 256x256 px images, and has an ...

Iterable dataset exhausts after a single epoch - Stack Overflow
If I'll go with iterable-style dataset - I need to create the Dataloader object at every epoch. So after each epoch the new ...

webdataset 0.1.37 - PyPI
WebDataset is a PyTorch Dataset (IterableDataset) implementation ... adopt because it does not actually require any kind of data conversion: ...
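As a sketch of the batching recommendation from the Model Zoo entry above (plain PyTorch, not webdataset’s own API; all names here are hypothetical):

    import torch
    from torch.utils.data import DataLoader, IterableDataset

    class BatchedStream(IterableDataset):
        # Batches are built inside the dataset itself; the DataLoader just
        # forwards them (batch_size=None disables its own collation).
        def __init__(self, source, batch_size):
            self.source = source
            self.batch_size = batch_size

        def __iter__(self):
            buf = []
            for sample in self.source:
                buf.append(sample)
                if len(buf) == self.batch_size:
                    yield torch.stack(buf)
                    buf = []
            if buf:  # emit the final, possibly smaller, batch
                yield torch.stack(buf)

    stream = (torch.randn(4) for _ in range(10))  # any finite iterable
    loader = DataLoader(BatchedStream(stream, batch_size=3), batch_size=None)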
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
The concept of “epoch” can usually be applied only to the training phase. And literally one hour ago I ran into a situation where my validation loader was a finite iterator and I wanted to run validation over this loader just “until it is exhausted”. In fact, I had to calculate the exact number of batches (and I can think of situations where I couldn’t do that; what would I do then?) just in order to make the run method happy. This was inconvenient. However, I don’t have a good suggestion for how to fix this. Perhaps one could support something like epoch_length='inf'.
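A sketch of the workaround that comment describes, assuming a map-style validation set whose size happens to be known (all names are hypothetical):

    import math
    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from ignite.engine import Engine

    batch_size = 64
    val_set = TensorDataset(torch.randn(1000, 4))
    val_iterator = iter(DataLoader(val_set, batch_size=batch_size))  # finite iterator

    def eval_step(engine, batch):
        (x,) = batch
        return x.mean().item()

    evaluator = Engine(eval_step)

    # The exact batch count must be precomputed just to satisfy run();
    # for a stream of unknown size this calculation would be impossible.
    num_batches = math.ceil(len(val_set) / batch_size)
    evaluator.run(val_iterator, max_epochs=1, epoch_length=num_batches)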
Essentially, the current implementation loses the information about, and control over, how many times the data is actually iterated. IMO, as @vfdev-5 mentioned, determining epoch_length automatically on the first StopIteration would be an ideal solution: epoch_length can be helpful for purposes like visualization, but getting that length without iterating through the data is challenging. I hope Ignite can add that feature!
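For illustration, a toy sketch of that proposal (this is not Ignite’s actual implementation; all names are made up): run the first epoch until StopIteration, record the count, and reuse it for subsequent epochs.

    def run_epochs(data, process_fn, max_epochs, epoch_length=None):
        # Measure the epoch length during the first pass, at the moment
        # the iterator raises StopIteration, then reuse it afterwards.
        for _ in range(max_epochs):
            iterator = iter(data)
            iteration = 0
            while epoch_length is None or iteration < epoch_length:
                try:
                    batch = next(iterator)
                except StopIteration:
                    if epoch_length is None:
                        epoch_length = iteration  # learned from the first pass
                    break
                process_fn(batch)
                iteration += 1
        return epoch_length

Note that this only helps when the data can be re-iterated, as a DataLoader can; a one-shot iterator would still be limited to a single epoch. Newer Ignite releases appear to do something along these lines: when epoch_length is not given, it is determined at the point where the data iterator is first exhausted.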