
Running trainer for more than one epoch with torch.utils.data.IterableDataset

See original GitHub issue

πŸ› Bug description

Hello, I don't know if I am missing something, but I am trying to run my trainer for more than one epoch, where each epoch has 5 iterations. However, a new iterator is never initialized over the dataloader, and I get the warning below.

tensor([1])
Epoch [1/3]: [1/5]  20%|██▌       | [00:00<00:00]
/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ignite/contrib/handlers/base_logger.py:124: UserWarning: Provided metric name 'loss' is missing in engine's state metrics: []
  "in engine's state metrics: {}".format(name, list(engine.state.metrics.keys()))
tensor([4])
Epoch [1/3]: [1/5]  20%|██▌       | [00:00<00:00]
tensor([9])
Epoch [1/3]: [2/5]  40%|█████     | [00:00<00:00]
tensor([16])
Epoch [1/3]: [3/5]  60%|███████▌  | [00:00<00:00]
tensor([25])
Epoch [1/3]: [5/5] 100%|██████████| [00:00<00:00]
/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ignite/engine/engine.py:465: UserWarning: Data iterator can not provide data anymore but required total number of iterations to run is not reached. Current iteration: 5 vs Total iterations to run : 15
  self.state.iteration, self.state.epoch_length * self.state.max_epochs
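
As an aside, the first UserWarning above appears because the ProgressBar is attached with metric_names=["loss"] while no metric named "loss" is ever registered on the engine. A minimal sketch of one way to satisfy it, assuming update() were changed to return a loss value (this is not part of the original report):

from ignite.metrics import RunningAverage

# Hypothetical: expose the value returned by update() as a running average
# under the name "loss" so the ProgressBar can display it.
RunningAverage(output_transform=lambda x: x).attach(trainer, "loss")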

Now, if I do not use the ignite package and instead write the epoch loop myself, it works fine. Here is a code sample pertaining to my use case.

import torch
from ignite.engine import Engine, Events
from ignite.contrib.handlers import ProgressBar

class DatasetUtils:
    def __init__(self, datapoints):
        self.datapoints = datapoints

    def load_data(self):
        for d in self.datapoints:
            yield d

    def create_examples(self, data):
        for datapoint in data:
            yield datapoint**2

    def __len__(self):
        return len(self.datapoints)


class MyIterableDataset(torch.utils.data.IterableDataset):
    def __init__(self, helper):
        self.helper = helper

    def __iter__(self):
        data_iter = self.helper.load_data()
        for example in self.helper.create_examples(data_iter):
            yield example

    def __len__(self):
        return len(self.helper)

datapoints = [1, 2, 3, 4, 5]
helper = DatasetUtils(datapoints)

ds = MyIterableDataset(helper)

data_loader = torch.utils.data.DataLoader(ds, num_workers=1, batch_size=1)


def update(engine, batch):
    print(batch)

# Using ignite 
trainer = Engine(update)
pbar = ProgressBar(persist=True)
pbar.attach(trainer, metric_names=["loss"])
trainer.run(map(lambda x: x, data_loader), epoch_length=5//1, max_epochs=3)


# Not using ignite
for epoch in range(3):
    data_iter = map(lambda x: x, data_loader)
    while True:
        try:
            batch = next(data_iter)
            print(batch)
        except StopIteration:
            break
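
For context, the manual loop works because map(lambda x: x, data_loader) is rebuilt at the start of every epoch, whereas the ignite run above receives a single map object. A map object is a one-shot iterator: once the first pass exhausts it, later passes get nothing, which is what the engine warning reports. A minimal sketch of that behavior:

it = map(lambda x: x, [1, 2, 3])
print(list(it))  # [1, 2, 3] -- the first pass consumes the iterator
print(list(it))  # []        -- a map object cannot be iterated twice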

Thank you

Environment

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

sdesrozis commented on Apr 21, 2020 (2 reactions)

@vfdev-5 You are so fast 😃

vfdev-5 commented on Apr 21, 2020 (2 reactions)

@bhedayat you can restart the iterator in an epoch-completed handler, as in your code snippet:

# Using ignite 
trainer = Engine(update)
pbar = ProgressBar(persist=True)
pbar.attach(trainer, metric_names=["loss"])

@trainer.on(Events.ITERATION_COMPLETED(every=5))
def restart_dataloader():
    print(trainer.state.iteration, "restart_dataloader")
    trainer.state.dataloader = map(lambda x: x, data_loader)

trainer.run(map(lambda x: x, data_loader), epoch_length=5, max_epochs=3)
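
For completeness, a sketch of the same idea hooked to Events.EPOCH_COMPLETED instead of a fixed iteration count; it assumes, as in the snippet above, that replacing trainer.state.dataloader before the next epoch pulls from the exhausted iterator is enough for the engine to pick up the fresh one:

@trainer.on(Events.EPOCH_COMPLETED)
def restart_dataloader():
    # Rebuild the one-shot map iterator so the next epoch has data to consume.
    trainer.state.dataloader = map(lambda x: x, data_loader)

trainer.run(map(lambda x: x, data_loader), epoch_length=5, max_epochs=3)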