
Use two separate dataloaders

See original GitHub issue

❓ Questions and Help

Before asking:

  1. search the issues.
  2. search the docs.

Here is the relevant issue that I have looked up.

What is your question?

I have a labeled and an unlabeled dataset that I am using for a semi-supervised segmentation problem.

Here is what I want to implement in the training loop:

  1. Train an epoch on the labeled dataset. (alternatively, a batch)
  2. Train an epoch on the unlabeled dataset. (alternatively, a batch)

What is the best way to do it? I modified the code in the issue listed here.

Code

Please see below.

What have you tried?

    from torch.utils.data import DataLoader

    # Inside the LightningModule:
    def train_dataloader(self):
        # One loader per dataset; both shuffle and drop the last incomplete batch.
        labeled_dataloader = DataLoader(self.labeled_dataset,
                                        batch_size=self.batch_size,
                                        num_workers=4,
                                        shuffle=True, drop_last=True)

        unlabeled_dataloader = DataLoader(self.unlabeled_dataset,
                                          batch_size=self.batch_size,
                                          num_workers=4,
                                          shuffle=True, drop_last=True)

        return {'labeled': labeled_dataloader, 'unlabeled': unlabeled_dataloader}
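
If the installed Lightning version supports returning a dict of loaders from train_dataloader (the documentation result further down notes that Lightning combines batches from multiple training DataLoaders automatically), the batch handed to training_step carries one sub-batch per key. A minimal sketch, assuming labeled batches are (image, mask) pairs and that supervised_loss / unsupervised_loss are hypothetical helpers defined on the module:

    def training_step(self, batch, batch_idx):
        # With a dict of loaders, `batch` is a dict with the same keys.
        images, masks = batch['labeled']        # assumed (image, mask) pairs
        unlabeled_images = batch['unlabeled']

        sup_loss = self.supervised_loss(self(images), masks)         # hypothetical helper
        unsup_loss = self.unsupervised_loss(self(unlabeled_images))  # hypothetical helper

        # Weighting the unsupervised term is common; equal weights are used here for brevity.
        return sup_loss + unsup_loss

Note that this consumes one labeled and one unlabeled batch in the same step rather than alternating whole epochs, so it only approximates the schedule described above.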

What’s your environment?

  • OS: [e.g. iOS, Linux, Win]
  • Packaging [e.g. pip, conda]
  • Version [e.g. 0.5.2.1]

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 11 (7 by maintainers)

Top GitHub Comments

1 reaction
nazim1021 commented, Sep 6, 2020

Hi @awaelchli, thanks a lot for this solution. I have a similar problem where I am trying to use an unlabeled dataset for semi-supervised classification; however, I am having trouble figuring out how we can add up the two losses and return them as one after each epoch, since here the losses are being calculated alternately.
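
One way to report a single combined figure per epoch (a sketch only, not the solution worked out in this thread; it assumes a pre-2.0 Lightning version where training_epoch_end is available, and that each training_step returns a dict carrying a hypothetical 'kind' tag for bookkeeping):

    def training_epoch_end(self, outputs):
        # outputs collects whatever training_step returned, one entry per batch.
        # Assumes each entry looks like {'loss': ..., 'kind': 'labeled' or 'unlabeled'};
        # the 'kind' tag is illustrative bookkeeping, not a Lightning convention.
        labeled_total = sum(float(o['loss']) for o in outputs if o.get('kind') == 'labeled')
        unlabeled_total = sum(float(o['loss']) for o in outputs if o.get('kind') == 'unlabeled')
        self.log('train/labeled_epoch_loss', labeled_total)
        self.log('train/unlabeled_epoch_loss', unlabeled_total)
        self.log('train/combined_epoch_loss', labeled_total + unlabeled_total)

Logging each step's loss with self.log(..., on_epoch=True) and letting Lightning aggregate is a simpler alternative when a per-part breakdown is not needed.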

0 reactions
awaelchli commented, Jul 15, 2021

I created an issue for this here: #8435

Read more comments on GitHub >

Top Results From Across the Web

Two DataLoaders from two different datasets within the same ...
So I am trying to have two data loaders emit a batch of data each within the training loop. Like so: data_loader1 =...
Read more >
How to use two seperate dataloaders together? - Stack Overflow
So I want to know, how should I go about this, divide them smaller batches using slicing or should I use two separate...
Read more >
How to use multiple train dataloaders with different lengths
I'm training with a strategy of alternate batches of 2 datasets. I.e., 1 batch of images from dataset A only, then a batch...
Read more >
Using multiple dataloaders in the training_step? #2457 - GitHub
For training, the best way to use multiple-dataloaders is to create a Dataloader class which wraps both your dataloaders.
Read more >
Managing Data — PyTorch Lightning 1.8.5.post0 documentation
In the training loop, you can pass multiple DataLoaders as a dict or list/tuple, and Lightning will automatically combine the batches from different...
Read more >
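
The #2457 result above suggests wrapping both loaders in a single iterable. A rough sketch of that idea (the class name and the zip-based pairing are illustrative, not taken from the linked issue):

    class PairedLoader:
        """Yields (labeled_batch, unlabeled_batch) pairs from two existing DataLoaders."""

        def __init__(self, labeled_loader, unlabeled_loader):
            self.labeled_loader = labeled_loader
            self.unlabeled_loader = unlabeled_loader

        def __iter__(self):
            # zip stops with the shorter loader; cycle the smaller one if every
            # sample of both datasets must be seen each epoch.
            return zip(self.labeled_loader, self.unlabeled_loader)

        def __len__(self):
            return min(len(self.labeled_loader), len(self.unlabeled_loader))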
