How to get test predictions (and other non-scalar metrics)?

See original GitHub issue

❓ Questions and Help

What is your question?

If I have a trained model and I want to test it using Trainer.test(), how do I get the actual predictions of the model on the test set?

I tried logging the predictions and writing a Callback to get the logs at test end, but it seems like I can only log scalar Tensors in the dictionary returned by my model’s test_end().

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

6 reactions
awaelchli commented, Mar 17, 2020

That’s not the same as logging; that wasn’t clear in your original question. You will have to collect your predictions in test_step in a variable like self.predictions (a list, for example). Then, after you call trainer.test(), you can access model.predictions in your notebook. What do you think?
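
A minimal sketch of that suggestion (the module, data, and attribute names below are illustrative, not from this thread):

import torch
from torch.utils.data import DataLoader, TensorDataset

from pytorch_lightning import LightningModule, Trainer


class PredictionCollector(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 2)
        self.predictions = []  # filled during testing instead of being logged

    def test_step(self, batch, batch_idx):
        x, y = batch
        preds = self.layer(x).argmax(dim=-1)
        self.predictions.append(preds.cpu())  # keep predictions off the GPU


if __name__ == '__main__':
    ds = TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,)))
    model = PredictionCollector()
    Trainer(logger=False).test(model, DataLoader(ds, batch_size=16))
    print(torch.cat(model.predictions))  # the actual test-set predictions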

1 reaction
hankyul2 commented, Jan 12, 2022

I know this is an old question, but I think the following could serve as a workaround.

import os

import torch
from torch.utils.data import DataLoader

from torchvision import models, transforms
from torchvision.datasets import CIFAR10

from pytorch_lightning import LightningModule, LightningDataModule, Trainer

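# Make CUDA device indices follow PCI bus order so they match nvidia-smi numbering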
os.environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID'


class CIFAR(LightningDataModule):
    def __init__(self, img_size=32, batch_size=32):
        super().__init__()
        self.img_size = img_size if isinstance(img_size, tuple) else (img_size, img_size)
        self.batch_size = batch_size

        self.test_transforms = transforms.Compose([
            transforms.Resize(self.img_size),
            transforms.CenterCrop(self.img_size),
            transforms.ToTensor(),
            transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
        ])

    def prepare_data(self) -> None:
        CIFAR10(root='data', train=True, download=True)
        CIFAR10(root='data', train=False, download=True)
    
    def setup(self, stage=None):
        self.test_ds = CIFAR10(root='data', train=False, download=False, transform=self.test_transforms)

    def test_dataloader(self):
        return DataLoader(self.test_ds, num_workers=4, batch_size=self.batch_size, shuffle=False)


class BasicModule(LightningModule):
    def __init__(self):
        super().__init__()
        self.model = models.resnet18(num_classes=10, pretrained=False)

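    # Return the per-batch labels and predicted classes instead of logging a scalar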
    def test_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self.model(x)
        return y, y_hat.argmax(dim=-1)

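    # Aggregate all batches into a 10x10 confusion matrix (rows: true label, cols: prediction)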
    def test_epoch_end(self, outputs):
        results = torch.zeros((10, 10)).to(self.device)
        for output in outputs:
            for label, prediction in zip(*output):
                results[int(label), int(prediction)] += 1
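        # Sum the per-rank confusion matrices onto rank 0 (DDP runs one process per GPU)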
        torch.distributed.reduce(results, 0, torch.distributed.ReduceOp.SUM)
        acc = results.diag().sum() / results.sum()
        if self.trainer.is_global_zero:
            self.log("test_metric", acc, rank_zero_only=True)
            self.trainer.results = results
        
    
if __name__ == '__main__':
    data = CIFAR(batch_size=512)
    model = BasicModule()
    trainer = Trainer(max_epochs=2, gpus='0,1', strategy="ddp", precision=16)
    test_results = trainer.test(model, data)
    if trainer.is_global_zero:
        print(test_results)
        print(trainer.results)
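
A note on the snippet above: torch.distributed.reduce places the summed confusion matrix only on rank 0, so trainer.results is meaningful only in the global-zero process; the is_global_zero checks keep the other DDP ranks from logging or reading partial data. Also, newer Lightning releases (2.x) removed the test_epoch_end hook, so on those versions the aggregation would move to on_test_epoch_end with the per-batch outputs collected manually.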