Add testing with PyTorch 1.11 on GPUs in CI

🚀 Feature

We’ve decided to test against both the PyTorch LTS and stable releases (1.8 and 1.11 as of now) in CI, and we’ve already seen some issues arise while trying to enable this in #12373.

TODO

Known issues with PL with PyTorch 1.11

#12846
#12860
Fix an issue with fitting a model initialised in init_meta_context #12870
Fix an issue with DDP comm tests with some newer PyTorch versions #12878
Fix an issue with inference mode with FSDP

Motivation

To test, in CI, new features that are only available in newer PyTorch versions, e.g. meta init and native FSDP.
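
As a rough illustration of what running one test suite against both 1.8 and 1.11 involves, here is a minimal sketch of a version gate that a test could consult before exercising a feature that only exists in newer releases. The helper names `parse_release` and `meets_minimum` are hypothetical and not part of Lightning or PyTorch. The sketch uses only the standard library and treats pre-release and local suffixes (such as `rc3` or `+cu113`) as the plain release, which is a simplifying assumption. A real test suite would more likely rely on `packaging.version` together with a pytest skip marker.

```python
import re


def parse_release(version: str) -> tuple:
    """Return the leading numeric components, e.g. '1.11.0+cu113' -> (1, 11, 0).

    Assumption: suffixes such as 'rc3', '.dev20220506' or '+cu113' are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """True if `installed` is at least `minimum`, compared numerically."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so that '1.11' and '1.11.0' compare equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Hypothetical usage inside a test module: skip when the feature is unavailable.
# In practice the first argument would be torch.__version__.
if meets_minimum("1.8.1", "1.11"):
    print("run tests that need PyTorch >= 1.11")
else:
    print("skip tests that need PyTorch >= 1.11")
```

The numeric comparison matters here because a plain string comparison would rank "1.8" above "1.11", so an LTS job would wrongly try to run tests written for the newer release.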

Pitch

Use the following image:

pytorchlightning/pytorch_lightning:base-cuda-py3.7-torch1.11

Alternatives

n/a

Additional context

n/a

cc @carmocca @akihironitta @borda

Issue Analytics

Created a year ago
Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction

carmocca commented, May 6, 2022

1.11 is fine (already released)

We removed nightly testing because it was too flaky, which led everybody to ignore the job. We only enable it when there’s a release candidate upstream.

1 reaction

Borda commented, May 6, 2022

Would it be an option to have PyTorch 1.12 (nightly) testing, too? For example, #12985 needs 1.12 for adapting FSDP native.

Do you mean on CPU only, or also on GPU? To be honest, I’m not sure (or don’t remember) why we dropped it, so I’m fine with adding it for CPU… cc: @carmocca
