Add testing with PyTorch 1.11 on GPUs in CI

See original GitHub issue

🚀 Feature

We’ve decided to test against both the PyTorch LTS and stable releases (1.8 and 1.11 as of now) in CI, and we’ve already seen some issues arise while trying to enable this in #12373.

TODO

Known issues with PL with PyTorch 1.11

  • #12846
  • #12860
  • Fix an issue with fitting a model initialised in init_meta_context #12870
  • Fix an issue with DDP comm tests with some newer PyTorch versions #12878
  • Fix an issue with inference mode with FSDP

Motivation

To test in CI new features that are only available in newer PyTorch versions, e.g. meta init and native FSDP.
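As a rough illustration of gating tests on the installed PyTorch version, here is a minimal sketch using only the standard library. The helper names `release_tuple` and `meets_min_version` are hypothetical and are not part of Lightning or PyTorch. The sketch compares versions numerically, which matters here because plain string comparison orders 1.8 after 1.11.

```python
import re


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release of a version string as a tuple of ints.

    Suffixes such as '+cu113', 'rc3' or '.dev20220506' are ignored, so a
    release candidate compares equal to its final release.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release found in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_min_version(installed: str, required: str) -> bool:
    """Return True if `installed` is at least `required`, compared numerically."""
    have, need = release_tuple(installed), release_tuple(required)
    # Pad the shorter tuple with zeros so that "1.11" and "1.11.0" compare equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


# Plain string comparison is misleading: "1.8.0" sorts after "1.11.0".
print("1.8.0" >= "1.11.0")                        # True (lexicographic)
print(meets_min_version("1.8.0", "1.11"))         # False
print(meets_min_version("1.11.0+cu113", "1.11"))  # True
```

In a test suite, one option would be to pass `torch.__version__` as `installed` and use the result in a `pytest.mark.skipif` condition, so that tests for newer features are skipped on the LTS job rather than failing.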

Pitch

Use the following image:

pytorchlightning/pytorch_lightning:base-cuda-py3.7-torch1.11
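If the CI job is parameterised over PyTorch versions, the image reference could be derived from the version instead of being hard-coded. The sketch below assumes other images follow the same `base-cuda-py<X>-torch<Y>` tag pattern as the one above. The `ci_image` helper is hypothetical, and whether a `torch1.8` tag is actually published is an assumption.

```python
# Repository name taken from the image reference quoted above.
IMAGE_REPO = "pytorchlightning/pytorch_lightning"


def ci_image(torch_version: str, python_version: str = "3.7") -> str:
    """Build an image reference following the 'base-cuda-py<X>-torch<Y>' pattern."""
    return f"{IMAGE_REPO}:base-cuda-py{python_version}-torch{torch_version}"


# 1.8 (LTS) and 1.11 (stable) are the two versions named in the issue.
# Assumption: a matching torch1.8 tag exists in the same repository.
for torch_version in ("1.8", "1.11"):
    print(ci_image(torch_version))
```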

Alternatives

n/a

Additional context

n/a


cc @carmocca @akihironitta @borda

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
carmocca commented, May 6, 2022

1.11 is fine (already released)

We removed nightly testing because it was too flaky, which led everybody to ignore the job. We only enable it when there’s a release candidate upstream.

1 reaction
Borda commented, May 6, 2022

Would it be an option to have PyTorch 1.12 (nightly) testing, too? For example, #12985 needs 1.12 for adapting FSDP native.

Do you mean on CPU or also on GPU? To be honest, I’m not sure, or don’t remember, why we dropped it, so I’m fine with adding it for CPU… cc: @carmocca
