[LightningLite] Gradient Clipping with AMP
🚀 Feature
Currently, there doesn’t seem to be a way to do gradient clipping when using AMP with LightningLite. This is because the grad scaler logic lives inside the NativeMixedPrecisionPlugin, and there doesn’t seem to be any way to reach or change it from LightningLite.
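For context, this is the pattern plain PyTorch requires when clipping gradients under native AMP: the scaler has to unscale the gradients before clipping, which is exactly the step that is hidden behind the precision plugin. A minimal, self-contained sketch (the model and data here are toy placeholders):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

model = torch.nn.Linear(10, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()
loader = DataLoader(TensorDataset(torch.randn(64, 10), torch.randn(64, 1)), batch_size=8)

for x, y in loader:
    x, y = x.cuda(), y.cuda()
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = torch.nn.functional.mse_loss(model(x), y)
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)  # gradients must be unscaled before clipping
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    scaler.step(optimizer)      # skips the step if unscaled grads contain inf/NaN
    scaler.update()
```

With Lite, the `GradScaler` is owned by the plugin, so the user has no handle on it to call `unscale_` themselves.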
Motivation
Gradient clipping is widely used across deep learning, so having some way to do it with Lite is important.
Pitch
I see that the NativeMixedPrecisionPlugin has functionality for this for LightningModules, but not for Lite models.
The method used for LightningModules (overriding configure_gradient_clipping) – even if extended to work with Lite modules – also doesn’t seem to be in the spirit of Lite (though it would be better than nothing); see the sketch below.
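For comparison, this is roughly what the existing LightningModule route looks like. A sketch only: the class is a placeholder and the hook signature shown follows the 1.x API, which may differ between Lightning versions.

```python
import pytorch_lightning as pl

class LitModel(pl.LightningModule):
    # training_step, configure_optimizers, etc. omitted for brevity

    # Hook signature per the 1.x API; newer versions may drop optimizer_idx.
    def configure_gradient_clipping(
        self, optimizer, optimizer_idx, gradient_clip_val=None, gradient_clip_algorithm=None
    ):
        # clip_gradients() is routed through the active precision plugin,
        # so gradients are unscaled correctly when AMP is enabled.
        self.clip_gradients(optimizer, gradient_clip_val=1.0, gradient_clip_algorithm="norm")
```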
If you enjoy Lightning, check out our other projects! ⚡
- Metrics: Machine learning metrics for distributed, scalable PyTorch applications.
- Lite: enables pure PyTorch users to scale their existing code on any kind of device while retaining full control over their own loops and optimization logic.
- Flash: The fastest way to get a Lightning baseline! A collection of tasks for fast prototyping, baselining, fine-tuning, and solving problems with deep learning.
- Bolts: Pretrained SOTA Deep Learning models, callbacks, and more for research and production with PyTorch Lightning and PyTorch.
- Lightning Transformers: Flexible interface for high-performance research using SOTA Transformers leveraging PyTorch Lightning, Transformers, and Hydra.
cc @borda @carmocca @justusschock @awaelchli @akihironitta @rohitgr7
Issue Analytics
- Created a year ago
- Comments: 5 (5 by maintainers)
I wouldn’t introduce any hooks to Lite, but we could expose a clip_gradients function that always works with the selected Lite configuration.

I think providing a Lite.clip_gradients() could work. This could even be useful for manual optimization from within the LightningModule.
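A rough sketch of how such a method might be used from a Lite training loop. Note that Lite.clip_gradients() does not exist at the time of writing; its name and arguments below are only an illustration of the proposal, not an actual API.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning.lite import LightningLite

class MyLite(LightningLite):
    def run(self):
        model = torch.nn.Linear(10, 1)
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
        model, optimizer = self.setup(model, optimizer)

        dataset = TensorDataset(torch.randn(64, 10), torch.randn(64, 1))
        dataloader = self.setup_dataloaders(DataLoader(dataset, batch_size=8))

        for x, y in dataloader:
            optimizer.zero_grad()
            loss = torch.nn.functional.mse_loss(model(x), y)
            self.backward(loss)
            # Hypothetical API from this issue (not implemented): would unscale
            # gradients under AMP and clip them according to the selected
            # precision/strategy configuration.
            self.clip_gradients(model, optimizer, clip_val=1.0)
            optimizer.step()

if __name__ == "__main__":
    MyLite(accelerator="gpu", devices=1, precision=16).run()
```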