Do we need to set mixed-precision explicitly or is it handled if tensor cores available?

I am following your awesome guide on timm: https://towardsdatascience.com/getting-started-with-pytorch-image-models-timm-a-practitioners-guide-4e77b4bf9055.

I am running training on an A100-based VM, which should support mixed-precision training. Does the Trainer from PyTorch Accelerated take care of that automatically?

Issue Analytics

State:
Created 2 years ago
Comments:6 (3 by maintainers)

Top GitHub Comments

sayakpaul commented, Mar 16, 2022

Thanks so much for being thorough with your explanations. And yes, it makes sense to use config files to launch training whenever possible.

Chris-hughes10 commented, Mar 16, 2022

No problem. If you would like to enable mixed precision while launching with python main.py, you can do so in two ways:

1. The easiest is to set the environment variable USE_FP16 to ‘true’ (or ‘True’, it doesn’t matter). This is what the launcher does for you based on the value in your config file.
2. Hard-code the value in the Trainer so that it always uses mixed precision (this is not really recommended though).
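
Option 1 can also be applied from inside the script instead of from the shell. The following is only a minimal sketch: it assumes the variable is read at the moment the Trainer builds its accelerator, so it has to be set before the Trainer is constructed. The helper name enable_fp16_via_env is hypothetical and is not part of pytorch-accelerated or accelerate.

```python
import os


def enable_fp16_via_env(environ=os.environ):
    """Set USE_FP16 so that a Trainer created afterwards can pick it up.

    Hypothetical helper for illustration; `environ` is injectable so the
    function can be exercised without touching the real environment.
    """
    environ["USE_FP16"] = "true"
    return environ["USE_FP16"]


if __name__ == "__main__":
    enable_fp16_via_env()
    # Assumption: the Trainer is constructed only after this point, e.g.
    # trainer = Trainer(model=..., loss_func=..., optimizer=...)
```

From a POSIX shell, the same effect should be achievable for a single run with USE_FP16=true python main.py.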

If you want to go ahead with option 2, you will need to update the Trainer’s accelerator object, which is what handles moving data between devices. You can override the _create_accelerator method to do this. Here is an example of how this would look:

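from accelerate import Accelerator
from pytorch_accelerated import Trainer


class Fp16Trainer(Trainer):
    """Trainer that always uses mixed precision, regardless of USE_FP16."""

    def _create_accelerator(self):
        # Build the accelerator with fp16 forced on.
        # Note: newer accelerate releases deprecate fp16=True in favour of
        # mixed_precision="fp16"; adjust to match your installed version.
        return Accelerator(fp16=True)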

That seems like a lot of effort for such a small change though, so it wasn’t really the intended approach for this!
