
precision parameter in Trainer does not accept "16" passed as str

See original GitHub issue

Bug description

When you try to create a Trainer with an invalid value for precision, the error message lists the allowed values:

MisconfigurationException("Precision '<wrong value>' is invalid. Allowed precision values: ('16', '32', '64', 'bf16')")

This indicates that the values are supposed to be strings. However, when I then pass "16" as the value, it fails with:

RuntimeError('No precision set')

Only the value 16 passed as an integer works right now. This causes problems with some configuration frameworks and hyperparameter optimization libraries, which often cannot handle union types (see the sketch below).
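
To illustrate the typing problem, a minimal sketch (the TrainerConfig dataclass is hypothetical, not part of Lightning). Schema-based configuration tools usually require one concrete type per field, so the union type of the precision argument cannot be declared directly:

from dataclasses import dataclass
from typing import Union

@dataclass
class TrainerConfig:
    # Many schema-based config tools reject Union fields and require a
    # single concrete type. Declaring str here matches the error message
    # above but currently fails at runtime; declaring int rules out "bf16".
    precision: Union[int, str] = 16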

How to reproduce the bug

import pytorch_lightning as pl
pl.Trainer(precision="16")

Error messages and logs

File ~/myenv/python3.9/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:226, in AcceleratorConnector.__init__(self, devices, num_nodes, accelerator, strategy, plugins, precision, amp_type, amp_level, sync_batchnorm, benchmark, replace_sampler_ddp, deterministic, auto_select_gpus, num_processes, tpu_cores, ipus, gpus)
    223 self._init_strategy()
    225 # 5. Instantiate Precision Plugin
--> 226 self.precision_plugin = self._check_and_init_precision()
    228 # 6. Instantiate Strategy - Part 2
    229 self._lazy_init_strategy()

File ~/myenv/python3.9/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:735, in AcceleratorConnector._check_and_init_precision(self)
    732         self._amp_level_flag = self._amp_level_flag or "O2"
    733         return ApexMixedPrecisionPlugin(self._amp_level_flag)
--> 735 raise RuntimeError("No precision set")

RuntimeError: No precision set

Environment

  • CUDA:
    • GPU: None
    • available: False
    • version: None
  • Lightning:
    • lightning-utilities: 0.3.0
    • pytorch-lightning: 1.8.3.post1
    • torch: 1.13.0
    • torchaudio: 0.13.0
    • torchmetrics: 0.10.3
    • torchvision: 0.14.0
  • System:
    • OS: Darwin
    • architecture:
      • 64bit
    • processor: arm
    • python: 3.9.15
    • version: Darwin Kernel Version 21.6.0

More info

Also checked on the newest pytorch_lightning, 1.8.4.post0.

The following snippet works fine:

import pytorch_lightning as pl
pl.Trainer(precision=16)
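
A possible workaround until strings are accepted (a sketch only; precision_str is a hypothetical variable standing in for a value read from a config) is to coerce numeric strings back to int before constructing the Trainer:

import pytorch_lightning as pl

precision_str = "16"  # e.g. read from a YAML config or the CLI
# Coerce the numeric strings "16"/"32"/"64" to int; keep "bf16" as a string.
precision = int(precision_str) if precision_str.isdigit() else precision_str
pl.Trainer(precision=precision)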

cc @borda @carmocca @justusschock @awaelchli

Issue Analytics

  • State: open
  • Created: 9 months ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
pkubik commented, Dec 13, 2022

What library are you working with?

We actually tried to use this with Hydra, with the schema defined in a dataclass. We could probably find some nasty workaround.

I do not claim that my case is sufficient justification to make a decision here 😃. I just noticed this inconsistency in the error message and thought that accepting str was the intended behavior.
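
For context, a minimal sketch of the Hydra-style setup described above (the TrainConfig dataclass and its default value are assumptions, not the commenter's actual code):

from dataclasses import dataclass
from omegaconf import OmegaConf

@dataclass
class TrainConfig:
    # Structured configs validate against the declared field type, so
    # precision must be a single type; values parsed from YAML or the
    # command line arrive as strings.
    precision: str = "16"

cfg = OmegaConf.structured(TrainConfig)
# pl.Trainer(precision=cfg.precision)  # currently raises RuntimeError("No precision set")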

0 reactions
carmocca commented, Dec 13, 2022

We can also consider adding the string values as accepted types. In fact, we wanted to do this in Lightning Lite already. cc @carmocca

Yes. A contribution is welcome!
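
One way such support could look (a sketch only, not Lightning's actual implementation; _normalize_precision and _ALLOWED_PRECISION are hypothetical names):

_ALLOWED_PRECISION = (16, 32, 64, "bf16")

def _normalize_precision(value):
    # Map the string forms "16"/"32"/"64" to their int equivalents;
    # "bf16" stays a string.
    if isinstance(value, str) and value.isdigit():
        value = int(value)
    if value not in _ALLOWED_PRECISION:
        raise ValueError(
            f"Precision {value!r} is invalid."
            f" Allowed precision values: {_ALLOWED_PRECISION}"
        )
    return value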


