
precision parameter in Trainer does not accept "16" passed as str

See original GitHub issue

Bug description

When you try to create a Trainer with an invalid value for precision, the error message lists the allowed values:

MisconfigurationException("Precision '<wrong value>' is invalid. Allowed precision values: ('16', '32', '64', 'bf16')")

This indicates that the values are supposed to be strings. However, when I then pass "16" as the value, it fails with:

RuntimeError('No precision set')

Only the value 16 passed as an integer works right now. This causes problems with some configuration frameworks and hyperparameter optimization libraries, which often cannot handle union types (see the sketch below).
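
To illustrate the typing problem, a minimal sketch (the TrainerConfig dataclass is hypothetical, not part of Lightning). Schema-based configuration tools usually require one concrete type per field, so the union type of the precision argument cannot be declared directly:

from dataclasses import dataclass
from typing import Union

@dataclass
class TrainerConfig:
    # Many schema-based config tools reject Union fields and require a
    # single concrete type. Declaring str here matches the error message
    # above but currently fails at runtime; declaring int rules out "bf16".
    precision: Union[int, str] = 16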

How to reproduce the bug

import pytorch_lightning as pl
pl.Trainer(precision="16")

Error messages and logs

File ~/myenv/python3.9/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:226, in AcceleratorConnector.__init__(self, devices, num_nodes, accelerator, strategy, plugins, precision, amp_type, amp_level, sync_batchnorm, benchmark, replace_sampler_ddp, deterministic, auto_select_gpus, num_processes, tpu_cores, ipus, gpus)
    223 self._init_strategy()
    225 # 5. Instantiate Precision Plugin
--> 226 self.precision_plugin = self._check_and_init_precision()
    228 # 6. Instantiate Strategy - Part 2
    229 self._lazy_init_strategy()

File ~/myenv/python3.9/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:735, in AcceleratorConnector._check_and_init_precision(self)
    732         self._amp_level_flag = self._amp_level_flag or "O2"
    733         return ApexMixedPrecisionPlugin(self._amp_level_flag)
--> 735 raise RuntimeError("No precision set")

RuntimeError: No precision set

Environment

  • CUDA:
    • GPU: None
    • available: False
    • version: None
  • Lightning:
    • lightning-utilities: 0.3.0
    • pytorch-lightning: 1.8.3.post1
    • torch: 1.13.0
    • torchaudio: 0.13.0
    • torchmetrics: 0.10.3
    • torchvision: 0.14.0
  • System:
    • OS: Darwin
    • architecture:
      • 64bit
    • processor: arm
    • python: 3.9.15
    • version: Darwin Kernel Version 21.6.0

More info

Also checked on the newest pytorch_lightning, 1.8.4.post0.

The following snippet works fine:

import pytorch_lightning as pl
pl.Trainer(precision=16)
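
A possible workaround until strings are accepted (a sketch only; precision_str is a hypothetical variable standing in for a value read from a config) is to coerce numeric strings back to int before constructing the Trainer:

import pytorch_lightning as pl

precision_str = "16"  # e.g. read from a YAML config or the CLI
# Coerce the numeric strings "16"/"32"/"64" to int; keep "bf16" as a string.
precision = int(precision_str) if precision_str.isdigit() else precision_str
pl.Trainer(precision=precision)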

cc @borda @carmocca @justusschock @awaelchli

Issue Analytics

  • State: open
  • Created: 9 months ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
pkubik commented, Dec 13, 2022

What library are you working with?

We actually tried to use this with Hydra, with the schema defined in a dataclass. We could probably find some nasty workaround.

I do not claim that my case is sufficient justification to make a decision here 😃. I just noticed this inconsistency in the error message and thought that accepting str was the intended behavior.
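
For context, a minimal sketch of the Hydra-style setup described above (the TrainConfig dataclass and its default value are assumptions, not the commenter's actual code):

from dataclasses import dataclass
from omegaconf import OmegaConf

@dataclass
class TrainConfig:
    # Structured configs validate against the declared field type, so
    # precision must be a single type; values parsed from YAML or the
    # command line arrive as strings.
    precision: str = "16"

cfg = OmegaConf.structured(TrainConfig)
# pl.Trainer(precision=cfg.precision)  # currently raises RuntimeError("No precision set")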

0 reactions
carmocca commented, Dec 13, 2022

We can also consider adding the string values as accepted types. In fact, we wanted to do this in Lightning Lite already. cc @carmocca

Yes. A contribution is welcome!
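
One way such support could look (a sketch only, not Lightning's actual implementation; _normalize_precision and _ALLOWED_PRECISION are hypothetical names):

_ALLOWED_PRECISION = (16, 32, 64, "bf16")

def _normalize_precision(value):
    # Map the string forms "16"/"32"/"64" to their int equivalents;
    # "bf16" stays a string.
    if isinstance(value, str) and value.isdigit():
        value = int(value)
    if value not in _ALLOWED_PRECISION:
        raise ValueError(
            f"Precision {value!r} is invalid."
            f" Allowed precision values: {_ALLOWED_PRECISION}"
        )
    return value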


