
ValueError: mutable default <class 'torch.distributed._shard.... when importing lightning

See original GitHub issue

Bug description

When I try to import lightning, I get the error below and have no clue how to fix it. I would be very pleased if anyone could give me a hand! Thank you :)

How to reproduce the bug

No response

Error messages and logs

Python 3.11.0 | packaged by conda-forge | (main, Oct 25 2022, 06:18:27) [GCC 10.4.0] on linux

Type "help", "copyright", "credits" or "license" for more information.

>>> import pytorch_lightning
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/__init__.py", line 35, in <module>
    from pytorch_lightning.callbacks import Callback  # noqa: E402
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/callbacks/__init__.py", line 28, in <module>
    from pytorch_lightning.callbacks.pruning import ModelPruning
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/callbacks/pruning.py", line 31, in <module>
    from pytorch_lightning.core.module import LightningModule
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/core/__init__.py", line 16, in <module>
    from pytorch_lightning.core.module import LightningModule
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/core/module.py", line 47, in <module>
    from pytorch_lightning.trainer.connectors.logger_connector.fx_validator import _FxValidator
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/trainer/__init__.py", line 17, in <module>
    from pytorch_lightning.trainer.trainer import Trainer
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/trainer/trainer.py", line 57, in <module>
    from pytorch_lightning.loops import PredictionLoop, TrainingEpochLoop
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/loops/__init__.py", line 15, in <module>
    from pytorch_lightning.loops.batch import TrainingBatchLoop  # noqa: F401
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/loops/batch/__init__.py", line 15, in <module>
    from pytorch_lightning.loops.batch.training_batch_loop import TrainingBatchLoop  # noqa: F401
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 20, in <module>
    from pytorch_lightning.loops.optimization.manual_loop import _OUTPUTS_TYPE as _MANUAL_LOOP_OUTPUTS_TYPE
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/loops/optimization/__init__.py", line 15, in <module>
    from pytorch_lightning.loops.optimization.manual_loop import ManualOptimization  # noqa: F401
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/loops/optimization/manual_loop.py", line 23, in <module>
    from pytorch_lightning.loops.utilities import _build_training_step_kwargs, _extract_hiddens
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/loops/utilities.py", line 29, in <module>
    from pytorch_lightning.strategies.parallel import ParallelStrategy
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/strategies/__init__.py", line 15, in <module>
    from pytorch_lightning.strategies.bagua import BaguaStrategy  # noqa: F401
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/strategies/bagua.py", line 29, in <module>
    from pytorch_lightning.plugins.precision import PrecisionPlugin
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/plugins/__init__.py", line 7, in <module>
    from pytorch_lightning.plugins.precision.apex_amp import ApexMixedPrecisionPlugin
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/plugins/precision/__init__.py", line 18, in <module>
    from pytorch_lightning.plugins.precision.fsdp_native_native_amp import FullyShardedNativeNativeMixedPrecisionPlugin
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/pytorch_lightning/plugins/precision/fsdp_native_native_amp.py", line 24, in <module>
    from torch.distributed.fsdp.fully_sharded_data_parallel import MixedPrecision
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/torch/distributed/fsdp/__init__.py", line 1, in <module>
    from .flat_param import FlatParameter
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/torch/distributed/fsdp/flat_param.py", line 26, in <module>
    from ._fsdp_extensions import _ext_post_unflatten_transform, _ext_pre_flatten_transform
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/torch/distributed/fsdp/_fsdp_extensions.py", line 7, in <module>
    from torch.distributed.fsdp._shard_utils import _create_chunk_sharded_tensor
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/torch/distributed/fsdp/_shard_utils.py", line 10, in <module>
    from torch.distributed._shard.sharded_tensor import (
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/torch/distributed/_shard/__init__.py", line 1, in <module>
    from .api import (
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/torch/distributed/_shard/api.py", line 6, in <module>
    from torch.distributed._shard.sharded_tensor import (
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/torch/distributed/_shard/sharded_tensor/__init__.py", line 8, in <module>
    import torch.distributed._shard.sharding_spec as shard_spec
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/torch/distributed/_shard/sharding_spec/__init__.py", line 1, in <module>
    from .api import (
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/torch/distributed/_shard/sharding_spec/api.py", line 16, in <module>
    import torch.distributed._shard.sharded_tensor.metadata as sharded_tensor_meta
  File "/home/be/.conda/envs/torch/lib/python3.11/site-packages/torch/distributed/_shard/sharded_tensor/metadata.py", line 70, in <module>
    @dataclass
     ^^^^^^^^^
  File "/home/be/.conda/envs/torch/lib/python3.11/dataclasses.py", line 1221, in dataclass
    return wrap(cls)
           ^^^^^^^^^
  File "/home/be/.conda/envs/torch/lib/python3.11/dataclasses.py", line 1211, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/be/.conda/envs/torch/lib/python3.11/dataclasses.py", line 959, in _process_class
    cls_fields.append(_get_field(cls, name, type, kw_only))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/be/.conda/envs/torch/lib/python3.11/dataclasses.py", line 816, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'torch.distributed._shard.sharded_tensor.metadata.TensorProperties'> for field tensor_properties is not allowed: use default_factory
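
For context, the check that raises here was tightened in Python 3.11: dataclasses previously rejected only list, dict, and set defaults, but 3.11 rejects any field default whose class is unhashable. A plain @dataclass (eq=True, not frozen) sets __hash__ = None on the class, so using one of its instances as a default, as torch's TensorProperties is used above, now fails at class-definition time. A minimal sketch with stand-in classes (not torch's real ones):

from dataclasses import dataclass

@dataclass
class TensorProperties:  # stand-in; eq=True and not frozen, hence unhashable
    requires_grad: bool = False

try:
    @dataclass
    class Metadata:
        # Accepted on Python <= 3.10, raises ValueError on 3.11:
        tensor_properties: TensorProperties = TensorProperties()
except ValueError as err:
    print(err)  # mutable default <class '...TensorProperties'> ... use default_factory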

Environment

Packages

  • Python 3.11.0
  • torch 1.13.0+cu116
  • pytorch-lightning 1.8.0.post1
  • transformers 4.24.0

CUDA version

Cuda compilation tools, release 11.6, V11.6.55 Build cuda_11.6.r11.6/compiler.30794723_0

Driver version

NVIDIA-SMI 515.65.01, Driver Version: 515.65.01, CUDA Version: 11.7. And I have two RTX 3090 GPUs in my own server.

More info

No response

Issue Analytics

  • State: closed
  • Created: 10 months ago
  • Reactions: 2
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
Erotemic commented, Dec 3, 2022

I think this issue can be fixed in torch with a simple patch to torch/distributed/_shard/sharded_tensor/metadata.py:

@@ -79,4 +79,4 @@
     # Size of each dim of the overall Tensor.
     size: torch.Size = field(default=torch.Size([]))
 
-    tensor_properties: TensorProperties = field(default=TensorProperties())
+    tensor_properties: TensorProperties = field(default_factory=TensorProperties)

I don’t entirely understand whether this has any undesirable consequences (I’m not a heavy dataclass user), but it does prevent the exception.

I’d love to get my projects on 3.11 soon, so hopefully torch fixes this issue.
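
To make the fix concrete, here is a small sketch (again with a stand-in class, not torch's real one) of what default_factory changes: it takes the class itself, a callable, and invokes it once per enclosing instance, so defaults are no longer shared between instances.

from dataclasses import dataclass, field

@dataclass
class TensorProperties:  # stand-in for torch's class
    requires_grad: bool = False

@dataclass
class Metadata:
    # default_factory must be a callable (the class, not an instance);
    # it runs for each new Metadata, so every instance gets its own
    # TensorProperties rather than all sharing one default object.
    tensor_properties: TensorProperties = field(default_factory=TensorProperties)

a, b = Metadata(), Metadata()
assert a.tensor_properties is not b.tensor_properties  # no shared mutable state

That per-instance construction is the only behavioral difference from the old shared default, and it is generally what you want from a mutable default anyway.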

1 reaction
awaelchli commented, Nov 10, 2022

@BomanNg This looks like it is because you are running with Python 3.11. Please note, PyTorch does NOT yet officially support 3.11 (https://github.com/pytorch/pytorch/issues/86566). You have two options:

  1. You don’t really need Python 3.11 -> Use Python 3.10 instead
  2. You need Python 3.11 -> Install PyTorch nightly release: https://pytorch.org/ (click on the “Preview (nightly)” in the installation table)

Let me know if that helps you!
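
For projects pinned to a stable torch 1.13 in the meantime, a small guard (a sketch using only the standard library) can fail fast with an actionable message instead of the deep import-time traceback above:

import sys

# Pre-flight check mirroring option 1: stable PyTorch 1.13 wheels target
# Python <= 3.10, so bail out before the import chain reaches
# torch.distributed.fsdp and trips the dataclass ValueError.
if sys.version_info >= (3, 11):
    raise RuntimeError(
        "PyTorch 1.13 does not officially support Python 3.11; "
        "use Python 3.10 or install a PyTorch nightly build (https://pytorch.org)"
    )

import pytorch_lightning  # noqa: E402  (safe to import past the guard)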

Read more comments on GitHub >

Top Results From Across the Web

Unable to import pytorch_lightning on google colab
There appears to be a bug that has not hit pip yet with pytorch lightning not referencing the newest torchtext.
Read more >
