
TypeError: FP16_DeepSpeedZeroOptimizer is not an Optimizer


I’m trying to use the OneCycle (1-Cycle) scheduler, but I get the following error:

TypeError: FP16_DeepSpeedZeroOptimizer is not an Optimizer


Here is my configuration file:

{
    "train_batch_size": 64,
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 16,
    "optimizer": {
        "type": "Adam",
        "params": {
            "lr": 3e-05,
            "betas": [
                0.9,
                0.999
            ],
            "eps": 1e-8,
            "weight_decay": 0.01
        }
    },
    "gradient_clipping": 0.1,
    "scheduler": {
        "type": "OneCycle",
        "params": {
            "cycle_first_step_size": 16000,
            "cycle_first_stair_count": 8000,
            "decay_step_size": 16000,
            "cycle_min_lr": 1e-06,
            "cycle_max_lr": 3e-05,
            "decay_lr_rate": 1e-07,
            "cycle_min_mom": 0.85,
            "cycle_max_mom": 0.99,
            "decay_mom_rate": 0.0
        }
    },
    "zero_optimization": true,
    "disable_allgather": true,
    "fp16": {
        "enabled": true,
        "loss_scale": 0,
        "min_loss_scale": 1
    }
}

When using another scheduler (with FP16), I have no problem.
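
For context, here is a minimal sketch (not taken from the issue) of how a config like the one above is typically hooked up. The toy model, the argument parsing, and the file name ds_config.json are assumptions; the point is that the optimizer and the OneCycle scheduler named in the JSON are constructed inside deepspeed.initialize(), which is where the TypeError surfaces.

# Minimal sketch, assuming the config above is saved as ds_config.json and
# the script is launched with: --deepspeed_config ds_config.json
import argparse
import torch
import deepspeed

def get_args():
    parser = argparse.ArgumentParser()
    # add_config_arguments adds the standard DeepSpeed flags (--deepspeed_config, ...)
    parser = deepspeed.add_config_arguments(parser)
    parser.add_argument("--local_rank", type=int, default=-1)
    return parser.parse_args()

args = get_args()
model = torch.nn.Linear(10, 10)  # stand-in for the real model

# The Adam optimizer and the OneCycle scheduler are built from the JSON config.
# With fp16 + ZeRO enabled, the optimizer handed to the scheduler is the
# FP16_DeepSpeedZeroOptimizer wrapper named in the error.
model_engine, optimizer, _, lr_scheduler = deepspeed.initialize(
    args=args,
    model=model,
    model_parameters=model.parameters(),
)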

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

2 reactions
jeffra commented on Mar 4, 2020

Thanks for reporting this bug. We will take a look at this as soon as possible. I just created two test cases that reproduce the error (one with ZeRO and one with FP16 but no ZeRO).

https://github.com/microsoft/DeepSpeed/blob/jeffra/onecycle_bug/tests/unit/test_fp16.py#L147-L246
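
For reference, the wording of the error matches the usual "{} is not an Optimizer" check that LR schedulers perform against torch.optim.Optimizer. The snippet below is illustrative only, not DeepSpeed's actual source; it shows why a wrapper class that holds, but does not subclass, the base optimizer trips such a check.

# Illustrative sketch of the check pattern behind "<X> is not an Optimizer".
import torch

class StrictScheduler:
    def __init__(self, optimizer):
        # A strict isinstance check rejects fp16/ZeRO-style wrapper objects.
        if not isinstance(optimizer, torch.optim.Optimizer):
            raise TypeError(
                "{} is not an Optimizer".format(type(optimizer).__name__)
            )
        self.optimizer = optimizer

class FakeFP16Wrapper:
    """Stand-in for a wrapper that holds, but does not subclass, an optimizer."""
    def __init__(self, inner):
        self.optimizer = inner

inner = torch.optim.Adam([torch.nn.Parameter(torch.zeros(1))], lr=3e-5)
StrictScheduler(FakeFP16Wrapper(inner))  # raises: FakeFP16Wrapper is not an Optimizer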

1 reaction
jeffra commented on Mar 13, 2020

Hi @Colanim, it should be up to date. Can you tell us this info from inside your docker container?

python -c 'import deepspeed; print("deepspeed info:", deepspeed.__version__, deepspeed.__git_branch__, deepspeed.__git_hash__)'

Also, I just looked at the latest docker build; it prints this same version info, and it looks to be aligned with the latest March 12th commit (3d3f8d36a4e8c0b7e6358bccd90254fc7424ffcb): https://dev.azure.com/DeepSpeedMSFT/DeepSpeed/_build/results?buildId=416&view=logs&j=3dc8fd7e-4368-5a92-293e-d53cefc8c4b3&t=a1aa9649-a94b-5ac4-3f5e-9bb6223edb04&l=1717

deepspeed info: 0.1.0 master 3d3f8d3


