TypeError: FP16_DeepSpeedZeroOptimizer is not an Optimizer
See original GitHub issue
I'm trying to use the 1-Cycle scheduler, but I get the following error:
TypeError: FP16_DeepSpeedZeroOptimizer is not an Optimizer
Here is my configuration file:
{
  "train_batch_size": 64,
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 16,
  "optimizer": {
    "type": "Adam",
    "params": {
      "lr": 3e-05,
      "betas": [0.9, 0.999],
      "eps": 1e-8,
      "weight_decay": 0.01
    }
  },
  "gradient_clipping": 0.1,
  "scheduler": {
    "type": "OneCycle",
    "params": {
      "cycle_first_step_size": 16000,
      "cycle_first_stair_count": 8000,
      "decay_step_size": 16000,
      "cycle_min_lr": 1e-06,
      "cycle_max_lr": 3e-05,
      "decay_lr_rate": 1e-07,
      "cycle_min_mom": 0.85,
      "cycle_max_mom": 0.99,
      "decay_mom_rate": 0.0
    }
  },
  "zero_optimization": true,
  "disable_allgather": true,
  "fp16": {
    "enabled": true,
    "loss_scale": 0,
    "min_loss_scale": 1
  }
}
When I use another scheduler (with FP16), there is no problem.
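For reference, here is a minimal sketch of how a config like this is typically consumed. It assumes a recent DeepSpeed release where the config path can be passed directly to deepspeed.initialize (the issue itself predates that API and used the command-line config flag); the tiny Linear model, the config filename, and the random input are placeholders:

import torch
import deepspeed

# Placeholder model; any torch.nn.Module works here.
model = torch.nn.Linear(10, 1)

# deepspeed.initialize builds the Adam optimizer, the OneCycle scheduler,
# and the ZeRO/FP16 wrappers from the JSON config shown above.
engine, optimizer, _, scheduler = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config="ds_config.json",  # assumed filename for the JSON above
)

# One dummy training step; engine.step() also advances the LR scheduler.
x = torch.randn(4, 10, device=engine.device, dtype=torch.half)
loss = engine(x).sum()
engine.backward(loss)
engine.step()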
Issue Analytics
- Created: 4 years ago
- Comments: 5 (5 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Thanks for reporting this bug. We will take a look at this as soon as possible. I just created two test cases that reproduce the error (one with ZeRO and one with FP16 but no ZeRO).
https://github.com/microsoft/DeepSpeed/blob/jeffra/onecycle_bug/tests/unit/test_fp16.py#L147-L246
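For context on where the message itself comes from: its wording matches the type check in torch.optim.lr_scheduler, which (in the PyTorch versions of that era) rejects any object that is not a torch.optim.Optimizer subclass, and ZeRO's FP16_DeepSpeedZeroOptimizer wraps the base optimizer rather than subclassing it. Below is a minimal, hypothetical reproduction of that check; the wrapper class is a stand-in for illustration, not DeepSpeed code:

import torch
from torch.optim.lr_scheduler import LambdaLR

class FakeZeroWrapper:
    # Stand-in for an optimizer wrapper that exposes param_groups but does
    # NOT subclass torch.optim.Optimizer (illustration only, not DeepSpeed code).
    def __init__(self, inner):
        self.optimizer = inner
        self.param_groups = inner.param_groups

params = [torch.nn.Parameter(torch.zeros(1))]
wrapped = FakeZeroWrapper(torch.optim.Adam(params, lr=3e-5))

try:
    # Any scheduler built on torch's base LR scheduler performs the same check.
    LambdaLR(wrapped, lr_lambda=lambda step: 1.0)
except TypeError as err:
    print(err)  # -> "FakeZeroWrapper is not an Optimizer"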
Hi @Colanim, it should be up to date. Can you tell us this info from inside your docker container?
python -c 'import deepspeed; print("deepspeed info:", deepspeed.__version__, deepspeed.__git_branch__, deepspeed.__git_hash__)'
Also, I just looked at the latest docker build; it prints this same version info and appears to be aligned with the latest March 12th commit (3d3f8d36a4e8c0b7e6358bccd90254fc7424ffcb): https://dev.azure.com/DeepSpeedMSFT/DeepSpeed/_build/results?buildId=416&view=logs&j=3dc8fd7e-4368-5a92-293e-d53cefc8c4b3&t=a1aa9649-a94b-5ac4-3f5e-9bb6223edb04&l=1717
deepspeed info: 0.1.0 master 3d3f8d3