Allo @lucidrains, I’ve been fiddling with this optimizer and it looks promising so far. I was looking for other interpretations out there to settle my doubts about the missing bias correction. I’m assuming it’s deemed unnecessary because of the explicit m0 and v1 init, but I wasn’t 100% sure it wasn’t just left out for clarity.

I noticed you left m0 as zero, and v1 as an interpolation with zero init. Did you experiment with that vs the init described in the paper’s Algorithm 1?

The core of my attempt is below (note that I flipped the betas to be comparable to Adam/LAMB/etc.: .98, .92, .99):

    state = self.state[p]
    if len(state) == 0:
        state['step'] = 0
        state['grad'] = torch.zeros_like(grad)
        state['m'] = torch.clone(grad)  # init m0 = g0
        state['v'] = torch.zeros_like(grad)
        state['n'] = torch.zeros_like(grad)

    m, v, n = state['m'], state['v'], state['n']
    # NOTE first step is no-op as we need g0 & g1 for first grad delta (g1 - g0)
    if state['step'] > 0:
        m.lerp_(grad, 1. - beta1)
        grad_delta = grad - state['grad']
        if state['step'] > 1:
            v.lerp_(grad_delta, 1. - beta2)
        else:
            v.copy_(grad_delta)  # init v1 = g1 - g0
        n.lerp_((grad + beta2 * grad_delta).square(), 1. - beta3)

        # FIXME paper Algorithm 1 includes no bias correction
        # Do the m0 and v1 init special cases obviate the need, or was it left out of the paper for clarity?
        denom = 1 + group['weight_decay'] * lr
        step_size = lr * (n + group['eps']).rsqrt()
        p.addcmul_(step_size, m.add(v, alpha=beta2), value=-1.).div_(denom)

    state['grad'].copy_(grad)
    state['step'] += 1
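
To make the bias-correction question concrete, here is a minimal scalar sketch in plain Python. It is not taken from the paper or from any Adan implementation. It compares an EMA started at zero with one started at the first gradient, as in the m0 = g0 init above. The helper names `ema_zero_init`, `ema_first_grad_init` and `bias_corrected` are hypothetical, and the `1 - beta**k` correction factor is the usual Adam-style one, assumed here for illustration.

```python
def ema_zero_init(grads, beta):
    """EMA started from zero; returns the running value after each gradient."""
    m, history = 0.0, []
    for g in grads:
        m = beta * m + (1.0 - beta) * g
        history.append(m)
    return history


def ema_first_grad_init(grads, beta):
    """EMA started from the first gradient (m0 = g0), then updated as usual."""
    m = grads[0]
    history = [m]
    for g in grads[1:]:
        m = beta * m + (1.0 - beta) * g
        history.append(m)
    return history


def bias_corrected(history, beta):
    """Adam-style correction (assumed): divide the k-th value by 1 - beta**k, k from 1."""
    return [m / (1.0 - beta ** k) for k, m in enumerate(history, start=1)]


grads = [1.0] * 5   # constant gradient, so the average being estimated is 1.0
beta = 0.98         # flipped-convention beta1 from the post above

raw = ema_zero_init(grads, beta)        # starts near 0.02 and climbs slowly
corrected = bias_corrected(raw, beta)   # rescaled back to about 1.0
seeded = ema_first_grad_init(grads, beta)  # stays at about 1.0 from the start
```

With a constant gradient, the zero-initialised EMA starts at `1 - beta` times the gradient, while the corrected and the g0-seeded versions both sit at the gradient value from the first step. For non-constant gradients the two are not identical, so this only suggests that they target the same startup bias.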

Issue Analytics

  • State: open
  • Created a year ago
  • Reactions: 1
  • Comments: 15 (6 by maintainers)

Top GitHub Comments

8 reactions
XingyuXie commented, Aug 29, 2022

Sorry for the confusion here. Adan does have bias correction in the implementation, but we needed to keep the algorithm presentation consistent with the theoretical analysis, so we did not explicitly emphasize it in Algorithm 1. We’ll release the code in a few days (2-3 days, since we have a code review procedure). The log and config files will be released together with it. @rwightman

5 reactions
XingyuXie commented, Aug 30, 2022

@lucidrains Thanks for updating. Here are some minor modifications. When we implemented Adan, we referred to some of the optimizer implementations in timm.

Line 55:

state['prev_grad'] = grad

Lines 85-86:

correct_m = 1 / bias_correct1  # correction term for m'
correct_v = 1 / bias_correct2  # correction term for v

Line 91:

weighted_step_size = lr / ((n.sqrt()/sqrt_bias_correct3).add_(eps))
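
As a rough illustration of how the three snippets above could fit together, here is a scalar sketch of one bias-corrected step in plain Python. It is not the released implementation. The function name `adan_step`, the dict-based `state`, the zero init of `m`, `v` and `n`, and the Adam-style definitions `bias_correct1 = 1 - beta1**step`, `bias_correct2 = 1 - beta2**step` and `sqrt_bias_correct3 = sqrt(1 - beta3**step)` are all assumptions. The betas use the flipped convention from the original post, and weight decay is applied by dividing by `1 + lr * weight_decay`, as in that post.

```python
import math


def adan_step(param, grad, state, lr=1e-3, betas=(0.98, 0.92, 0.99),
              eps=1e-8, weight_decay=0.02):
    """One scalar Adan-like step with bias correction (illustrative sketch only)."""
    beta1, beta2, beta3 = betas
    if not state:
        # Assumed init: zero moments, and prev_grad = grad so the first delta is 0.
        state.update(step=0, m=0.0, v=0.0, n=0.0, prev_grad=grad)

    state["step"] += 1
    step = state["step"]

    grad_delta = grad - state["prev_grad"]
    state["m"] = beta1 * state["m"] + (1.0 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1.0 - beta2) * grad_delta
    update = grad + beta2 * grad_delta
    state["n"] = beta3 * state["n"] + (1.0 - beta3) * update * update

    # Assumed Adam-style bias-correction terms.
    bias_correct1 = 1.0 - beta1 ** step
    bias_correct2 = 1.0 - beta2 ** step
    sqrt_bias_correct3 = math.sqrt(1.0 - beta3 ** step)

    correct_m = 1.0 / bias_correct1  # correction term for m
    correct_v = 1.0 / bias_correct2  # correction term for v
    weighted_step_size = lr / (math.sqrt(state["n"]) / sqrt_bias_correct3 + eps)

    param = param - weighted_step_size * (
        state["m"] * correct_m + beta2 * state["v"] * correct_v
    )
    param = param / (1.0 + lr * weight_decay)

    state["prev_grad"] = grad
    return param


state = {}
p = adan_step(1.0, 2.0, state, weight_decay=0.0)  # first step moves by about lr
```

Under these assumptions the very first step already has magnitude close to `lr`, which is what the correction terms are there for when the moments start at zero.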

Tips:

  • For fairness and ease of use, we do not enable the restart condition in practice.
  • Adan can tolerate a large peak LR. For example, except for the experiments for the pre-training of MAE and LSTM, Adan’s LR is 5-10 times that of Adam/AdamW.
  • Adan seems to be relatively sensitive to beta3. Adjusting beta1 and beta2 has a limited effect on the results, especially beta2.
  • Interestingly, we found that weight_decay = 0.02 seems to be suitable for most experiments.
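
The tips above could be turned into a starting configuration along these lines. This is only a sketch: `adan_hparams_from_adamw` is a hypothetical helper, the default scale of 5 is simply the low end of the 5-10x range mentioned above, and the betas are the flipped-convention values from the original post rather than anything stated in the tips.

```python
def adan_hparams_from_adamw(adamw_lr, lr_scale=5.0, betas=(0.98, 0.92, 0.99)):
    """Derive a starting Adan config from an existing AdamW peak LR (hypothetical helper).

    lr_scale: multiplier on the AdamW LR; the tips suggest 5-10x in most cases,
              with MAE pre-training and LSTM named as exceptions.
    betas:    (beta1, beta2, beta3) in the flipped, Adam-like convention (assumed).
    """
    return {
        "lr": adamw_lr * lr_scale,
        "betas": betas,
        "weight_decay": 0.02,  # value reported above as suitable for most experiments
    }


config = adan_hparams_from_adamw(1e-3)
```

Since the tips describe beta3 as the sensitive one, it is probably the first value to sweep if this starting point does not work.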