
Question: Can `DeepSpeedCPUAdam` be used as a drop-in replacement for `torch.optim.Adam`?

See original GitHub issue

Hi,

I want to use DeepSpeedCPUAdam instead of torch.optim.Adam to reduce the memory usage of my GPUs while training. I was wondering whether DeepSpeedCPUAdam can simply be dropped in for torch.optim.Adam, or whether additional steps are needed. I tried doing exactly that and got a segmentation fault.

Thanks
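
For context, a minimal sketch of what the "drop-in" swap looks like, assuming a DeepSpeed installation whose compiled CPU Adam op loads correctly; the tiny model, batch, and learning rate are illustrative placeholders, not taken from the issue:

```python
import torch
from deepspeed.ops.adam import DeepSpeedCPUAdam

# Placeholder model; the issue does not include one.
model = torch.nn.Linear(128, 10)

# Baseline for comparison: torch.optim.Adam steps on whatever device the
# parameters live on.
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# DeepSpeedCPUAdam performs the optimizer step on the CPU, so it is meant for
# CPU-resident (offloaded) parameter copies; constructing it also loads a
# compiled DeepSpeed extension, which requires a working DeepSpeed build.
optimizer = DeepSpeedCPUAdam(model.parameters(), lr=1e-3)

inputs = torch.randn(4, 128)
targets = torch.randint(0, 10, (4,))

loss = torch.nn.functional.cross_entropy(model(inputs), targets)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

The constructor call mirrors torch.optim.Adam, but the surrounding requirements (CPU-resident parameters, a CUDA-capable DeepSpeed build) differ, which is what the discussion below turns on.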

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 10 (6 by maintainers)

Top GitHub Comments

1 reaction
peterukk commented, Aug 5, 2021

@tjruwase thanks, I opened a new issue

0 reactions
tjruwase commented, Aug 5, 2021

@peterukk, DeepSpeedCPUAdam will not work without CUDA (in theory it could). The reason is that DeepSpeedCPUAdam has a mode of execution in which it also copies the updated parameters back to the GPU using CUDA kernels. Do you have a scenario where you want to use DeepSpeedCPUAdam outside a CUDA environment? Depending on your answer, can you please open a Question or reopen this one, as appropriate? Thanks.
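
To illustrate the CUDA-dependent path described above, here is a hedged sketch of how DeepSpeedCPUAdam is typically selected in practice: via ZeRO optimizer offload in a DeepSpeed config, where the step runs on the CPU and the updated parameters are copied back to the GPU. The model, batch size, ZeRO stage, and learning rate are placeholder assumptions, and the `config=` keyword of `deepspeed.initialize` is used as in recent DeepSpeed releases.

```python
import torch
import deepspeed

# Placeholder model; a real run would use the model being trained.
model = torch.nn.Linear(128, 10)

# ZeRO stage 2 with optimizer offload to CPU: DeepSpeed then uses its CPU Adam
# implementation for the update and copies the updated parameters back to the
# GPU, which is the CUDA-dependent mode mentioned in the comment above.
ds_config = {
    "train_batch_size": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu"},
    },
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```

In this setup the optimizer is created by DeepSpeed itself rather than being passed in by the user, which is why the standalone drop-in usage from the question behaves differently from the offload path.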

Read more comments on GitHub >

Top Results From Across the Web

  • torch.optim — PyTorch 1.13 documentation
    To use torch.optim you have to construct an optimizer object, that will hold the ... Implements lazy version of Adam algorithm suitable for...
  • Optimizers — DeepSpeed 0.8.0 documentation - Read the Docs
    Optimizers. DeepSpeed offers high-performance implementations of Adam optimizer on CPU; FusedAdam, FusedLamb, OnebitAdam, OnebitLamb optimizers on GPU.
  • How to use Adam optim considering of its adaptive learning ...
    Using pytorch. I have tried to initialize the optim after each epoch if I use batch train, and I do nothing when the...
  • How to optimize a function using Adam in pytorch - ProjectPro
    For optimizing a function we are going to use torch.optim which is a package, implements numerous optimization algorithms.
  • Optimization — transformers 3.1.0 documentation
    Implements Adam algorithm with weight decay fix as introduced in ... AdaFactor pytorch implementation can be used as a drop in replacement for...
