training speed is about 2x slower than JAX trainable version (Uni-Fold)

Device: 1 A100 with 40 GB memory. CUDA: 11.3. I compared against https://github.com/dptech-corp/Uni-Fold using the model_2 setting and the same data (a single sample, fed through DummyDataLoader in openfold).

Following this issue, https://github.com/aqlaboratory/openfold/issues/19, I disabled clear_cache_between_blocks and DeepSpeed CPU offloading. The commit I used is https://github.com/aqlaboratory/openfold/commit/c4d9f57f9005f3e9e0325eff97b8232e328b4813

Speed per example (seconds):

| Framework | FP32    | FP16  |
|-----------|---------|-------|
| openfold  | 24.5 s  | 17 s  |
| Uni-Fold  | 13.25 s | 8.9 s |
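To make the "about 2x" in the title concrete, here is a minimal sketch that computes the ratios implied by the timings above. The dictionary layout and the helper names `slowdown` and `fp16_speedup` are illustrative assumptions, not part of openfold or Uni-Fold; only the four numbers come from the measurements.

```python
# Per-example training times in seconds, copied from the measurements above.
times = {
    "openfold": {"FP32": 24.5, "FP16": 17.0},
    "Uni-Fold": {"FP32": 13.25, "FP16": 8.9},
}


def slowdown(precision):
    """Ratio of openfold time to Uni-Fold time at the given precision (hypothetical helper)."""
    return times["openfold"][precision] / times["Uni-Fold"][precision]


def fp16_speedup(framework):
    """Ratio of FP32 time to FP16 time within one framework (hypothetical helper)."""
    return times[framework]["FP32"] / times[framework]["FP16"]


for precision in ("FP32", "FP16"):
    print(f"{precision}: openfold takes {slowdown(precision):.2f}x as long as Uni-Fold")

for framework in times:
    print(f"{framework}: FP16 is {fp16_speedup(framework):.2f}x faster than FP32")
```

On these numbers openfold takes roughly 1.85x (FP32) and 1.91x (FP16) as long as Uni-Fold, while both frameworks gain a similar factor of about 1.4x to 1.5x from FP16, which suggests the gap is not specific to one precision mode.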

Is that expected? Are there any tricks I can use to get a further speed-up?

Issue Analytics

Created 2 years ago
Comments:43 (8 by maintainers)

Top GitHub Comments

gahdritz commented, Dec 29, 2021

BTW @guolinke the recycling number bug is now fixed. The fix requires a little bit of extra data processing, and so it comes with a performance penalty of about half a second. I’m trying to think of ways to improve it.

gahdritz commented, Dec 24, 2021

@lhatsk would you mind moving this bfloat16 stuff into a new issue?
