replica exchange performance

I’ve been working with the replica exchange module, trying to run standard temperature REMD. I set up an OpenMM script using the format proposed in the openmmtools issue “Missing feature for Replica Exchange?”. I’m running on Titan, which has single-GPU nodes with K80 cards and uses aprun for job submission. I have openmm, openmmtools, and yank all installed using conda.

I’m finding that the speed drops when switching from plain openmm to a “single replica” yank setup, and drops further when adding replicas. I suspect some of this comes from the different output file formats and I/O needs, plus communication time between nodes/cards once replicas are added, but how much slowdown should be expected? The degree of slowdown makes me think something is misconfigured, for example that each replica isn’t being assigned to its own node/GPU. I’m submitting test jobs using e.g. aprun -n 2 -N 1 python yank_test.py for two replicas.
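One way to check the concern about replicas not landing on their own node/GPU is to have every launched process report where it runs. The following is only a minimal sketch using the Python standard library: the function name `describe_placement` is hypothetical, and the list of rank environment variables is an assumption about what common launchers export (which one is set, if any, depends on the MPI stack on your system).

```python
import os
import socket

# Environment variables that common launchers use to expose the process rank.
# Assumption: at least one of these is exported by your launcher; if none is,
# the rank is reported as "unknown".
RANK_VARS = ("ALPS_APP_PE", "PMI_RANK", "OMPI_COMM_WORLD_RANK", "SLURM_PROCID")


def describe_placement(environ=os.environ):
    """Return the rank, host name, and visible GPUs for the current process."""
    rank = next((environ[v] for v in RANK_VARS if v in environ), "unknown")
    return {
        "rank": rank,
        "host": socket.gethostname(),
        "cuda_visible_devices": environ.get("CUDA_VISIBLE_DEVICES", "unset"),
    }


if __name__ == "__main__":
    # Each launched process prints one line describing its own placement.
    print(describe_placement())
```

Saved under a hypothetical name such as check_placement.py and launched the same way as the test job (aprun -n 2 -N 1 python check_placement.py), two processes on two different nodes should print two different host names. Identical host names would suggest both replicas are sharing one node.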

Thanks!

Issue Analytics

  • State: open
  • Created 5 years ago
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

1 reaction
andrrizzi commented, Jul 19, 2018

If you’re doing temperature REMD (and you’re not doing this already), then you could also try to use ParallelTempering instead of the ReplicaExchange class. We haven’t used it much, but the computation of the MBAR energy matrix at each iteration should be faster.
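To illustrate why the energy matrix can be cheaper when only the temperature differs between states: assuming constant volume (no pressure term), the reduced potential of configuration i at temperature j is just U(x_i) / (kB T_j), so K potential energies are enough to fill the whole K×K matrix. The sketch below shows that idea in plain Python. It is not the library's implementation, and the function names are hypothetical.

```python
# Boltzmann constant in kJ/(mol*K); energies are assumed to be in kJ/mol.
KB = 0.0083144626


def reduced_potential_matrix(potential_energies, temperatures):
    """Return u[i][j] = U(x_i) / (KB * T_j) for configuration i at temperature j.

    Only one potential energy per replica is needed, because changing the
    temperature only rescales the energy.
    """
    betas = [1.0 / (KB * t) for t in temperatures]
    return [[beta * energy for beta in betas] for energy in potential_energies]


def swap_log_acceptance(u, i, j):
    """Log Metropolis ratio for exchanging the configurations of replicas i and j.

    Replica i is assumed to currently sit at temperature i, and replica j at
    temperature j. The swap is accepted with probability min(1, exp(result)).
    """
    return (u[i][i] + u[j][j]) - (u[i][j] + u[j][i])


if __name__ == "__main__":
    u = reduced_potential_matrix([-100.0, -90.0], [300.0, 320.0])
    print(swap_log_acceptance(u, 0, 1))
```

For the temperature-only case the log ratio reduces to (β_i − β_j)(U_i − U_j), the usual parallel tempering acceptance criterion. A general replica exchange setup cannot use this shortcut, since each configuration has to be re-evaluated in every thermodynamic state.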

Both the Gibbs sampling procedure and the MBAR energy matrix computation scale superlinearly w.r.t. the number of states, and the I/O operations scale more or less linearly, so some slowdown is to be expected, although I’m not sure about the actual numbers.

Also, I’d make sure you are not using GHMCMove, which is presented as an example in the snippet on that thread, unless you require exact sampling of the distribution.

0 reactions
jchodera commented, Aug 19, 2018

@jlincoff : Can you provide more information to help us debug this?
