Simple neq switching example has nans

When I run neq switching (with the vanilla htf) for ALA -> THR (dipeptide) in solvent, I see NaNs toward the end of forward switching. The NaNs do not occur on every run: out of three runs of the experiment, two produced NaNs.

Also note that this only happens with master. If I run with perses=0.9.2 or 0.9.1, I do not see any NaNs.

All experiments were run with openmm 7.7.0 from conda-forge (not the nightly dev builds).

Code:

from perses.tests.test_topology_proposal import generate_atp, generate_dipeptide_top_pos_sys
import pickle

# Generate vanilla htf
atp, system_generator = generate_atp(phase='solvent')
htf = generate_dipeptide_top_pos_sys(atp.topology,
                                     'THR',
                                     atp.system,
                                     atp.positions,
                                     system_generator,
                                     conduct_htf_prop=True,
                                     validate_endstate_energy=False)

with open("atp_solvent_vanilla.pickle", "wb") as f:
    pickle.dump(htf, f)

import numpy as np
from openmmtools.integrators import PeriodicNonequilibriumIntegrator
from openmmtools.constants import kB
from simtk import unit, openmm
import argparse
import os
import time

# Define simulation parameters
nsteps_eq = 25000 # 100 ps
nsteps_neq = int(250000) # 1 ns
neq_splitting='V R H O R V'
timestep = 4.0 * unit.femtosecond
platform_name = 'CUDA'
temperature = 300.0 * unit.kelvin

# Define lambda functions
x = 'lambda'
ALCHEMICAL_FUNCTIONS = {
                             'lambda_sterics_core': x,
                             'lambda_electrostatics_core': x,
                             'lambda_sterics_insert': f"select(step({x} - 0.5), 1.0, 2.0 * {x})",
                             'lambda_sterics_delete': f"select(step({x} - 0.5), 2.0 * ({x} - 0.5), 0.0)",
                             'lambda_electrostatics_insert': f"select(step({x} - 0.5), 2.0 * ({x} - 0.5), 0.0)",
                             'lambda_electrostatics_delete': f"select(step({x} - 0.5), 1.0, 2.0 * {x})",
                             'lambda_bonds': x,
                             'lambda_angles': x,
                             'lambda_torsions': x}

# Read in vanilla htf
print("reading htf")
with open(os.path.join("atp_solvent_vanilla.pickle"), 'rb') as f:
    htf = pickle.load(f)
    
positions = htf.hybrid_positions
system = htf.hybrid_system

# Set up integrator
print('setting up integrator')
integrator = PeriodicNonequilibriumIntegrator(ALCHEMICAL_FUNCTIONS, nsteps_eq, nsteps_neq, neq_splitting, timestep=timestep, temperature=temperature)

# Set up context
print("setting up context")
platform = openmm.Platform.getPlatformByName(platform_name)
if platform_name in ['CUDA', 'OpenCL']:
    platform.setPropertyDefaultValue('Precision', 'mixed')
if platform_name in ['CUDA']:
    platform.setPropertyDefaultValue('DeterministicForces', 'true')
context = openmm.Context(system, integrator, platform)
context.setPeriodicBoxVectors(*htf.hybrid_system.getDefaultPeriodicBoxVectors())
context.setPositions(positions)
context.setVelocitiesToTemperature(temperature)

# Minimize
print("minimize")
openmm.LocalEnergyMinimizer.minimize(context)

# Run eq forward (0 -> 1)
print("running eq")
integrator.step(nsteps_eq)

# Run neq forward (0 -> 1)
print("running neq")
energies = []
positions_old = []
positions_new = []
for fwd_step in range(int(nsteps_neq / 2500)):
    integrator.step(2500)
    # Report the number of steps finished so far (the first chunk reports 2500, not 0)
    print(f"Forward neq: {(fwd_step + 1) * 2500} completed")
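
To narrow down when the NaNs first appear, the chunked loop above could check the energy after every chunk and stop at the first non-finite value. The following is a minimal sketch, written against plain callables so it runs without OpenMM. The names `run_until_nan`, `step_fn`, `energy_fn`, and `FakeSimulation` are hypothetical and not part of perses or openmmtools. With the script above, `step_fn` would be `integrator.step`, and `energy_fn` could return `context.getState(getEnergy=True).getPotentialEnergy()` converted to a float with `value_in_unit(unit.kilojoule_per_mole)`.

```python
import math


def run_until_nan(step_fn, energy_fn, total_steps, chunk_size):
    """Advance in chunks and stop at the first non-finite energy.

    Returns the number of steps completed when the energy first became
    NaN or infinite, or None if the whole run stayed finite.
    """
    completed = 0
    while completed < total_steps:
        n = min(chunk_size, total_steps - completed)
        step_fn(n)
        completed += n
        if not math.isfinite(energy_fn()):
            return completed
    return None


class FakeSimulation:
    """Stand-in for an integrator/context pair, used only to exercise the helper."""

    def __init__(self, blow_up_after):
        self.steps = 0
        self.blow_up_after = blow_up_after

    def step(self, n):
        self.steps += n

    def energy(self):
        # Pretend the energy becomes NaN once enough steps have been taken
        return float("nan") if self.steps >= self.blow_up_after else -1000.0


sim = FakeSimulation(blow_up_after=6000)
first_bad = run_until_nan(sim.step, sim.energy, total_steps=10000, chunk_size=2500)
print(first_bad)
```

A smaller chunk size localizes the failure more precisely, at the cost of more state queries.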

Created 2 years ago. Comments: 11 (11 by maintainers).

Top GitHub Comments

ijpulidos commented, Jan 20, 2022

I can confirm that backporting the averaging of atom masses fixes the issue with the code in this thread.

Now, surprisingly, I’m using the same fix with my benchmark repex tyk2 sims and they seem to be more stable with a timestep of 4 fs. They are still running (I will need one more day or so to get conclusive results), but I think we can already merge the associated PR that fixes this issue. I will review that one ASAP.

dominicrufa commented, Jan 19, 2022

From devs meeting:

Does backporting the atom mass fix help? @zhang-ivy

For tyk2 inspect maps and system.xml between old and new perses versions. @dominicrufa

For the life of me, I can’t figure out what the NaN problem is when I compare the newly generated tyk2 hybrid system XML with an older version (which runs repex successfully without NaNs). The system XML lives at /warm/chodera/brucemah/relative_paper2/Tyk2-ANI/lig14to8/xml/complex-hybrid-system.gz on lilac. The old and new systems for complex/solvent also live there.

The full pickled HybridTopologyFactory lives at /warm/chodera/brucemah/relative_paper2/Tyk2-ANI/lig14to8/outhybrid_factory.npy.npz
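
Since the suspected culprit is atom masses, one way to compare an old and a new serialized system is to diff the per-particle masses directly. The sketch below uses only the standard library and assumes the usual layout of an OpenMM serialized system, with a top-level `<Particles>` element holding `<Particle mass="..."/>` children. The function names and the sample XML values are made up for illustration and are not taken from the files mentioned above.

```python
import xml.etree.ElementTree as ET


def particle_masses(system_xml):
    """Return per-particle masses from a serialized system XML string."""
    root = ET.fromstring(system_xml)
    # Only the top-level <Particles> block carries masses; forces have
    # their own <Particle> elements that are deliberately not visited here.
    particles = root.find("Particles")
    return [float(p.get("mass")) for p in particles.findall("Particle")]


def differing_masses(old_xml, new_xml, tol=1e-6):
    """List (index, old_mass, new_mass) for particles whose masses differ."""
    old = particle_masses(old_xml)
    new = particle_masses(new_xml)
    if len(old) != len(new):
        raise ValueError(f"particle counts differ: {len(old)} vs {len(new)}")
    return [(i, a, b) for i, (a, b) in enumerate(zip(old, new)) if abs(a - b) > tol]


# Tiny hand-written examples (hypothetical values, not real perses output)
OLD_XML = """<System type="System" version="1"><Particles>
<Particle mass="12.011"/><Particle mass="1.008"/><Particle mass="15.999"/>
</Particles></System>"""
NEW_XML = """<System type="System" version="1"><Particles>
<Particle mass="12.011"/><Particle mass="4.032"/><Particle mass="15.999"/>
</Particles></System>"""

print(differing_masses(OLD_XML, NEW_XML))
```

For a gzipped file, the XML text could be read with `gzip.open(path, "rt").read()` before passing it to these helpers.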