Simple neq switching example has nans


When I run neq switching (with the vanilla htf) for ALA -> THR (dipeptide) in solvent, I see NaNs toward the end of forward switching. The NaNs do not occur on every run: when I run the experiment 3 times, 2 of the 3 runs produce NaNs.

This only happens with master. If I run with perses=0.9.2 or 0.9.1, I do not see any NaNs.

All experiments run with openmm = 7.7.0 from conda-forge (not the nightly dev builds).
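
Because the NaNs show up only in some runs and only late in the forward switch, it can help to record an energy after each chunk of steps and report the first chunk where the value stops being finite. Below is a minimal sketch in plain Python. The helper name `first_nonfinite_chunk` and the sample values are hypothetical, and the energies are assumed to be collected by the caller, for example from `context.getState(getEnergy=True).getPotentialEnergy()` converted to a plain float.

```python
import math


def first_nonfinite_chunk(energies, chunk_size=2500):
    """Return the step count at the end of the first chunk whose energy is
    NaN or infinite, or None if every recorded energy is finite.

    Hypothetical helper: `energies` holds one float per chunk of
    `chunk_size` integrator steps, in the order the chunks were run.
    """
    for chunk_index, energy in enumerate(energies):
        if not math.isfinite(energy):
            return (chunk_index + 1) * chunk_size
    return None


# Made-up per-chunk potential energies (kJ/mol), one value per 2500 steps
sample_energies = [-31250.4, -31198.7, float("nan"), float("nan")]
print(first_nonfinite_chunk(sample_energies))
```

Knowing the first bad chunk narrows down which part of the lambda schedule is active when the simulation blows up.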

Code:

from perses.tests.test_topology_proposal import generate_atp, generate_dipeptide_top_pos_sys
import pickle

# Generate vanilla htf
atp, system_generator = generate_atp(phase='solvent')
htf = generate_dipeptide_top_pos_sys(atp.topology,
                                     'THR',
                                     atp.system,
                                     atp.positions,
                                     system_generator,
                                     conduct_htf_prop=True,
                                     validate_endstate_energy=False)

with open("atp_solvent_vanilla.pickle", "wb") as f:
    pickle.dump(htf, f)

import os
import pickle

import openmm
from openmm import unit
from openmmtools.integrators import PeriodicNonequilibriumIntegrator

# Define simulation parameters
nsteps_eq = 25000 # 100 ps
nsteps_neq = int(250000) # 1 ns
neq_splitting='V R H O R V'
timestep = 4.0 * unit.femtosecond
platform_name = 'CUDA'
temperature = 300.0 * unit.kelvin

# Define lambda functions
x = 'lambda'
ALCHEMICAL_FUNCTIONS = {
    'lambda_sterics_core': x,
    'lambda_electrostatics_core': x,
    'lambda_sterics_insert': f"select(step({x} - 0.5), 1.0, 2.0 * {x})",
    'lambda_sterics_delete': f"select(step({x} - 0.5), 2.0 * ({x} - 0.5), 0.0)",
    'lambda_electrostatics_insert': f"select(step({x} - 0.5), 2.0 * ({x} - 0.5), 0.0)",
    'lambda_electrostatics_delete': f"select(step({x} - 0.5), 1.0, 2.0 * {x})",
    'lambda_bonds': x,
    'lambda_angles': x,
    'lambda_torsions': x,
}

# Read in vanilla htf
print("reading htf")
with open("atp_solvent_vanilla.pickle", "rb") as f:
    htf = pickle.load(f)
    
positions = htf.hybrid_positions
system = htf.hybrid_system

# Set up integrator
print('setting up integrator')
integrator = PeriodicNonequilibriumIntegrator(ALCHEMICAL_FUNCTIONS, nsteps_eq, nsteps_neq, neq_splitting, timestep=timestep, temperature=temperature)

# Set up context
print("setting up context")
platform = openmm.Platform.getPlatformByName(platform_name)
if platform_name in ['CUDA', 'OpenCL']:
    platform.setPropertyDefaultValue('Precision', 'mixed')
if platform_name in ['CUDA']:
    platform.setPropertyDefaultValue('DeterministicForces', 'true')
context = openmm.Context(system, integrator, platform)
context.setPeriodicBoxVectors(*htf.hybrid_system.getDefaultPeriodicBoxVectors())
context.setPositions(positions)
context.setVelocitiesToTemperature(temperature)

# Minimize
print("minimize")
openmm.LocalEnergyMinimizer.minimize(context)

# Run eq at lambda = 0 (before the forward switch)
print("running eq")
integrator.step(nsteps_eq)

# Run neq forward (0 -> 1)
print("running neq")
energies = []
positions_old = []
positions_new = []
for fwd_step in range(nsteps_neq // 2500):
    integrator.step(2500)
    # Report the number of neq steps finished so far
    print(f"Forward neq: {(fwd_step + 1) * 2500} completed")
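
The `ALCHEMICAL_FUNCTIONS` in the script above split the switch into two halves: `lambda_sterics_insert` and `lambda_electrostatics_delete` ramp during the first half of `lambda`, while `lambda_sterics_delete` and `lambda_electrostatics_insert` ramp during the second half. To make this concrete, here is a plain-Python sketch that mimics OpenMM's `step` (0 for negative arguments, 1 otherwise) and `select` (the third argument when the first is 0, the second otherwise). The names `first_half_ramp` and `second_half_ramp` are hypothetical and exist only for this illustration.

```python
def step(x):
    """Mimic OpenMM's step(): 0 if x < 0, 1 otherwise."""
    return 0.0 if x < 0 else 1.0


def select(condition, if_nonzero, if_zero):
    """Mimic OpenMM's select(x, y, z): z if x == 0, y otherwise."""
    return if_zero if condition == 0 else if_nonzero


def first_half_ramp(lam):
    # select(step(lambda - 0.5), 1.0, 2.0 * lambda)
    return select(step(lam - 0.5), 1.0, 2.0 * lam)


def second_half_ramp(lam):
    # select(step(lambda - 0.5), 2.0 * (lambda - 0.5), 0.0)
    return select(step(lam - 0.5), 2.0 * (lam - 0.5), 0.0)


# Tabulate both schedules at a few values of the master lambda
for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(lam, first_half_ramp(lam), second_half_ramp(lam))
```

Since the NaNs are reported toward the end of forward switching, the second-half terms are the ones still changing at that point.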

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

1 reaction
ijpulidos commented, Jan 20, 2022

I can confirm that backporting the averaging of atom masses fixes the issue with the code in this thread.

Surprisingly, I’m also using the same fix with my benchmark repex tyk2 sims, and they seem to be more stable with a 4 fs timestep. They are still running (it will take another day or so to get conclusive results), but I think we can already merge the associated PR that fixes this issue. I will review that one ASAP.
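
The patch itself is not shown in this thread. Purely as an illustration, and assuming that "average of atom masses" means each atom mapped between the old and new systems receives the arithmetic mean of its two masses, the idea could be sketched as below. The function `average_mapped_masses`, its arguments, and the mass values are all hypothetical and are not taken from perses.

```python
def average_mapped_masses(old_masses, new_masses, old_to_new):
    """Hypothetical helper: for each old-system atom index that maps to a
    new-system atom index, return the mean of the two masses.

    This is an assumption about what the fix does, not the perses code.
    """
    return {
        old_idx: 0.5 * (old_masses[old_idx] + new_masses[new_idx])
        for old_idx, new_idx in old_to_new.items()
    }


# Illustrative masses in amu: old atom 0 is a hydrogen mapped onto a carbon
old_masses = [1.008, 12.011]
new_masses = [12.011, 12.011]
hybrid_masses = average_mapped_masses(old_masses, new_masses, {0: 0, 1: 1})
print(hybrid_masses)
```

Under this assumption, a mapped atom whose element changes gets one fixed intermediate mass for the whole switch instead of keeping the mass from only one endpoint.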

0 reactions
dominicrufa commented, Jan 19, 2022

From devs meeting:

  • Does backporting the atom mass fix help? @zhang-ivy
  • For tyk2 inspect maps and system.xml between old and new perses versions. @dominicrufa

For the life of me, I can’t figure out what the NaN problem is when I compare the newly generated tyk2 hybrid system XML with an older version (which successfully runs repex without NaNs). The system XML lives at /warm/chodera/brucemah/relative_paper2/Tyk2-ANI/lig14to8/xml/complex-hybrid-system.gz on lilac. The old and new systems for complex/solvent also live there.

The full pickled HybridTopologyFactory lives at /warm/chodera/brucemah/relative_paper2/Tyk2-ANI/lig14to8/outhybrid_factory.npy.npz
