Amber parser should be improved

Hello everyone, I think the AMBER parser needs some adjustments beyond issue #221, which caused the time information to be misread by both the u_nk and dHdl parsers. Below are the fixes I think should be made; in my opinion the best approach would be a complete, properly tested rewrite of the parser:

1. The temperature is a required input for both parsers (extract_dHdl and extract_u_nk), but the system temperature is already read automatically from the output file (via temp0). We could make the T argument optional in the parsers; I suggest not removing it, so that existing user scripts do not break. We should check that the requested temperature T matches the one reported in the AMBER output file and report any inconsistency to the user.
2. Related to point 1): currently, if temp0 is not found we log 'Non-constant temperature MD not currently supported.' I don't think this is right, because temp0 is set even in non-constant-temperature MD, so the check is meaningless. I agree that we could warn users if the simulation was not performed at constant T, but that's not so easy to check, and I don't think a similar check exists in the other parsers (the user should know that the simulation has to be run in the NPT/NVT ensemble).
3. AMBER writes a single output file, with all the information for TI and MBAR reported every ntpr timesteps. I currently see two problems here:
   3a. If the user wants to extract both dHdl and u_nk values, the output file is parsed in full twice, which can waste quite some time.
   3b. If the user only wants dHdl values and the file doesn't contain proper MBAR information, the user is needlessly warned that the MBAR values could not be read.
   I think we should refactor the code to make a single pass through the AMBER output file, extracting just what the user needs in one go. I imagine a main function called extract_dHdl_and_u_nk, which by default returns both dHdl and u_nk, while the current extract_dHdl and extract_u_nk are kept and simply call the main function with an optional switch to extract only what's needed.
4. AMBER can optionally report, every ntave steps, the average of dHdl and other quantities. Right now, when parsing dHdl, we collect these values (along with the different components reported by AMBER), but we don't use this information anywhere in the code. I think we could just skip reading the averages, since we already collect all the individual values (and can then perform better equilibration/subsampling analysis on them).
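Points 1) and 3) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the alchemlyb implementation: the input is a toy line format (temp0 = ..., DV/DL = ..., MBAR <lambda> <energy>) rather than the real AMBER output layout, the helper resolve_temperature and the keyword switches want_dHdl / want_u_nk are hypothetical names, and raising an error on a temperature mismatch is just one possible way to report the inconsistency (a warning would work too).

```python
import math


def resolve_temperature(T, temp0, tol=1e-3):
    """Pick the temperature to use from the user value T and the file value temp0.

    Hypothetical helper: T is optional, and a mismatch with temp0 is reported
    by raising ValueError (a warning would be an equally valid choice).
    """
    if temp0 is None:
        if T is None:
            raise ValueError("temp0 not found in the file and no T was given")
        return T
    if T is None:
        return temp0
    if not math.isclose(T, temp0, abs_tol=tol):
        raise ValueError(f"requested T={T} K differs from temp0={temp0} K in the file")
    return T


def extract_dHdl_and_u_nk(lines, T=None, want_dHdl=True, want_u_nk=True):
    """Single pass over the (toy-format) output; collect only what was requested."""
    temp0 = None
    dHdl = [] if want_dHdl else None
    u_nk = [] if want_u_nk else None
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "temp0":
            temp0 = float(fields[-1])
        elif fields[0] == "DV/DL" and want_dHdl:
            dHdl.append(float(fields[-1]))
        elif fields[0] == "MBAR" and want_u_nk:
            # (lambda, energy) pair in this toy format
            u_nk.append((float(fields[1]), float(fields[2])))
    resolve_temperature(T, temp0)  # consistency check only
    return dHdl, u_nk


def extract_dHdl(lines, T=None):
    """Thin wrapper: same name as today, but MBAR records are never touched."""
    return extract_dHdl_and_u_nk(lines, T, want_u_nk=False)[0]


def extract_u_nk(lines, T=None):
    """Thin wrapper: same name as today, but DV/DL records are never touched."""
    return extract_dHdl_and_u_nk(lines, T, want_dHdl=False)[1]
```

With this layout the existing entry points keep their names, T becomes optional, and a caller who only asks for dHdl never triggers any MBAR-related handling.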

I’m working on implementing those changes, please let me know if you think anything else should be changed (or if I should just not touch anything 😃 )!

Issue Analytics

Created a year ago
Comments: 28 (28 by maintainers)

Top GitHub Comments

xiki-tempula commented, Sep 13, 2022

I'm concerned that the NAMD parser has no dHdl reader, so we cannot have read_dHdl_and_u_nk for NAMD. I think a better approach might be to have extract(file, T) return a dictionary, mimicking the pymbar4 interface, where things are stored as a dictionary. For non-NAMD parsers, extract(file, T) returns {'dHdl': dHdl, 'u_nk': u_nk}; for NAMD, it returns just {'u_nk': u_nk}. That would give us a common interface that also copes with the case where a dHdl parser is not present.
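A minimal sketch of that dictionary-returning interface, assuming each parser module supplies its own reader functions. The factory make_extract and the two stub readers are hypothetical names used only for illustration; only the extract(file, T) signature and the dictionary keys come from the proposal above.

```python
def make_extract(u_nk_reader, dHdl_reader=None):
    """Build an extract(file, T) function from the readers a parser provides.

    Hypothetical helper: a parser without a dHdl reader (NAMD in the
    discussion above) simply omits the 'dHdl' key from the result.
    """
    def extract(file, T):
        data = {"u_nk": u_nk_reader(file, T)}
        if dHdl_reader is not None:
            data["dHdl"] = dHdl_reader(file, T)
        return data
    return extract


# Stub readers standing in for real parsing code (assumptions, not real APIs).
def stub_u_nk_reader(file, T):
    return ("u_nk", file, T)


def stub_dHdl_reader(file, T):
    return ("dHdl", file, T)


# A parser with both readers, and one with u_nk only.
extract_full = make_extract(stub_u_nk_reader, stub_dHdl_reader)
extract_u_nk_only = make_extract(stub_u_nk_reader)
```

Calling code can then test for "dHdl" in the returned dictionary instead of needing a parser-specific function name.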

DrDomenicoMarson commented, Sep 13, 2022

I understand that everything has to be done in very small steps, and I'm sorry I didn't understand this earlier. I thought the AMBER file parsing had been a little bit "left behind" here, as there are many different small issues to address. That's the reason why, a year or so ago, I gave up on using alchemlyb to analyze my AMBER simulations: I didn't have the time to check what was wrong (I couldn't properly concatenate different files, and there was the ntpr problem, among other things).

I thought I could just fix (or try to fix 😃) several things in one go, I'm sorry.

I think it would be better to close this issue as "not planned", and I can open separate issues for each simple thing I can address with a small fix.

For now I'm leaving this open, but I have opened two issues addressing points 1)/2) and 4), leaving this issue just for point 3). Let me know if it's better to close it and open another dedicated issue!
