Error when using pytorch lightning

I am trying to define an ANI model along with the AEVComputer module (with CUDA enabled) inside a PyTorch Lightning module, but I am getting the following error:

RuntimeError: coordinates, species, and aev_params should be on the same device

I have seen that some of the parameters are registered as buffers, but some are not. Please let me know what you think.

Kev
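
For context on the buffer remark above, here is a minimal sketch of why the distinction matters. It is not torchani code: the `Toy` module and its attribute names are hypothetical. `nn.Module.to()` converts parameters and registered buffers but skips plain tensor attributes, so an unregistered tensor can be left behind when the module is moved. A dtype cast is used here as a CPU-only stand-in for a device move, on the assumption that no GPU is available; `.to("cuda")` treats the two kinds of attribute the same way.

```python
import torch
from torch import nn


class Toy(nn.Module):
    """Hypothetical module for illustration only (not part of torchani)."""

    def __init__(self):
        super().__init__()
        # Registered buffer: tracked by the module, so .to() converts it.
        self.register_buffer("tracked", torch.ones(3))
        # Plain tensor attribute: not tracked, so .to() leaves it untouched.
        self.untracked = torch.ones(3)


toy = Toy()
# Dtype cast as a CPU-only stand-in for a device move such as .to("cuda").
toy.to(torch.float64)

print(toy.tracked.dtype)    # torch.float64
print(toy.untracked.dtype)  # torch.float32
```

If the tensors named in the error are held as plain attributes, a framework that moves the module for you would leave them on the original device. That is one possible cause of the mismatch, not a confirmed diagnosis.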

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

1 reaction
yueyericardo commented, Jan 27, 2022

Hi, the error comes from the openmm-ml wrapper. A temporary fix that works ONLY on GPU can be found at: https://github.com/yueyericardo/openmm-ml/commit/1d1d3f24f40becdcd8a36431c8d0900d98eb1304#diff-911692ca194bf903c77d038662969ad3277dcf2fa8b3b3048d95a5aa3af59de1

It uses the cuaev extension (use_cuda_extension) for the AEV calculation, which does not currently support periodic boundary conditions (PBC). If you want to use cuaev, you have to change your script slightly to:

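from openmm.app import PDBFile
from openmmml import MLPotential

pdb = PDBFile("aaa.pdb")
# add this line: it removes the periodic box, since cuaev does not support PBC
pdb.topology.setPeriodicBoxVectors(None)
potential = MLPotential('ani2x')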

Our internal version has some other updates that make it faster, but it is not open source yet. In the meantime, the OpenMM team is building NNPOps for ANI and SchNet; you can track the progress here: "Add example of using NNPOps with openmm-torch?"

Edit: BTW, our conda-forge package includes the latest public build with cuaev; you can install it directly with:

conda install -c conda-forge torchani
0 reactions
kryczko commented, Mar 2, 2022

I am still getting the same error I showed above when using an ANI model within PyTorch Lightning. Any ideas on how to fix it?
