Error when using PyTorch Lightning
See original GitHub issue

I am trying to define an ANI model along with the AEVComputer module (with CUDA enabled) inside a PyTorch LightningModule, but I am getting the following error:
RuntimeError: coordinates, species, and aev_params should be on the same device
I have seen that some of the parameters are registered as buffers, but some are not. Please let me know what you think.
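This is a hedged, minimal sketch of the buffer-registration pattern the issue describes (the class and tensor names here are hypothetical, not TorchANI's actual attributes): tensors stored as plain module attributes are not moved by `Module.to(device)` (which Lightning calls to place the model on the GPU), while tensors registered via `register_buffer` are.

```python
import torch

class AEVLike(torch.nn.Module):
    """Hypothetical module illustrating the device-mismatch pattern."""
    def __init__(self):
        super().__init__()
        # Registered buffer: follows .to(device) and appears in state_dict.
        self.register_buffer("EtaR", torch.tensor([16.0]))
        # Plain attribute: stays on its original device (the bug pattern).
        self.ShfR = torch.tensor([0.9])

mod = AEVLike()
# On a CUDA machine, mod.to("cuda") would move EtaR but leave ShfR on the CPU,
# which is how mixed-device errors like the one above can arise.
print("EtaR" in dict(mod.named_buffers()))  # buffer is tracked by the module
print("ShfR" in dict(mod.named_buffers()))  # plain tensor attribute is not
```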
Kev
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (2 by maintainers)
Hi, the error comes from the openmm-ml wrapper. A temporarily fixed version that works ONLY on GPU can be found at: https://github.com/yueyericardo/openmm-ml/commit/1d1d3f24f40becdcd8a36431c8d0900d98eb1304#diff-911692ca194bf903c77d038662969ad3277dcf2fa8b3b3048d95a5aa3af59de1
It uses cuaev (use_cuda_extension) for the AEV calculation, but cuaev currently does not support PBC, so if you want to use it you have to change your script slightly. Our internal version has some other updates to make it faster, but it is not open source yet. In the meantime, the OpenMM team is building NNPOps for ANI and SchNet; you can track the progress in the issue "Add example of using NNPOps with openmm-torch?"
Edit: BTW, our conda-forge package includes the latest public build with cuaev; you could install it directly by
I am still getting the same issue I showed above while using an ANI model within pytorch lightning. Any ideas how to fix it?
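Until the upstream buffers are fixed, one possible workaround is to re-register any plain tensor attributes of the wrapped module as buffers, so Lightning's device placement moves them along with everything else. This is a hedged sketch, not TorchANI's API; `promote_tensors_to_buffers` and the `ShfR` attribute are hypothetical names used for illustration.

```python
import torch

def promote_tensors_to_buffers(module: torch.nn.Module) -> None:
    """Re-register plain tensor attributes as non-persistent buffers
    so that module.to(device) moves them (hypothetical helper)."""
    for name, value in list(vars(module).items()):
        if isinstance(value, torch.Tensor):
            delattr(module, name)
            module.register_buffer(name, value, persistent=False)

class Wrapped(torch.nn.Module):
    """Stand-in for a module that keeps a tensor as a plain attribute."""
    def __init__(self):
        super().__init__()
        self.ShfR = torch.tensor([0.9])  # not moved by .to(device)

m = Wrapped()
promote_tensors_to_buffers(m)
print("ShfR" in dict(m.named_buffers()))  # now tracked as a buffer
```

With the tensor promoted to a buffer, `m.to("cuda")` (or Lightning's automatic placement) should carry it to the same device as the rest of the model; `persistent=False` keeps it out of checkpoints, matching its role as a fixed constant.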