Hi, is it possible to build my own dataset via torchani? Is it necessary to use the built-in ANI dataset to train models? I did not find any information in the documentation.

Thanks.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
zasdfgbnm commented, Nov 5, 2019

@njzjz You can have your own dataset in any format. However, our dataset loader only supports the HDF5 format we use, so, as @farhadrgh mentioned, your files should have the same keys as ours. Using our data loader is not mandatory, though: you can always load your data in your favorite way, convert it to PyTorch tensors, and feed it to your training pipeline.
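As a rough illustration of the "load it yourself and convert to PyTorch tensors" route, here is a minimal sketch. The water geometries and energies are made-up placeholder values, the choice of atomic numbers for species and the dtypes are assumptions, and what your model actually expects as input may differ. It assumes every conformation has the same number of atoms, so no padding is needed.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical data: two conformations of a water molecule (O, H, H).
# All numbers are placeholders for illustration, not real reference data.
species = [[8, 1, 1], [8, 1, 1]]  # shape: (conformations, atoms)
coordinates = [
    [[0.00, 0.00, 0.12], [0.00, 0.76, -0.47], [0.00, -0.76, -0.47]],
    [[0.00, 0.00, 0.13], [0.00, 0.77, -0.48], [0.00, -0.77, -0.48]],
]  # shape: (conformations, atoms, 3)
energies = [-76.40, -76.39]  # shape: (conformations,)

# Convert the plain Python lists to tensors.
species_t = torch.tensor(species, dtype=torch.long)
coordinates_t = torch.tensor(coordinates, dtype=torch.float32)
energies_t = torch.tensor(energies, dtype=torch.float64)

# Wrap the tensors so a standard DataLoader can batch them.
dataset = TensorDataset(species_t, coordinates_t, energies_t)
loader = DataLoader(dataset, batch_size=2, shuffle=False)

for batch_species, batch_coordinates, batch_energies in loader:
    # Each batch is a tuple of tensors, ready for a training step.
    print(batch_species.shape, batch_coordinates.shape, batch_energies.shape)
```

Molecules with different atom counts would need padding or per-molecule batching, which this sketch does not cover.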

1 reaction
farhadrgh commented, Nov 5, 2019

Yes, you can create your own dataset (not with TorchANI, though) and use it for training, as long as each entry has the same keys as the sample HDF5 datasets. Creating the file should be done with a library like h5py.

The required keys for compatible datasets are: coordinates, species, energies, and forces.
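Before writing entries to a file, it can help to check that each one carries the four keys listed above. The following is a minimal sketch using plain dictionaries; the helper name `missing_keys` and the sample entries are hypothetical, and it checks only key names, not shapes, dtypes, or units.

```python
# Keys named in this thread as required for compatible datasets.
REQUIRED_KEYS = ("coordinates", "species", "energies", "forces")


def missing_keys(entry):
    """Return the required keys that are absent from a dataset entry."""
    return [key for key in REQUIRED_KEYS if key not in entry]


# Hypothetical entries with placeholder values, for illustration only.
complete = {
    "coordinates": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.74]],
    "species": ["H", "H"],
    "energies": [-1.17],
    "forces": [[0.0, 0.0, 0.01], [0.0, 0.0, -0.01]],
}
incomplete = {
    "coordinates": [[0.0, 0.0, 0.0]],
    "species": ["H"],
}

print(missing_keys(complete))    # []
print(missing_keys(incomplete))  # ['energies', 'forces']
```

The same check works on an opened h5py group, since groups also support the `in` operator for member names.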
