Trying to train on gen2 dataset
Hi,
I am trying to overfit espaloma to a small batch from the gen2 dataset. I noticed that the reference energy `u_ref` is on a large negative scale:
```python
import torch
import espaloma as esp

ds = esp.data.dataset.GraphDataset.load("gen2")
ds = ds[:10]
ds.shuffle(seed=2666)
ds_tr, ds_vl, ds_te = ds.split([5, 3, 2])
ds_tr_loader = ds_tr.view(batch_size=1, shuffle=True)
ds_vl_loader = ds_vl.view(batch_size=1, shuffle=True)

g_tr = next(iter(ds_tr.view(batch_size=1)))
torch.mean(g_tr.nodes["g"].data["u_ref"], dim=1)
# tensor([-1988.4373])
```
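The absolute offset may matter less than it looks; what the fit has to capture is the spread of energies across conformations of the same molecule, not the constant shift. A quick diagnostic of my own (not part of the original issue) would compare the two:

```python
# Diagnostic sketch (mine, not from the issue): compare the absolute offset of
# u_ref with its spread across the conformations of one molecule.
u_ref = g_tr.nodes["g"].data["u_ref"]                            # shape (1, n_confs)
offset = u_ref.mean(dim=-1)                                      # large negative constant
spread = (u_ref - u_ref.mean(dim=-1, keepdim=True)).std(dim=-1)  # per-molecule spread
print(offset, spread)
```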
(Side note: when I try to increase the batch size, I get the following error.)
```
Expect all graphs to have the same schema on nodes["g"].data, but graph 1 got
{'u_openff-1.2.0': Scheme(shape=(29,), dtype=torch.float32), 'u_gaff-2.11': Scheme(shape=(29,), dtype=torch.float32), 'u_qm': Scheme(shape=(29,), dtype=torch.float32), 'u_ref': Scheme(shape=(29,), dtype=torch.float32), 'u_gaff-1.81': Scheme(shape=(29,), dtype=torch.float32)}
which is different from
{'u_openff-1.2.0': Scheme(shape=(77,), dtype=torch.float32), 'u_gaff-2.11': Scheme(shape=(77,), dtype=torch.float32), 'u_qm': Scheme(shape=(77,), dtype=torch.float32), 'u_ref': Scheme(shape=(77,), dtype=torch.float32), 'u_gaff-1.81': Scheme(shape=(77,), dtype=torch.float32)}.
```
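That error comes from DGL's batching: all graphs in a batch must share the same data schema, and here the two molecules carry different numbers of conformations (29 vs 77), so their per-conformation energy tensors have different shapes. One rough workaround, sketched here by me and assuming energies sit in `nodes["g"].data` with shape `(1, n_confs)` and coordinates in `nodes["n1"].data["xyz"]` with shape `(n_atoms, n_confs, 3)`, is to truncate every molecule to a common number of snapshots before building the batched view:

```python
# Sketch only, not from the original issue or the espaloma docs: trim every
# per-conformation tensor to a common number of snapshots so DGL can batch.
def trim_conformations(heterograph, n_confs):
    for key, value in list(heterograph.nodes["g"].data.items()):
        heterograph.nodes["g"].data[key] = value[:, :n_confs]
    xyz = heterograph.nodes["n1"].data["xyz"]
    heterograph.nodes["n1"].data["xyz"] = xyz[:, :n_confs, :]
    return heterograph
```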
The model that I am using is initialized as follows:
```python
representation = esp.nn.Sequential(
    layer=esp.nn.layers.dgl_legacy.gn("SAGEConv"),   # use the SAGEConv implementation in DGL
    config=[128, "relu", 128, "relu", 128, "relu"],  # 3 layers, 128 units, ReLU activation
)

readout = esp.nn.readout.janossy.JanossyPooling(
    in_features=128,
    config=[128, "relu", 128, "relu", 128, "relu"],
    out_features={                   # define modular MM parameters Espaloma will assign
        1: {"e": 1, "s": 1},         # atom hardness and electronegativity
        2: {"log_coefficients": 2},  # bond linear combination, enforce positive
        3: {"log_coefficients": 2},  # angle linear combination, enforce positive
        4: {"k": 6},                 # torsion barrier heights (can be positive or negative)
    },
)

espaloma_model = torch.nn.Sequential(
    representation,
    readout,
    esp.nn.readout.janossy.ExpCoefficients(),
    esp.mm.geometry.GeometryInGraph(),
    esp.mm.energy.EnergyInGraph(),
    # esp.mm.energy.EnergyInGraph(suffix="_ref"),
    esp.nn.readout.charge_equilibrium.ChargeEquilibrium(),
)
```
Now I am trying to overfit with the following training loop:
```python
from tqdm import tqdm

normalize = esp.data.normalize.ESOL100LogNormalNormalize()

# loss_fn is defined below; the optimizer is created beforehand
# (see the sketch after this snippet).
for idx_epoch in tqdm(range(2000)):
    intr_loss = 0
    k = 0
    for g in ds_tr_loader:
        optimizer.zero_grad()
        if torch.cuda.is_available():
            g = g.to("cuda:0")
        g = espaloma_model(g)
        g = normalize.unnorm(g)
        loss = loss_fn(g)
        loss.backward()
        optimizer.step()
        intr_loss += loss.item()
```
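The issue does not show how the optimizer is created; presumably it is something along these lines (the choice of optimizer and learning rate here are placeholders of mine, not taken from the original post):

```python
# Assumed setup, not shown in the original issue.
optimizer = torch.optim.Adam(espaloma_model.parameters(), lr=1e-4)
```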
I am using the following loss function:
```python
loss_fn = esp.metrics.GraphMetric(
    base_metric=torch.nn.MSELoss(),
    between=["u", "u_ref"],
    level="g",
)
```
After training, the training-loss curve (epochs on the x-axis) gets stuck at ~1.4M, whereas you would expect it to be close to 0, since I am training on only 5 examples. The energy predicted for an individual example converges to a small positive value:
```python
g_tr = espaloma_model(g_tr)
g_tr.nodes["g"].data["u"]
# 1.2619
```
If I do the same on the pepconf dataset (peptides), I get similar results; the output of espaloma is on a different scale.
My question is: what am I doing wrong? Is it the model architecture? The normalizer? Or something else? I would appreciate any help.
Thanks!
Top GitHub Comments
The updated Colab notebook should work! http://data.wangyq.net/esp_notesbooks/qm_fitting.ipynb
Essentially the energies need to be centered before calculating the error.
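That explains the plateau above: with reference energies around -2000 and predictions near zero, a plain MSE on absolute energies is dominated by the constant offset and cannot approach zero, no matter how well the relative conformational energies are fit. A minimal sketch of what centering could look like (my own illustration, not the exact code from the notebook) is to subtract each molecule's mean energy across its conformations from both the prediction and the reference before computing the error:

```python
# Illustrative centering sketch (not the notebook's exact implementation).
u = g.nodes["g"].data["u"]          # predicted MM energies, shape (1, n_confs)
u_ref = g.nodes["g"].data["u_ref"]  # reference energies, shape (1, n_confs)

u_centered = u - u.mean(dim=-1, keepdim=True)
u_ref_centered = u_ref - u_ref.mean(dim=-1, keepdim=True)

loss = torch.nn.functional.mse_loss(u_centered, u_ref_centered)
```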
Reopening this issue to make sure we address the most recent comment!