Training on multiple graphs
From now on, we recommend using our discussion forum (https://github.com/rusty1s/pytorch_geometric/discussions) for general questions.
❓ Questions & Help
I am developing a model for a node classification task. I batch multiple graphs into training and testing batches. After I train the model on one batch, I obtain some results that seem suspicious to me.
Let us say, my batch that contains the nodes for the training is given as follows:
```
Batch(batch=[5811], edge_attr=[8340, 1], edge_index=[2, 8340], ptr=[11], test_mask=[5811], train_mask=[5811], val_mask=[5811], x=[5811, 40], y=[5811])
```
It contains 10 graphs, as can be seen from `ptr`.
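The `ptr` vector stores cumulative node offsets, so a `ptr` of length 11 delineates 10 graphs. A minimal sketch of this bookkeeping, with made-up offsets (only the endpoints 0 and 5811 and the length 11 match the batch above):

```python
# Hypothetical cumulative node offsets; in a real Batch these come from
# concatenating the individual graphs.
ptr = [0, 600, 1150, 1760, 2300, 2900, 3500, 4100, 4700, 5300, 5811]

num_graphs = len(ptr) - 1  # a ptr of length 11 means 10 graphs
nodes_per_graph = [ptr[i + 1] - ptr[i] for i in range(num_graphs)]

# The nodes of graph i occupy rows ptr[i] .. ptr[i+1]-1 of the flat node tensor.
assert num_graphs == 10
assert sum(nodes_per_graph) == 5811
```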
Next I train the model:
```python
import numpy as np
import torch
import matplotlib.pyplot as plt

train_epoch = 200
for i in range(len(data_batched.ptr) - 1):
    loss_trained = np.zeros(train_epoch, dtype=float)
    optimizer = torch.optim.Adam(modelGraphConv.parameters(), lr=0.01, weight_decay=5e-4)
    criterion = torch.nn.CrossEntropyLoss()
    for epoch in range(1, train_epoch + 1):
        loss = train(data_batched[i], modelGraphConv)
        loss_trained[epoch - 1] = loss
        print(f'Epoch: {epoch:03d}, Loss: {loss:.4f}')
    print('=========================')
    plt.plot(np.arange(train_epoch), loss_trained)
    plt.show()
```
and the results of the first three training runs are depicted below.
Let us ignore the high loss for now. What confuses me the most is that at the beginning of each training run, the loss jumps back to a value of approximately 2. I would expect it to keep decreasing (or at least remain at the same level), since the multiple graphs I use for training come from the same simulation.
So the question is: am I making a mistake in the code, or am I misunderstanding how neural network training behaves?
Thank you!
Issue Analytics
- State:
- Created 2 years ago
- Comments: 30 (14 by maintainers)
Top GitHub Comments
All right, so I have collapsed my 40-dimensional feature vectors to 2D via t-SNE for one training graph and one test graph, and this is the result.
If I interpret the result correctly, the test set does not look out-of-distribution, since the test and training data are distributed in a very similar manner.
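A crude numeric complement to the visual check is to compare summary statistics of the train and test features. A sketch with made-up 1-D samples (not the real data), assuming similarly distributed sets:

```python
import statistics

# Made-up 1-D samples standing in for one feature dimension of train/test data.
train_x = [0.10, 0.22, 0.35, 0.48, 0.61]
test_x = [0.12, 0.25, 0.33, 0.50, 0.58]

mean_gap = abs(statistics.mean(train_x) - statistics.mean(test_x))
std_gap = abs(statistics.stdev(train_x) - statistics.stdev(test_x))

# Small gaps are consistent with the t-SNE picture: similar distributions.
assert mean_gap < 0.05 and std_gap < 0.05
```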
Essentially, my node feature vector represents coordinates in 2D; before one-hot encoding (i.e., before it is expanded to 40 dimensions), it looks like this.
In other words, the picture above depicts the collected data.
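For concreteness, here is a hypothetical sketch of how a 2-D coordinate could be expanded to 40 dimensions: quantize each coordinate into 20 bins and concatenate the two one-hot vectors. The bin count, value range, and helper name are assumptions, not the actual encoding used above:

```python
NUM_BINS = 20  # assumed: 2 coordinates * 20 bins = 40 features

def one_hot_coord(x, y, lo=0.0, hi=1.0):
    """Quantize (x, y) into NUM_BINS bins each and concatenate the one-hots."""
    def bucket(v):
        # Clamp the top edge so v == hi still falls into the last bin.
        idx = min(int((v - lo) / (hi - lo) * NUM_BINS), NUM_BINS - 1)
        return [1.0 if i == idx else 0.0 for i in range(NUM_BINS)]
    return bucket(x) + bucket(y)

feat = one_hot_coord(0.25, 0.80)
assert len(feat) == 40 and sum(feat) == 2.0  # exactly one hot bin per coordinate
```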
At this moment, I cannot think about anything else I could try with the current dataset. If you got any other ideas by looking at my datasets regarding how to improve accuracy of the model or process/modify/analyze the data, I would widely appreciate if you shared. Otherwise, I thank you very much for helping me and we can close the thread.
The `final_edge_index` will use a threshold of 0.5 to decide whether to include an edge or not, which might be too low in your experiment. To keep only edges with a higher probability, run:
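A hedged sketch of the idea (the `edge_prob` mapping and the 0.9 threshold are illustrative assumptions; in practice the probabilities come from the model's decoder):

```python
# Hypothetical per-edge probabilities; real values would come from the decoder.
edge_prob = {(0, 1): 0.97, (1, 2): 0.55, (2, 3): 0.91, (0, 3): 0.42}

THRESHOLD = 0.9  # stricter than the default 0.5

# Keep only edges whose predicted probability exceeds the threshold.
final_edges = [edge for edge, p in edge_prob.items() if p > THRESHOLD]
assert final_edges == [(0, 1), (2, 3)]
```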