
Error with VGAE encoder

See original GitHub issue

šŸ› Bug

The error seems to occur in the encoder:

import torch
import torch_geometric.transforms as T
from torch_geometric.data import Data
from torch_geometric.nn import VGAE

data = Data(x=features, edge_index=A)

transform = T.RandomLinkSplit(split_labels=True)
train_data, val_data, test_data = transform(data)
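# With split_labels=True, each split keeps its supervision edges in separate
# pos_edge_label_index / neg_edge_label_index attributes instead of a single
# edge_label_index with labels.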

model = VGAE(Encoder(num_features, out_channels))  

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
x = data.x.to(device)
pos_edge_label_index = train_data.pos_edge_label_index.to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

def train():
    model.train()
    optimizer.zero_grad()
    print(x)
    print(train_data.pos_edge_label_index)
    z = model.encode(x, train_data.pos_edge_label_index)
    print(z)
    loss = model.recon_loss(z, train_data.pos_edge_label_index, train_data.neg_edge_label_index)
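    # recon_loss is the reconstruction term over the given positive and
    # (explicitly passed) negative edges; the kl_loss term below is weighted
    # by 1 / num_nodes, as in the official PyG autoencoder example.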
    
    loss = loss + (1 / data.num_nodes) * model.kl_loss()  
    loss.backward()
    optimizer.step()
    return float(loss)

def test(pos_edge_label_index, neg_edge_label_index):
    model.eval()
    with torch.no_grad():
        z = model.encode(x, train_data.pos_edge_label_index)
        results = model.test(z, pos_edge_label_index, neg_edge_label_index)
    return results[0], results[1], z

Z = []
AUC = []
AP = []
for epoch in range(1, epochs + 1):
    loss = train()
    auc, ap, z = test(test_data.pos_edge_label_index, test_data.neg_edge_label_index)
    Z.append(z)
    AUC.append(auc)
    AP.append(ap)
    print('Epoch: {:03d}, AUC: {:.4f}, AP: {:.4f}'.format(epoch, auc, ap))
    print(z)
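The Encoder class referenced above isn't included in the snippet; judging from the traceback further down (conv1, conv_mu, and conv_logvar built on GCNConv), it presumably looks something like the sketch below, where the 2 * out_channels hidden size is an assumption:

import torch
from torch_geometric.nn import GCNConv

class Encoder(torch.nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # shared hidden layer, then separate heads for the mean and log-variance
        self.conv1 = GCNConv(in_channels, 2 * out_channels)
        self.conv_mu = GCNConv(2 * out_channels, out_channels)
        self.conv_logvar = GCNConv(2 * out_channels, out_channels)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        return self.conv_mu(x, edge_index), self.conv_logvar(x, edge_index)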

Expected behavior

I expect the model to run and return the latent embeddings (z), auc, and ap

I get this error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-48-1ac11cf2e261> in <module>
      3 AP = []
      4 for epoch in range(1, epochs + 1):
----> 5     loss = train()
      6     auc, ap, z = test(test_data.pos_edge_label_index, test_data.neg_edge_label_index)
      7     Z.append(z)

<ipython-input-47-3aaa6b2822ab> in train()
      4     print(x)
      5     print(train_data.pos_edge_label_index)
----> 6     z = model.encode(x, train_data.pos_edge_label_index)
      7     print(z)
      8     loss = model.recon_loss(z, train_data.pos_edge_label_index, train_data.neg_edge_label_index)

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch_geometric/nn/models/autoencoder.py in encode(self, *args, **kwargs)
    154     def encode(self, *args, **kwargs):
    155         """"""
--> 156         self.__mu__, self.__logstd__ = self.encoder(*args, **kwargs)
    157         self.__logstd__ = self.__logstd__.clamp(max=MAX_LOGSTD)
    158         z = self.reparametrize(self.__mu__, self.__logstd__)

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

<ipython-input-17-d1f128ad6581> in forward(self, x, edge_index)
      7 
      8     def forward(self, x, edge_index):
----> 9         x = self.conv1(x, edge_index).relu()
     10         return self.conv_mu(x, edge_index), self.conv_logvar(x, edge_index)

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch_geometric/nn/conv/gcn_conv.py in forward(self, x, edge_index, edge_weight)
    162                     edge_index, edge_weight = gcn_norm(  # yapf: disable
    163                         edge_index, edge_weight, x.size(self.node_dim),
--> 164                         self.improved, self.add_self_loops)
    165                     if self.cached:
    166                         self._cached_edge_index = (edge_index, edge_weight)

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch_geometric/nn/conv/gcn_conv.py in gcn_norm(edge_index, edge_weight, num_nodes, improved, add_self_loops, dtype)
     60 
     61         row, col = edge_index[0], edge_index[1]
---> 62         deg = scatter_add(edge_weight, col, dim=0, dim_size=num_nodes)
     63         deg_inv_sqrt = deg.pow_(-0.5)
     64         deg_inv_sqrt.masked_fill_(deg_inv_sqrt == float('inf'), 0)

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch_scatter/scatter.py in scatter_add(src, index, dim, out, dim_size)
     27                 out: Optional[torch.Tensor] = None,
     28                 dim_size: Optional[int] = None) -> torch.Tensor:
---> 29     return scatter_sum(src, index, dim, out, dim_size)
     30 
     31 

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch_scatter/scatter.py in scatter_sum(src, index, dim, out, dim_size)
     19             size[dim] = int(index.max()) + 1
     20         out = torch.zeros(size, dtype=src.dtype, device=src.device)
---> 21         return out.scatter_add_(dim, index, src)
     22     else:
     23         return out.scatter_add_(dim, index, src)

RuntimeError: index 1269 is out of bounds for dimension 0 with size 913

The index changes if I restart my kernel (received 1100, 1046, etc.), so I don't really know where it's coming from either.
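This RuntimeError means one of the edge tensors contains a node id greater than or equal to the number of feature rows, x.size(0) = 913 here. A minimal sanity check, assuming the data and train_data objects built above, might be:

# Quick bounds check: every edge endpoint must be a valid row of data.x
num_nodes = data.x.size(0)
for name, idx in [('edge_index', data.edge_index),
                  ('train pos', train_data.pos_edge_label_index),
                  ('train neg', train_data.neg_edge_label_index)]:
    print(name, int(idx.max()), 'vs num_nodes', num_nodes)
    assert int(idx.max()) < num_nodes, f'{name} references a non-existent node'

If the assertion already fails on data.edge_index, the feature matrix and the edge list were built with inconsistent node numberings, which matches the cause described in the comments below; the exact index that trips first can vary between runs because RandomLinkSplit splits the edges randomly.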

Environment

  • OS: macOS 11.5.1
  • Python version: 3.7.10
  • PyTorch version: 1.9.0
  • CUDA/cuDNN version: ?
  • GCC version: Apple clang version 12.0.5 (clang-1205.0.22.9)?
  • Any other relevant information:

Additional context

I am getting this problem when running the model on my full dataset. Oddly, it works perfectly on a subset of my full data as well as on the CiteSeer dataset, but as soon as I change it to my full data (without modifying anything in the code) it stops working.

The shape of my train_data is: Data(x=[913, 1346], edge_index=[2, 18645], pos_edge_label=[18645], pos_edge_label_index=[2, 18645], neg_edge_label=[18645], neg_edge_label_index=[2, 18645])

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
snknitin commented, Jan 17, 2022

Yes, thank you. Fortunately, I found my error.

I was building my own InMemoryDataset from .csv files and kept getting RuntimeError: index 846 is out of bounds for dimension 0 with size 845 when passing the data through any model. There were 846 nodes, but since I had to explicitly specify data[node_type].num_nodes = data[node_type].x.size(0), it picked 845. It turned out I hadn't done a reset_index(drop=True) after dropping duplicates in my node-features dataframe, which created an idx mapping up to 846 in the (src, dst) edge mapping; on top of that, the CSV file had no header, so its first row was consumed as column names when reading the dataframe, leaving only 845 nodes.
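A minimal sketch of the two fixes described above, with hypothetical file and column names:

import pandas as pd

# The node-features CSV has no header row, so header=None keeps the first node
# from being swallowed as column names (one of the two problems above).
node_df = pd.read_csv('node_features.csv', header=None)

# After dropping duplicates, reset the index so node ids stay contiguous
# (0 .. num_nodes - 1) and match the (src, dst) ids used in the edge list.
node_df = node_df.drop_duplicates().reset_index(drop=True)

# num_nodes can then be set consistently from the feature matrix, e.g.
# data[node_type].num_nodes = data[node_type].x.size(0)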

I think the last two assert statements in your message are missing a ] before the .max(). Also, shouldn't it be for edge_type in data.metadata()[1] for HeteroData to get the edge types?
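For context, data.metadata() on a HeteroData object returns a (node_types, edge_types) tuple, so the per-edge-type bounds check being discussed would look roughly like this (a sketch, not the exact asserts from the earlier reply, which isn't shown on this page):

node_types, edge_types = data.metadata()  # edge_types == data.metadata()[1]
for edge_type in edge_types:
    src_type, _, dst_type = edge_type
    edge_index = data[edge_type].edge_index
    assert int(edge_index[0].max()) < data[src_type].num_nodes
    assert int(edge_index[1].max()) < data[dst_type].num_nodes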

0 reactions
rusty1s commented, Jan 17, 2022

Thanks, I fixed my previous reply.

Read more comments on GitHub >

Top Results From Across the Web

  • possible mistake in examples/autoencoder.py #264 - GitHub
    z = model.encode(x, edge_index) loss = model.recon_loss(z, data.train_pos_edge_index) if args.model in ['VGAE']:
  • Tutorial on Variational Graph Auto-Encoders | by Fanghao Han
    Variational graph autoencoder (VGAE) applies the idea of VAE on graph-structured data, which significantly improves predictive performance...
  • Error in Pytorch Geometric, graph variational autoencoder
    The error occurs while running this code, and it happens at the z = model.encode(x, train_pos_edge_index) line
  • Autoencoders - Pytorch Geometric Tutorial
    Graph Autoencoder (GAE) and Variational Graph Autoencoder (VGAE)... In this tutorial, we present the theory behind Autoencoders, then we show how...
  • Variational Graph Autoencoder loss get nan (tensorflow...)
    I'm implementing VGAE (Variational Graph Autoencoder) in tensorflow. The article has an image of VGAE. I put the VGAE network architecture...
