
Error with VGAE encoder

See original GitHub issue

šŸ› Bug

The error seems to occur in the encoder:

import torch
import torch_geometric.transforms as T
from torch_geometric.data import Data
from torch_geometric.nn import VGAE

data = Data(x=features, edge_index=A)

transform = T.RandomLinkSplit(split_labels=True)
train_data, val_data, test_data = transform(data)
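# With split_labels=True, each split keeps its supervision edges in separate
# pos_edge_label_index / neg_edge_label_index attributes instead of a single
# edge_label_index with labels.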

model = VGAE(Encoder(num_features, out_channels))  

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
x = data.x.to(device)
pos_edge_label_index = train_data.pos_edge_label_index.to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

def train():
    model.train()
    optimizer.zero_grad()
    print(x)
    print(train_data.pos_edge_label_index)
    z = model.encode(x, train_data.pos_edge_label_index)
    print(z)
    loss = model.recon_loss(z, train_data.pos_edge_label_index, train_data.neg_edge_label_index)
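    # recon_loss is the reconstruction term over the given positive and
    # (explicitly passed) negative edges; the kl_loss term below is weighted
    # by 1 / num_nodes, as in the official PyG autoencoder example.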
    
    loss = loss + (1 / data.num_nodes) * model.kl_loss()  
    loss.backward()
    optimizer.step()
    return float(loss)

def test(pos_edge_label_index, neg_edge_label_index):
    model.eval()
    with torch.no_grad():
        z = model.encode(x, train_data.pos_edge_label_index)
        results = model.test(z, pos_edge_label_index, neg_edge_label_index)
    return results[0], results[1], z

Z = []
AUC = []
AP = []
for epoch in range(1, epochs + 1):
    loss = train()
    auc, ap, z = test(test_data.pos_edge_label_index, test_data.neg_edge_label_index)
    Z.append(z)
    AUC.append(auc)
    AP.append(ap)
    print('Epoch: {:03d}, AUC: {:.4f}, AP: {:.4f}'.format(epoch, auc, ap))
    print(z)
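The Encoder class referenced above isn't included in the snippet; judging from the traceback further down (conv1, conv_mu, and conv_logvar built on GCNConv), it presumably looks something like the sketch below, where the 2 * out_channels hidden size is an assumption:

import torch
from torch_geometric.nn import GCNConv

class Encoder(torch.nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # shared hidden layer, then separate heads for the mean and log-variance
        self.conv1 = GCNConv(in_channels, 2 * out_channels)
        self.conv_mu = GCNConv(2 * out_channels, out_channels)
        self.conv_logvar = GCNConv(2 * out_channels, out_channels)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        return self.conv_mu(x, edge_index), self.conv_logvar(x, edge_index)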

Expected behavior

I expect the model to run and return the latent embeddings (z), auc, and ap

I get this error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-48-1ac11cf2e261> in <module>
      3 AP = []
      4 for epoch in range(1, epochs + 1):
----> 5     loss = train()
      6     auc, ap, z = test(test_data.pos_edge_label_index, test_data.neg_edge_label_index)
      7     Z.append(z)

<ipython-input-47-3aaa6b2822ab> in train()
      4     print(x)
      5     print(train_data.pos_edge_label_index)
----> 6     z = model.encode(x, train_data.pos_edge_label_index)
      7     print(z)
      8     loss = model.recon_loss(z, train_data.pos_edge_label_index, train_data.neg_edge_label_index)

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch_geometric/nn/models/autoencoder.py in encode(self, *args, **kwargs)
    154     def encode(self, *args, **kwargs):
    155         """"""
--> 156         self.__mu__, self.__logstd__ = self.encoder(*args, **kwargs)
    157         self.__logstd__ = self.__logstd__.clamp(max=MAX_LOGSTD)
    158         z = self.reparametrize(self.__mu__, self.__logstd__)

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

<ipython-input-17-d1f128ad6581> in forward(self, x, edge_index)
      7 
      8     def forward(self, x, edge_index):
----> 9         x = self.conv1(x, edge_index).relu()
     10         return self.conv_mu(x, edge_index), self.conv_logvar(x, edge_index)

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch_geometric/nn/conv/gcn_conv.py in forward(self, x, edge_index, edge_weight)
    162                     edge_index, edge_weight = gcn_norm(  # yapf: disable
    163                         edge_index, edge_weight, x.size(self.node_dim),
--> 164                         self.improved, self.add_self_loops)
    165                     if self.cached:
    166                         self._cached_edge_index = (edge_index, edge_weight)

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch_geometric/nn/conv/gcn_conv.py in gcn_norm(edge_index, edge_weight, num_nodes, improved, add_self_loops, dtype)
     60 
     61         row, col = edge_index[0], edge_index[1]
---> 62         deg = scatter_add(edge_weight, col, dim=0, dim_size=num_nodes)
     63         deg_inv_sqrt = deg.pow_(-0.5)
     64         deg_inv_sqrt.masked_fill_(deg_inv_sqrt == float('inf'), 0)

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch_scatter/scatter.py in scatter_add(src, index, dim, out, dim_size)
     27                 out: Optional[torch.Tensor] = None,
     28                 dim_size: Optional[int] = None) -> torch.Tensor:
---> 29     return scatter_sum(src, index, dim, out, dim_size)
     30 
     31 

~/opt/anaconda3/envs/stGCNG/lib/python3.7/site-packages/torch_scatter/scatter.py in scatter_sum(src, index, dim, out, dim_size)
     19             size[dim] = int(index.max()) + 1
     20         out = torch.zeros(size, dtype=src.dtype, device=src.device)
---> 21         return out.scatter_add_(dim, index, src)
     22     else:
     23         return out.scatter_add_(dim, index, src)

RuntimeError: index 1269 is out of bounds for dimension 0 with size 913

The index changes if I restart my kernel (received 1100, 1046, etc.), so I don't really know where it's coming from either.
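This RuntimeError means one of the edge tensors contains a node id greater than or equal to the number of feature rows, x.size(0) = 913 here. A minimal sanity check, assuming the data and train_data objects built above, might be:

# Quick bounds check: every edge endpoint must be a valid row of data.x
num_nodes = data.x.size(0)
for name, idx in [('edge_index', data.edge_index),
                  ('train pos', train_data.pos_edge_label_index),
                  ('train neg', train_data.neg_edge_label_index)]:
    print(name, int(idx.max()), 'vs num_nodes', num_nodes)
    assert int(idx.max()) < num_nodes, f'{name} references a non-existent node'

If the assertion already fails on data.edge_index, the feature matrix and the edge list were built with inconsistent node numberings, which matches the cause described in the comments below; the exact index that trips first can vary between runs because RandomLinkSplit splits the edges randomly.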

Environment

  • OS: macOS 11.5.1
  • Python version: 3.7.10
  • PyTorch version: 1.9.0
  • CUDA/cuDNN version: ?
  • GCC version: Apple clang version 12.0.5 (clang-1205.0.22.9)?
  • Any other relevant information:

Additional context

I am getting this problem when running the model on my full dataset. Oddly, it works perfectly on a subset of my full data as well as on the CiteSeer dataset, but as soon as I change it to my full data (without modifying anything in the code) it stops working.

The shape of my train_data is: Data(x=[913, 1346], edge_index=[2, 18645], pos_edge_label=[18645], pos_edge_label_index=[2, 18645], neg_edge_label=[18645], neg_edge_label_index=[2, 18645])

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
snknitin commented, Jan 17, 2022

Yes, thank you. Fortunately, I found my error.

I was building my own InMemoryDataset from .csv files and kept getting RuntimeError: index 846 is out of bounds for dimension 0 with size 845 when passing the data through any model. There were 846 nodes, but since I had to explicitly specify data[node_type].num_nodes = data[node_type].x.size(0), it picked 845. It turned out I hadn't done a reset_index(drop=True) after dropping duplicates in my node-features dataframe, which created an idx mapping up to 846 in the (src, dst) edge mapping; on top of that, the CSV file had no header, so its first row was consumed as column names when reading the dataframe, leaving only 845 nodes.
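A minimal sketch of the two fixes described above, with hypothetical file and column names:

import pandas as pd

# The node-features CSV has no header row, so header=None keeps the first node
# from being swallowed as column names (one of the two problems above).
node_df = pd.read_csv('node_features.csv', header=None)

# After dropping duplicates, reset the index so node ids stay contiguous
# (0 .. num_nodes - 1) and match the (src, dst) ids used in the edge list.
node_df = node_df.drop_duplicates().reset_index(drop=True)

# num_nodes can then be set consistently from the feature matrix, e.g.
# data[node_type].num_nodes = data[node_type].x.size(0)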

I think the last two assert statements in your message are missing a ] before the .max(). Also, shouldn't it be for edge_type in data.metadata()[1] for HeteroData to get the edge types?
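For context, data.metadata() on a HeteroData object returns a (node_types, edge_types) tuple, so the per-edge-type bounds check being discussed would look roughly like this (a sketch, not the exact asserts from the earlier reply, which isn't shown on this page):

node_types, edge_types = data.metadata()  # edge_types == data.metadata()[1]
for edge_type in edge_types:
    src_type, _, dst_type = edge_type
    edge_index = data[edge_type].edge_index
    assert int(edge_index[0].max()) < data[src_type].num_nodes
    assert int(edge_index[1].max()) < data[dst_type].num_nodes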

0 reactions
rusty1s commented, Jan 17, 2022

Thanks, I fixed my previous reply.

Read more comments on GitHub >

Top Results From Across the Web

  • possible mistake in examples/autoencoder.py #264 - GitHub
    z = model.encode(x, edge_index) loss = model.recon_loss(z, data.train_pos_edge_index) if args.model in ['VGAE']:
  • Tutorial on Variational Graph Auto-Encoders | by Fanghao Han
    Variational graph autoencoder (VGAE) applies the idea of VAE on graph-structured data, which significantly improves predictive performance...
  • Error in Pytorch Geometric, graph variational autoencoder
    The error occurs while running this code, and it happens at the z = model.encode(x, train_pos_edge_index) line
  • Autoencoders - Pytorch Geometric Tutorial
    Graph Autoencoder (GAE) and Variational Graph Autoencoder (VGAE)... In this tutorial, we present the theory behind Autoencoders, then we show how...
  • Variational Graph Autoencoder loss get nan (tensorflow...)
    I'm implementing VGAE (Variational Graph Autoencoder) in tensorflow. The article has an image of VGAE. I put the VGAE network architecture...
