About Planetoid edge_index utilization

See original GitHub issue

Hi!

Thank you again for the flexible and interesting implementation of SEAL and the possibility of using custom datasets! I really enjoy it!

In the SEAL implementation, you use the Planetoid dataset. May I ask why you replace edge_index with split_edge['train']['edge'].t() (cf. the edge index line)? Is this line specific to the Planetoid dataset?

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6

Top GitHub Comments

1 reaction
bits-glitch commented, Aug 2, 2021

Thank you very much for the explanation! Now I finally understand why you do not apply data.edge_index = split_edge['train']['edge'].t() to the OGB datasets. I discovered that the edge_index in the OGB datasets is already reduced (i.e. it contains only the training edges, not all edges of the complete graph), which I was not aware of.

For example, if we take ogbl-ddi, we get:

  • Train Split Edge Shape torch.Size([1067911, 2])
  • Test Split Edge Shape torch.Size([133489, 2])
  • Valid Split Shape torch.Size([133489, 2])

If I print the data object after data = dataset[0], I get the following result: Data(edge_index=[2, 2135822]). Since 2135822 / 2 = 1067911, which is exactly the number of training edges when each link is counted in one direction only, edge_index contains only the training links.
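The count check above can be reproduced with a small sketch. It uses plain Python lists as a stand-in for the tensors, and it assumes that edge_index stores each training link once per direction; the helper name to_edge_index and the toy links are made up for illustration, while the two sizes are the ones quoted above.

```python
def to_edge_index(undirected_links):
    """Store every undirected link in both directions, as two parallel lists."""
    sources = [u for u, v in undirected_links] + [v for u, v in undirected_links]
    targets = [v for u, v in undirected_links] + [u for u, v in undirected_links]
    return [sources, targets]

# Toy example: 3 training links become 6 columns in the [2, E] layout.
toy_train_links = [(0, 1), (1, 2), (2, 3)]
toy_edge_index = to_edge_index(toy_train_links)
toy_columns = len(toy_edge_index[0])

# Sizes quoted above for ogbl-ddi.
train_links = 1067911          # rows of the train split
edge_index_columns = 2135822   # columns of data.edge_index
matches_training_split = edge_index_columns == 2 * train_links

print(toy_columns, matches_training_split)  # prints: 6 True
```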

Thanks again for the explanation!

0 reactions
muhanzhang commented, Aug 2, 2021

That is because the OGB datasets already have pre-specified train/val/test splits, which you can retrieve with dataset.get_edge_split(). In contrast, for custom datasets you need to do the splitting yourself using do_edge_split(). For the OGB datasets, data = dataset[0] contains the observed network, and dataset.get_edge_split() gives you its train/val/test edge splits. The train edges are a subset of the observed network.
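For a custom dataset, the idea can be sketched roughly as follows. This is only an illustration in plain Python: toy_edge_split is a hypothetical stand-in and not the repository's do_edge_split(), and the 20-link path graph is made up. It shows why the observed graph is reduced to the training links after splitting, which is what data.edge_index = split_edge['train']['edge'].t() does for Planetoid (the .t() turns the [E, 2] edge list into the [2, E] layout of edge_index).

```python
import random

def toy_edge_split(edges, n_val, n_test, seed=0):
    """Hypothetical split helper; NOT the repository's do_edge_split()."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    return {
        "valid": shuffled[:n_val],
        "test": shuffled[n_val:n_val + n_test],
        "train": shuffled[n_val + n_test:],
    }

# Full graph of a made-up custom dataset: it still contains the links
# that will later be used for validation and testing.
all_edges = [(i, i + 1) for i in range(20)]
split = toy_edge_split(all_edges, n_val=1, n_test=2)

# Keep only the training links as the observed graph.
observed = set(split["train"])

# None of the held-out links may remain visible to the model.
held_out = set(split["valid"]) | set(split["test"])
leaked = observed & held_out
print(len(observed), len(held_out), len(leaked))  # prints: 17 3 0
```

With the OGB datasets this step is unnecessary, since data.edge_index already contains only the training links.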
