How to convert MNIST dataset into PyG dataset?

I want to convert the MNIST dataset into a PyG dataset. The idea is to replicate the result of Defferrard et al. (https://arxiv.org/pdf/1606.09375.pdf) [Table 1, proposed graph CNN GC32-P4-GC64-P4-FC512]. I have a 784 x 784 adjacency matrix (available here: https://bit.ly/2SIKPgH), which can be converted to an edge list as follows, per @rusty1s's suggestion:

import scipy.io
import torch
from scipy import sparse

# Load the 784 x 784 adjacency matrix and convert it to COO format
adjt8 = scipy.io.loadmat('adjt8.mat')
adjt8 = adjt8['Expression1']
adj81 = sparse.coo_matrix(adjt8)

# Stack the row and column indices into a [2, num_edges] edge list
row = torch.from_numpy(adj81.row).long()
col = torch.from_numpy(adj81.col).long()
edge_index = torch.stack([row, col], dim=0)

# Importing MNIST data (in a notebook, first run: !pip install spektral)
from spektral.datasets import mnist
X_train, y_train, X_val, y_val, X_test, y_test, A = mnist.load_data()
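
To make the layout of edge_index concrete, here is a minimal sketch on a hypothetical 4-node path graph. The toy matrix is an assumption for illustration, and np.nonzero stands in for scipy.sparse.coo_matrix so that the example needs only NumPy; the real adjacency matrix from the .mat file would be handled the same way.

```python
import numpy as np

# Hypothetical 4-node path graph 0-1-2-3 (symmetric adjacency), illustration only.
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

# Indices of the non-zero entries: one (row, col) pair per directed edge.
row, col = np.nonzero(adj)

# Stack into the [2, num_edges] layout used for edge_index.
edge_index = np.stack([row, col], axis=0)
print(edge_index.shape)  # (2, 6)
```

Each undirected edge appears twice, once per direction, which is why three edges yield six columns.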

Creating MNIST dataset:

import torch
from torch_geometric.data import Data, InMemoryDataset

class MNISTDataset(InMemoryDataset):
    def __init__(self, root, transform=None, pre_transform=None):
        super(MNISTDataset, self).__init__(root, transform, pre_transform)
        self.data, self.slices = torch.load(self.processed_paths[0])

    @property
    def raw_file_names(self):
        return []

    @property
    def processed_file_names(self):
        # File name only; PyG resolves it relative to <root>/processed
        return ['mdata.pt']

    def download(self):
        pass

    def process(self):
        # One graph per image: 784 nodes with a single intensity feature each,
        # all sharing the edge_index built above
        data_list = []
        for i in range(len(X_train)):
            data = Data(
                x=torch.tensor(X_train[i], dtype=torch.float).view(784, 1),
                edge_index=edge_index,
                y=torch.tensor(y_train[i]).view(1).long(),
            )
            data_list.append(data)

        data, slices = self.collate(data_list)
        torch.save((data, slices), self.processed_paths[0])

I am trying to implement something similar to the MNIST superpixel graclus example (https://bit.ly/3bdlQc6) that @rusty1s implemented. However, my plain MNIST dataset has a single edge_index shared by all graphs and no edge_attr. I also do not have pos, which I need for normalized cut and graclus. The superpixel MNIST dataset comes with both pos and edge_attr.

How do I go about creating an MNIST dataset for PyG?
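
One possible way to obtain pos, sketched below, is to use each pixel's grid coordinates, since the 784 nodes come from a 28 x 28 image. This assumes that node i corresponds to pixel i of the flattened image in row-major order, which the question does not state; the names SIDE and idx are illustrative.

```python
import numpy as np

SIDE = 28  # MNIST images are 28 x 28, giving 784 nodes

# Assumption: node i is pixel i of the flattened image in row-major order.
idx = np.arange(SIDE * SIDE)

# (x, y) = (column, row) coordinate of every node, shape [784, 2].
pos = np.stack([idx % SIDE, idx // SIDE], axis=1).astype(np.float32)

print(pos.shape)  # (784, 2)
```

Converted with torch.from_numpy(pos), the same array could be passed as pos to every Data object, because all graphs share the same grid.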

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments:18 (8 by maintainers)

Top GitHub Comments

1 reaction
jasperhyp commented, May 7, 2021

Sure, the flattening approach should work better compared to a non-hierarchical global pooling variant. You can do this here too, of course. The downside of the flattening approach is that you lose translation invariance, similar to an MLP compared to a CNN.

Hi Matthias, thanks for creating both PyG and Spektral. I have a question about implementing Flatten() in PyTorch. I want to add a Flatten layer instead of a pooling layer after a GCNConv. Because I train in batches (train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)), torch.flatten() would mix all the samples in a batch together, and the torch.nn.Flatten() layer would not reduce the tensor dimension the way Spektral does, where a 3-D tensor [batch_size, node_num, node_attribute_num] is converted to a 2-D tensor [batch_size, node_num*node_attribute_num]. This may not be directly related to Spektral or PyG (these are built-in functions of torch and Keras), but could you help me with this issue? If I just use torch.flatten(), the final output contains only one result for the whole batch of training samples. Thanks!
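
One possible workaround, which the thread excerpt above does not confirm, relies on every graph having the same 784 nodes. A PyG mini-batch stacks graphs along the node dimension, so the output of a layer has shape [batch_size * node_num, node_attribute_num], and a reshape can recover one flat row per graph. The sketch below shows this with NumPy arrays and hypothetical sizes; on a contiguous torch tensor, x.view(-1, num_nodes * num_features) follows the same row-major semantics.

```python
import numpy as np

num_nodes, num_features, batch_size = 784, 32, 4  # hypothetical sizes

# Stand-in for a GCNConv output on a mini-batch: graphs are stacked along
# the node dimension, giving [batch_size * num_nodes, num_features].
x = np.arange(batch_size * num_nodes * num_features, dtype=np.float32)
x = x.reshape(batch_size * num_nodes, num_features)

# Per-graph flatten; -1 infers the batch size, so a smaller last batch also works.
flat = x.reshape(-1, num_nodes * num_features)
print(flat.shape)  # (4, 25088)
```

This depends on the nodes of each graph being contiguous and in a fixed order within the batch, and it would not apply to graphs with varying node counts.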
