How to convert MNIST dataset into PyG dataset?
I want to convert the MNIST dataset into a PyG dataset. The idea behind this is to replicate Defferrard et al.'s result (https://arxiv.org/pdf/1606.09375.pdf) [Table 1. Proposed graph CNN GC32-P4-GC64-P4-FC512]. I have an adjacency matrix of size 784 x 784, available here: https://bit.ly/2SIKPgH. It can be converted to an edge list as follows, per @rusty1s's suggestion:
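import scipy.io
import torch
from scipy import sparse

# Load the 784 x 784 adjacency matrix stored under the key 'Expression1'
adjt8 = scipy.io.loadmat('adjt8.mat')
adjt8 = adjt8['Expression1']

# Convert to COO format and stack the row/col indices into a [2, num_edges] edge list
adj81 = sparse.coo_matrix(adjt8)
row = torch.from_numpy(adj81.row).long()
col = torch.from_numpy(adj81.col).long()
edge_index = torch.stack([row, col], dim=0)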
#Importing MNIST Data
!pip install spektral
from spektral.datasets import mnist
X_train, y_train, X_val, y_val, X_test, y_test, A = mnist.load_data()
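As a quick sanity check of the adjacency-to-edge-list conversion, the sketch below runs the same COO steps on a hypothetical 3-node path graph. It uses NumPy only so that it runs without the adjt8.mat file; the toy matrix and variable names are assumptions for illustration, and in the code from the post the resulting rows would be wrapped with torch.from_numpy(...).long().

```python
import numpy as np
from scipy import sparse

# Hypothetical toy adjacency: a path graph 0 - 1 - 2 (symmetric, no self-loops)
adj = np.array([
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
])

# Same steps as for the 784 x 784 matrix: COO format exposes the
# row and column index of every non-zero entry.
coo = sparse.coo_matrix(adj)

# Stack into the [2, num_edges] layout that PyG expects for edge_index
edge_index = np.stack([coo.row, coo.col], axis=0).astype(np.int64)

print(edge_index.shape)  # one column per directed edge
print(edge_index)
```

Each undirected edge appears twice, once per direction, because the adjacency matrix is symmetric.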
Creating MNIST dataset:
import torch
from torch_geometric.data import Data, InMemoryDataset

class MNISTDataset(InMemoryDataset):
    def __init__(self, root, transform=None, pre_transform=None):
        super(MNISTDataset, self).__init__(root, transform, pre_transform)
        self.data, self.slices = torch.load(self.processed_paths[0])

    @property
    def raw_file_names(self):
        return []

    @property
    def processed_file_names(self):
        return ['/tmp/mdata.pt']

    def download(self):
        pass

    def process(self):
        # One graph per image: 784 pixel nodes that all share the same edge_index
        data_list = []
        for i in range(len(X_train)):
            data = Data(
                x=torch.tensor(X_train[i], dtype=torch.float).view(784, 1),
                edge_index=edge_index,
                y=torch.tensor(y_train[i]).view(1).long(),
            )
            data_list.append(data)
        data, slices = self.collate(data_list)
        torch.save((data, slices), self.processed_paths[0])
I am trying to implement something similar to the MNIST superpixel graclus example (https://bit.ly/3bdlQc6) that @rusty1s implemented. However, for my simple MNIST setup there is only one edge_index and no edge_attr, and I also do not have pos, which I need for normalized cut and graclus. The superpixel MNIST dataset comes with pos and edge_attr.
How do I go about creating an MNIST dataset for PyG?
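One possible way to obtain the missing pos is to derive it from the pixel grid itself. The sketch below is not from the thread: it assumes node i corresponds to the pixel at row i // 28 and column i % 28 (the row-major order that view(784, 1) produces for a 28 x 28 image), and that the adjacency matrix uses the same node ordering. The helper names grid_positions and edge_lengths are hypothetical.

```python
import numpy as np

SIDE = 28  # MNIST images are 28 x 28 pixels

def grid_positions(side: int = SIDE) -> np.ndarray:
    """Return a [side*side, 2] array of (x, y) pixel coordinates.

    Assumes row-major node ordering: node i is pixel (row=i // side, col=i % side).
    """
    idx = np.arange(side * side)
    return np.stack([idx % side, idx // side], axis=1).astype(np.float32)

def edge_lengths(pos: np.ndarray, edge_index: np.ndarray) -> np.ndarray:
    """Euclidean length of every edge, shape [num_edges]."""
    src, dst = edge_index
    return np.linalg.norm(pos[src] - pos[dst], axis=1)

pos = grid_positions()

# Hypothetical edges from pixel 0 to its right, lower, and diagonal neighbours
edge_index = np.array([[0, 0, 0],
                       [1, 28, 29]])
edge_attr = edge_lengths(pos, edge_index)

print(pos.shape)
print(edge_attr)
```

Because every image shares the same grid, pos and the edge lengths only need to be computed once; they could then be converted with torch.from_numpy and passed to each Data object through its pos and edge_attr arguments.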
Top GitHub Comments
Hi Matthias, thanks for creating both PyG and Spektral. I have a question about implementing Flatten() in PyTorch. I want to add a Flatten layer instead of a pooling layer after a GCNConv, but I train in batches (train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)), so the torch.flatten() function would mix all batches together, and the torch.nn.Flatten() layer would not reduce the dimension of the tensor the way Spektral does, where a 3-D tensor [batch_size, node_num, node_attribute_num] is converted to a 2-D tensor [batch_size, node_num*node_attribute_num]. This may not be directly related to Spektral or PyG (these are built-in functions of torch and Keras), but I am wondering if you can help me with this issue. If I just go with torch.flatten(), the final output contains only one result for the whole batch of batch_size training samples. Thanks!
See https://github.com/rusty1s/pytorch_spline_conv/issues/25