Issue with GNN using own heterogeneous dataset
I have generated a heterogeneous dataset based on some CSV files.
areaDZ nodes have features, while areaNoFeature nodes do not have features.
HeteroData(
areaNoFeature={ num_nodes=1316 },
areaDZ={
x=[5841, 23],
y=[5841],
train_mask=[5841],
val_mask=[5841],
test_mask=[5841]
},
(areaDZ, parent, areaNoFeature)={ edge_index=[2, 5841] },
(areaNoFeature, parent, areaNoFeature)={ edge_index=[2, 2630] },
(areaNoFeature, rev_parent, areaDZ)={ edge_index=[2, 5841] }
)
The masks are created using the RandomNodeSplit transform:
transform = RandomNodeSplit(split='train_rest', num_val=100, num_test=0.25)
data = transform(data)
I am trying to train a GNN using example code from GitHub, but I get an error:
import torch
import torch.nn.functional as F
from torch.nn import ReLU
from tqdm import tqdm
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import Linear, SAGEConv, Sequential, to_hetero

# Sample mini-batches seeded on the labelled 'areaDZ' nodes only.
train_input_nodes = ('areaDZ', data['areaDZ'].train_mask)
val_input_nodes = ('areaDZ', data['areaDZ'].val_mask)
kwargs = {'batch_size': 32, 'num_workers': 6, 'persistent_workers': True}
train_loader = NeighborLoader(data, num_neighbors=[10] * 2, shuffle=True, input_nodes=train_input_nodes, **kwargs)
val_loader = NeighborLoader(data, num_neighbors=[10] * 2, input_nodes=val_input_nodes, **kwargs)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Sequential('x, edge_index', [
    (SAGEConv((-1, -1), 64), 'x, edge_index -> x'),
    ReLU(inplace=True),
    (SAGEConv((-1, -1), 64), 'x, edge_index -> x'),
    ReLU(inplace=True),
    (Linear(-1, 2), 'x -> x'),
])
model = to_hetero(model, data.metadata(), aggr='sum').to(device)

@torch.no_grad()
def init_params():
    # Initialize lazy parameters via forwarding a single batch to the model:
    batch = next(iter(train_loader))
    batch = batch.to(device)
    model(batch.x_dict, batch.edge_index_dict)


def train():
    model.train()
    total_examples = total_loss = 0
    for batch in tqdm(train_loader):
        optimizer.zero_grad()
        batch = batch.to(device)
        batch_size = batch['areaDZ'].batch_size
        out = model(batch.x_dict, batch.edge_index_dict)['areaDZ'][:batch_size]
        loss = F.cross_entropy(out, batch['areaDZ'].y[:batch_size])
        loss.backward()
        optimizer.step()
        total_examples += batch_size
        total_loss += float(loss) * batch_size
    return total_loss / total_examples


@torch.no_grad()
def test(loader):
    model.eval()
    total_examples = total_correct = 0
    for batch in tqdm(loader):
        batch = batch.to(device)
        batch_size = batch['areaDZ'].batch_size
        out = model(batch.x_dict, batch.edge_index_dict)['areaDZ'][:batch_size]
        pred = out.argmax(dim=-1)
        total_examples += batch_size
        total_correct += int((pred == batch['areaDZ'].y[:batch_size]).sum())
    return total_correct / total_examples


init_params()  # Initialize parameters.
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
for epoch in range(1, 21):
    loss = train()
    val_acc = test(val_loader)
    print(f'Epoch: {epoch:02d}, Loss: {loss:.4f}, Val: {val_acc:.4f}')
The error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/tmp/ipykernel_2445/3976418620.py in <module>
----> 1 init_params() # Initialize parameters.
2 optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
3
4 for epoch in range(1, 21):
5 loss = train()
/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
26 def decorate_context(*args, **kwargs):
27 with self.__class__():
---> 28 return func(*args, **kwargs)
29 return cast(F, decorate_context)
30
/tmp/ipykernel_2445/1078389677.py in init_params()
29 batch = next(iter(train_loader))
30 batch = batch.to(device)
---> 31 model(batch.x_dict, batch.edge_index_dict)
32
33
/opt/conda/lib/python3.7/site-packages/torch/fx/graph_module.py in wrapped_call(self, *args, **kwargs)
511 print(generate_error_message(topmost_framesummary),
512 file=sys.stderr)
--> 513 raise e.with_traceback(None)
514
515 cls.__call__ = wrapped_call
AttributeError: 'NoneType' object has no attribute 'dim'
Any help will be much appreciated!
Issue Analytics
- State:
- Created 2 years ago
- Comments:10 (5 by maintainers)
Top GitHub Comments
Yes, I think this is related to missing edge types that point to node type B. I think this experience will be smoother in the upcoming release.
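One way to check for this situation before training is to look at the graph metadata and list the node types that never appear as the destination of an edge type. The sketch below is plain Python and only assumes the (node_types, edge_types) layout that data.metadata() returns; the helper name and the A/B/C graph are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence, Tuple

EdgeType = Tuple[str, str, str]  # (source type, relation, destination type)


def node_types_without_incoming_edges(
    node_types: Sequence[str], edge_types: Sequence[EdgeType]
) -> List[str]:
    """Return node types that are never the destination of any edge type.

    Hypothetical helper: such node types receive no messages, so their
    representations are not updated during message passing.
    """
    destinations = {dst for _, _, dst in edge_types}
    return [nt for nt in node_types if nt not in destinations]


# Metadata copied from the HeteroData printout in the question.
node_types = ['areaNoFeature', 'areaDZ']
edge_types = [
    ('areaDZ', 'parent', 'areaNoFeature'),
    ('areaNoFeature', 'parent', 'areaNoFeature'),
    ('areaNoFeature', 'rev_parent', 'areaDZ'),
]
missing = node_types_without_incoming_edges(node_types, edge_types)
print(missing)  # [] : every node type here receives messages

# Hypothetical A/B/C graph in which nothing points to B.
abc_missing = node_types_without_incoming_edges(
    ['A', 'B', 'C'],
    [('A', 'to', 'A'), ('B', 'to', 'C'), ('C', 'to', 'C')],
)
print(abc_missing)  # ['B']
```

If the list is non-empty, adding reverse edge types that point to the listed node types is one possible remedy.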
Within RandomLinkSplit, you can use None as the rev_edge_type for A<>A and C<>C.

In addition, I don't think you necessarily need to create a new model for predicting every edge type. You can potentially train this end-to-end together.
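As a rough illustration of the None suggestion, the sketch below builds two parallel lists: the edge types to split and, for each one, its reverse edge type or None when no separate reverse type exists. The A/B/C edge types, the relation names, and the helper are hypothetical; only the idea of pairing each entry with a reverse type or None comes from the comment above.

```python
from typing import Dict, List, Optional, Sequence, Tuple

EdgeType = Tuple[str, str, str]  # (source type, relation, destination type)


def build_rev_edge_types(
    edge_types: Sequence[EdgeType],
    reverse_of: Dict[EdgeType, EdgeType],
) -> List[Optional[EdgeType]]:
    """Pair every edge type with its reverse edge type, or None if it has none.

    Hypothetical helper; `reverse_of` maps an edge type to its reverse type.
    """
    return [reverse_of.get(edge_type) for edge_type in edge_types]


# Hypothetical graph: A<>A and C<>C have no separate reverse edge type.
edge_types = [('A', 'to', 'A'), ('A', 'to', 'B'), ('C', 'to', 'C')]
reverse_of = {('A', 'to', 'B'): ('B', 'rev_to', 'A')}

rev_edge_types = build_rev_edge_types(edge_types, reverse_of)
print(rev_edge_types)  # [None, ('B', 'rev_to', 'A'), None]
```

The two lists would then be passed together, for example as RandomLinkSplit(edge_types=edge_types, rev_edge_types=rev_edge_types), so that the entries line up position by position.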
All node types will need to have some features to enable message passing. In case they are not given, you can use torch.nn.Embedding to learn them, e.g.:
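The original snippet is not preserved here, so what follows is only a minimal sketch of the idea and not the maintainer's code. It uses plain PyTorch with stand-in tensors in place of a sampled batch; the class name, the embedding width of 23, and the use of global node ids as the lookup key are all assumptions. In a real mini-batch the ids would have to come from the loader (recent PyG versions attach them as n_id; treat that as an assumption too).

```python
import torch
from torch import nn

# Sizes taken from the HeteroData printout in the question.
NUM_NO_FEATURE_NODES = 1316  # data['areaNoFeature'].num_nodes
EMBEDDING_DIM = 23           # assumed; any width works with lazy (-1) layers


class FeaturelessNodeEmbedding(nn.Module):
    """Learnable features for a node type without `x` (hypothetical helper)."""

    def __init__(self, num_nodes: int, dim: int):
        super().__init__()
        self.emb = nn.Embedding(num_nodes, dim)

    def forward(self, node_ids: torch.Tensor) -> torch.Tensor:
        # node_ids are global indices of the 'areaNoFeature' nodes in a batch.
        return self.emb(node_ids)


embedding = FeaturelessNodeEmbedding(NUM_NO_FEATURE_NODES, EMBEDDING_DIM)

# Stand-ins for a sampled mini-batch (not real data).
area_dz_x = torch.randn(32, 23)
no_feature_ids = torch.tensor([0, 5, 17, 1315])

# Every node type now has a feature tensor, so no entry of x_dict is None.
x_dict = {
    'areaDZ': area_dz_x,
    'areaNoFeature': embedding(no_feature_ids),
}
print({key: tuple(value.shape) for key, value in x_dict.items()})
```

For the embedding to be trained, its parameters need to be handed to the optimizer alongside the model's, for example by concatenating list(model.parameters()) and list(embedding.parameters()).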