Issue with GNN using own heterogeneous dataset

See original GitHub issue

I have generated a heterogeneous dataset from some CSV files. The areaDZ nodes have features, while the areaNoFeature nodes do not.

HeteroData(
  areaNoFeature={ num_nodes=1316 },
  areaDZ={
    x=[5841, 23],
    y=[5841],
    train_mask=[5841],
    val_mask=[5841],
    test_mask=[5841]
  },
  (areaDZ, parent, areaNoFeature)={ edge_index=[2, 5841] },
  (areaNoFeature, parent, areaNoFeature)={ edge_index=[2, 2630] },
  (areaNoFeature, rev_parent, areaDZ)={ edge_index=[2, 5841] }
)

The masks are created using RandomNodeSplit:

from torch_geometric.transforms import RandomNodeSplit

transform = RandomNodeSplit(split='train_rest', num_val=100, num_test=0.25)
data = transform(data)
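For reference, the split sizes implied by these arguments can be worked out by hand. The sketch below assumes an integer is taken as an absolute node count, a float as a fraction of all nodes (rounded), and that 'train_rest' assigns every remaining node to training. The helper name split_sizes is hypothetical and is not part of PyTorch Geometric.

```python
def split_sizes(num_nodes, num_val, num_test):
    """Estimate (train, val, test) sizes for a 'train_rest' style split.

    Assumption: an int is an absolute count, a float is a fraction of
    num_nodes (rounded to the nearest integer).
    """
    def resolve(n):
        return round(num_nodes * n) if isinstance(n, float) else n

    n_val = resolve(num_val)
    n_test = resolve(num_test)
    # 'train_rest': everything not used for validation or testing.
    n_train = num_nodes - n_val - n_test
    return n_train, n_val, n_test


# 5841 is the number of areaDZ nodes in the dataset above.
print(split_sizes(5841, num_val=100, num_test=0.25))  # (4281, 100, 1460)
```

Under these assumptions, roughly 4281 of the 5841 areaDZ nodes end up in train_mask.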

I am trying to train a GNN using some example code from GitHub, but I get an error:

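import torch
import torch.nn.functional as F
from torch.nn import ReLU
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import Linear, SAGEConv, Sequential, to_hetero
from tqdm import tqdm

train_input_nodes = ('areaDZ', data['areaDZ'].train_mask)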
val_input_nodes = ('areaDZ', data['areaDZ'].val_mask)
kwargs = {'batch_size': 32, 'num_workers': 6, 'persistent_workers': True}

train_loader = NeighborLoader(data, num_neighbors=[10] * 2, shuffle=True, input_nodes=train_input_nodes, **kwargs)
val_loader = NeighborLoader(data, num_neighbors=[10] * 2, input_nodes=val_input_nodes, **kwargs)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = Sequential('x, edge_index', [
    (SAGEConv((-1, -1), 64), 'x, edge_index -> x'),
    ReLU(inplace=True),
    (SAGEConv((-1, -1), 64), 'x, edge_index -> x'),
    ReLU(inplace=True),
    (Linear(-1, 2), 'x -> x'),
])
model = to_hetero(model, data.metadata(), aggr='sum').to(device)

@torch.no_grad()
def init_params():
    # Initialize lazy parameters via forwarding a single batch to the model:
    batch = next(iter(train_loader))
    batch = batch.to(device)
    model(batch.x_dict, batch.edge_index_dict)

def train():
    model.train()
    total_examples = total_loss = 0
    for batch in tqdm(train_loader):
        optimizer.zero_grad()
        batch = batch.to(device)
        batch_size = batch['areaDZ'].batch_size
        out = model(batch.x_dict, batch.edge_index_dict)['areaDZ'][:batch_size]
        loss = F.cross_entropy(out, batch['areaDZ'].y[:batch_size])
        loss.backward()
        optimizer.step()

        total_examples += batch_size
        total_loss += float(loss) * batch_size

    return total_loss / total_examples


@torch.no_grad()
def test(loader):
    model.eval()

    total_examples = total_correct = 0
    for batch in tqdm(loader):
        batch = batch.to(device)
        batch_size = batch['areaDZ'].batch_size
        out = model(batch.x_dict, batch.edge_index_dict)['areaDZ'][:batch_size]
        pred = out.argmax(dim=-1)

        total_examples += batch_size
        total_correct += int((pred == batch['areaDZ'].y[:batch_size]).sum())

    return total_correct / total_examples

init_params()  # Initialize parameters.
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(1, 21):
    loss = train()
    val_acc = test(val_loader)
    print(f'Epoch: {epoch:02d}, Loss: {loss:.4f}, Val: {val_acc:.4f}')

The error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_2445/3976418620.py in <module>
----> 1 init_params()  # Initialize parameters.
      2 optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
      3 
      4 for epoch in range(1, 21):
      5     loss = train()

/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
     26         def decorate_context(*args, **kwargs):
     27             with self.__class__():
---> 28                 return func(*args, **kwargs)
     29         return cast(F, decorate_context)
     30 

/tmp/ipykernel_2445/1078389677.py in init_params()
     29     batch = next(iter(train_loader))
     30     batch = batch.to(device)
---> 31     model(batch.x_dict, batch.edge_index_dict)
     32 
     33 

/opt/conda/lib/python3.7/site-packages/torch/fx/graph_module.py in wrapped_call(self, *args, **kwargs)
    511                     print(generate_error_message(topmost_framesummary),
    512                           file=sys.stderr)
--> 513                 raise e.with_traceback(None)
    514 
    515         cls.__call__ = wrapped_call

AttributeError: 'NoneType' object has no attribute 'dim'

Any help will be much appreciated!

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

2 reactions
rusty1s commented, Aug 14, 2022

Yes, I think this is related to missing edge types that point to node type B. I think this experience will be smoother in the upcoming release.

Within RandomLinkSplit, you can use None as the rev_edge_type for A<>A and C<>C.

In addition, I don’t think you necessarily need to create a new model for predicting every edge type. You can potentially train this end-to-end together.
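As a rough illustration of the rev_edge_type suggestion, the sketch below pairs each edge type with its reverse and uses None when source and destination node types coincide (the A<>A and C<>C case). The node types A, B and C come from the comment; the relation name 'to', the 'rev_' prefix and the helper name reverse_edge_types are assumptions made for this example only.

```python
def reverse_edge_types(edge_types, prefix='rev_'):
    """Pair each (src, rel, dst) edge type with a reverse edge type.

    Edge types that connect a node type to itself get None, i.e. no
    separate reverse edge type is assumed to exist for them.
    """
    revs = []
    for src, rel, dst in edge_types:
        revs.append(None if src == dst else (dst, prefix + rel, src))
    return revs


# Hypothetical edge types; the relation name 'to' is a placeholder.
edge_types = [('A', 'to', 'A'), ('A', 'to', 'B'), ('C', 'to', 'C')]
rev_edge_types = reverse_edge_types(edge_types)
print(rev_edge_types)  # [None, ('B', 'rev_to', 'A'), None]
```

The two lists could then be passed to RandomLinkSplit through its edge_types and rev_edge_types arguments, provided the reverse edge types actually exist in the data under those names.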

1 reaction
rusty1s commented, Nov 30, 2021

All node types will need to have some features to enable message passing. In case they are not given, you can use torch.nn.Embedding to learn them, e.g.:

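import copy

import torch
from torch.nn import Embedding
from torch_geometric.nn import to_hetero

class MyModel(torch.nn.Module):
    def __init__(self, model, metadata):
        super().__init__()
        # Learnable 64-dim features for the 1316 featureless nodes.
        self.emb = Embedding(1316, 64)
        # `model` is the homogeneous GNN; `metadata` is data.metadata().
        self.model = to_hetero(model, metadata)

    def forward(self, x_dict, edge_index_dict):
        x_dict = copy.copy(x_dict)  # Shallow copy; the caller's dict stays intact.
        x_dict["areaNoFeature"] = self.emb.weight
        return self.model(x_dict, edge_index_dict)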
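To spot this situation before training, one option is to compare the node types of the graph against the feature dictionary that is handed to the model. This is only a sketch: the helper name node_types_missing_features is hypothetical, and a string stands in for the real feature tensor so that the example runs without PyTorch.

```python
def node_types_missing_features(node_types, x_dict):
    """Return the node types with no (or a None) entry in x_dict."""
    return [t for t in node_types if x_dict.get(t) is None]


# Stand-ins mirroring the dataset above: only areaDZ carries features.
node_types = ['areaNoFeature', 'areaDZ']
x_dict = {'areaDZ': 'placeholder for a [5841, 23] feature tensor'}

print(node_types_missing_features(node_types, x_dict))  # ['areaNoFeature']
```

With a real HeteroData object, the same check would be node_types_missing_features(data.node_types, data.x_dict). Any node type it reports needs features, for example through the embedding approach above.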
