Issue with GNN using own heterogeneous dataset

I have generated a heterogeneous dataset from some CSV files. areaDZ nodes have features, while areaNoFeature nodes do not.

HeteroData(
  areaNoFeature={ num_nodes=1316 },
  areaDZ={
    x=[5841, 23],
    y=[5841],
    train_mask=[5841],
    val_mask=[5841],
    test_mask=[5841]
  },
  (areaDZ, parent, areaNoFeature)={ edge_index=[2, 5841] },
  (areaNoFeature, parent, areaNoFeature)={ edge_index=[2, 2630] },
  (areaNoFeature, rev_parent, areaDZ)={ edge_index=[2, 5841] }
)

The masks are created using RandomNodeSplit:

from torch_geometric.transforms import RandomNodeSplit

transform = RandomNodeSplit(split='train_rest', num_val=100, num_test=0.25)
data = transform(data)
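
For reference, the split sizes these arguments imply for the 5841 areaDZ nodes can be worked out by hand. The sketch below is plain Python and does not call PyG; the helper name split_sizes is hypothetical, and it assumes that a float argument is treated as a fraction of the node count and rounded, while an int is taken as an absolute count.

```python
def split_sizes(num_nodes, num_val, num_test):
    """Sizes for a 'train_rest' split: val and test are taken first, the rest trains.

    Assumption: a float is read as a fraction of num_nodes and rounded.
    """
    def resolve(n):
        return round(num_nodes * n) if isinstance(n, float) else n

    n_val, n_test = resolve(num_val), resolve(num_test)
    return {"train": num_nodes - n_val - n_test, "val": n_val, "test": n_test}


sizes = split_sizes(5841, num_val=100, num_test=0.25)
print(sizes)  # {'train': 4281, 'val': 100, 'test': 1460}
```

Under that assumption, roughly 4281 nodes end up in train_mask, 100 in val_mask and 1460 in test_mask.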

I am trying to train a GNN using some example code from GitHub, but I get an error:

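import torch
import torch.nn.functional as F
from torch.nn import ReLU
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import Linear, SAGEConv, Sequential, to_hetero
from tqdm import tqdm

train_input_nodes = ('areaDZ', data['areaDZ'].train_mask)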
val_input_nodes = ('areaDZ', data['areaDZ'].val_mask)
kwargs = {'batch_size': 32, 'num_workers': 6, 'persistent_workers': True}

train_loader = NeighborLoader(data, num_neighbors=[10] * 2, shuffle=True, input_nodes=train_input_nodes, **kwargs)
val_loader = NeighborLoader(data, num_neighbors=[10] * 2, input_nodes=val_input_nodes, **kwargs)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = Sequential('x, edge_index', [
    (SAGEConv((-1, -1), 64), 'x, edge_index -> x'),
    ReLU(inplace=True),
    (SAGEConv((-1, -1), 64), 'x, edge_index -> x'),
    ReLU(inplace=True),
    (Linear(-1, 2), 'x -> x'),
])
model = to_hetero(model, data.metadata(), aggr='sum').to(device)

@torch.no_grad()
def init_params():
    # Initialize lazy parameters via forwarding a single batch to the model:
    batch = next(iter(train_loader))
    batch = batch.to(device)
    model(batch.x_dict, batch.edge_index_dict)

def train():
    model.train()
    total_examples = total_loss = 0
    for batch in tqdm(train_loader):
        optimizer.zero_grad()
        batch = batch.to(device)
        batch_size = batch['areaDZ'].batch_size
        out = model(batch.x_dict, batch.edge_index_dict)['areaDZ'][:batch_size]
        loss = F.cross_entropy(out, batch['areaDZ'].y[:batch_size])
        loss.backward()
        optimizer.step()

        total_examples += batch_size
        total_loss += float(loss) * batch_size

    return total_loss / total_examples


@torch.no_grad()
def test(loader):
    model.eval()

    total_examples = total_correct = 0
    for batch in tqdm(loader):
        batch = batch.to(device)
        batch_size = batch['areaDZ'].batch_size
        out = model(batch.x_dict, batch.edge_index_dict)['areaDZ'][:batch_size]
        pred = out.argmax(dim=-1)

        total_examples += batch_size
        total_correct += int((pred == batch['areaDZ'].y[:batch_size]).sum())

    return total_correct / total_examples

init_params()  # Initialize parameters.
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(1, 21):
    loss = train()
    val_acc = test(val_loader)
    print(f'Epoch: {epoch:02d}, Loss: {loss:.4f}, Val: {val_acc:.4f}')

The error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_2445/3976418620.py in <module>
----> 1 init_params()  # Initialize parameters.
      2 optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
      3 
      4 for epoch in range(1, 21):
      5     loss = train()

/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
     26         def decorate_context(*args, **kwargs):
     27             with self.__class__():
---> 28                 return func(*args, **kwargs)
     29         return cast(F, decorate_context)
     30 

/tmp/ipykernel_2445/1078389677.py in init_params()
     29     batch = next(iter(train_loader))
     30     batch = batch.to(device)
---> 31     model(batch.x_dict, batch.edge_index_dict)
     32 
     33 

/opt/conda/lib/python3.7/site-packages/torch/fx/graph_module.py in wrapped_call(self, *args, **kwargs)
    511                     print(generate_error_message(topmost_framesummary),
    512                           file=sys.stderr)
--> 513                 raise e.with_traceback(None)
    514 
    515         cls.__call__ = wrapped_call

AttributeError: 'NoneType' object has no attribute 'dim'

Any help will be much appreciated!
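
The traceback is consistent with the maintainer's explanation in the comments: areaNoFeature has no x, so there is no feature tensor for that node type when the first layer runs. A minimal, PyG-free sketch of a pre-flight check is shown here. The dictionary layout and the helper name featureless_node_types are hypothetical and only mirror the HeteroData printout above.

```python
# Plain-Python description of the graph printed above (no PyG needed).
# None stands for "this node type has no `x` attribute".
node_features = {"areaNoFeature": None, "areaDZ": (5841, 23)}
edge_types = [
    ("areaDZ", "parent", "areaNoFeature"),
    ("areaNoFeature", "parent", "areaNoFeature"),
    ("areaNoFeature", "rev_parent", "areaDZ"),
]


def featureless_node_types(node_features, edge_types):
    """Return node types that take part in an edge type but have no features."""
    used = {src for src, _, _ in edge_types} | {dst for _, _, dst in edge_types}
    return sorted(t for t in used if node_features.get(t) is None)


missing = featureless_node_types(node_features, edge_types)
print(missing)  # ['areaNoFeature']
```

Any node type reported by such a check needs features (given or learned) before message passing can use it.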

Top GitHub Comments

rusty1s commented, Aug 14, 2022

Yes, I think this is related to missing edge types that point to node type B. I think this experience will be smoother in the upcoming release.

Within RandomLinkSplit, you can use None as the rev_edge_type for A<>A and C<>C.

In addition, I don’t think you necessarily need to create a new model for predicting every edge type. You can potentially train this end-to-end together.

rusty1s commented, Nov 30, 2021

All node types will need to have some features to enable message passing. In case they are not given, you can use torch.nn.Embedding to learn them, e.g.:

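class MyModel(torch.nn.Module):
    def __init__(self, ...):
        super().__init__()  # Required before registering submodules.
        self.emb = Embedding(1316, 64)  # One learnable 64-dim vector per areaNoFeature node.

        self.model = to_hetero(model)

    def forward(self, x_dict, edge_index_dict):
        x_dict = copy.copy(x_dict)  # Shallow copy so the caller's dict is not mutated.
        x_dict["areaNoFeature"] = self.emb.weight
        return self.model(x_dict, edge_index_dict)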
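
One caveat when combining this with NeighborLoader: self.emb.weight always has 1316 rows, while a sampled mini-batch usually contains only a subset of the areaNoFeature nodes. A possible adjustment, sketched below with plain PyTorch, is to look up only the rows of the sampled nodes. It assumes the batch exposes the global IDs of those nodes (for example as an n_id tensor, which may need to be stored on the data object beforehand depending on the PyG version); the ID values used here are made up for illustration.

```python
import torch
from torch.nn import Embedding

NUM_NO_FEATURE_NODES = 1316  # From the HeteroData printout in the question.
emb = Embedding(NUM_NO_FEATURE_NODES, 64)

# Hypothetical global IDs of the areaNoFeature nodes sampled into one mini-batch.
n_id = torch.tensor([5, 42, 1315])

# Look up only the rows for the sampled nodes, in batch order.
x_no_feature = emb(n_id)
print(x_no_feature.shape)  # torch.Size([3, 64])
```

The resulting tensor has one row per sampled node, so it lines up with the local indices in the batch's edge_index_dict and could be assigned to x_dict["areaNoFeature"] in forward.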