Abnormal memory usage of SageConv and GraphConv
I am trying to run the following 5 algorithms on my custom dataset (one single graph):
- GCNConv
- SAGEConv
- GATConv
- GraphConv
- HypergraphConv
In all cases the task is node classification.
Three of them run perfectly fine, but when I swap the layer of my network to either SAGEConv or GraphConv, I get a memory error saying I am trying to allocate 667946000000 bytes (roughly 668 GB).
Here is my small network:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import (GCNConv, SAGEConv, GATConv, GraphConv,
                                HypergraphConv)

class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        #torch.manual_seed(12345)
        #self.conv1 = GCNConv(dataset_num_features, 16)
        #self.conv1 = SAGEConv(dataset_num_features, 16)  # Does NOT work.
        #self.conv1 = GATConv(dataset_num_features, 16, 10, concat=False)
        self.conv1 = GraphConv(dataset_num_features, 16, aggr='mean')  # Does NOT work.
        #self.conv1 = HypergraphConv(dataset_num_features, 16, use_attention=True, heads=5, concat=False)
        self.lin = nn.Linear(hidden_channels, dataset_num_classes)  # hidden_channels must equal 16, the conv output size

    def forward(self, data):
        x, edge_index = data.x, data.edge_index
        x = self.conv1(x, edge_index)
        x = x.relu()
        #x = global_mean_pool(x, batch)
        x = F.dropout(x, p=0.8, training=self.training)
        x = self.lin(x)
        return x
```
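For context, this is the training step that triggers the error, reconstructed from the traceback below (the final `optimizer.step()` is assumed, as it is not visible in the traceback):

```python
# Reconstructed from the traceback; `model`, `optimizer`, and `device` are
# set up beforehand. Note that F.nll_loss expects log-probabilities, so
# F.log_softmax would normally be applied to the model output first.
data.to(device)                      # Data.to() moves all contained tensors
optimizer.zero_grad()
out = model(data)                    # <-- this call raises the RuntimeError
loss = F.nll_loss(out[data.train_mask], data.y[data.train_mask].squeeze())
loss.backward()
optimizer.step()                     # assumed; not shown in the traceback
```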
and here is the error:
```
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-7-c43cd0da8520> in <module>
9 data.to(device)
10 optimizer.zero_grad()
---> 11 out = model(data)
12 loss = F.nll_loss(out[data.train_mask], data.y[data.train_mask].squeeze())
13 loss.backward()
~/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)
<ipython-input-6-b86f676abb44> in forward(self, data)
18 x, edge_index, edge_attr = data.x, data.edge_index, data.edge_attr
19
---> 20 x = self.conv1(x, edge_index)#, edge_weight=edge_attr.squeeze())
21 x = x.relu()
22 #x = global_mean_pool(x, batch)
~/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
530 result = self._slow_forward(*input, **kwargs)
531 else:
--> 532 result = self.forward(*input, **kwargs)
533 for hook in self._forward_hooks.values():
534 hook_result = hook(self, input, result)
~/anaconda3/envs/py38/lib/python3.8/site-packages/torch_geometric/nn/conv/graph_conv.py in forward(self, x, edge_index, edge_weight, size)
60
61 # propagate_type: (x: OptPairTensor, edge_weight: OptTensor)
---> 62 out = self.propagate(edge_index, x=x, edge_weight=edge_weight,
63 size=size)
64 out = self.lin_l(out)
~/anaconda3/envs/py38/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py in propagate(self, edge_index, size, **kwargs)
231 # Otherwise, run both functions in separation.
232 elif isinstance(edge_index, Tensor) or not self.fuse:
--> 233 coll_dict = self.__collect__(self.__user_args__, edge_index, size,
234 kwargs)
235
~/anaconda3/envs/py38/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py in __collect__(self, args, edge_index, size, kwargs)
155 if isinstance(data, Tensor):
156 self.__set_size__(size, dim, data)
--> 157 data = self.__lift__(data, edge_index,
158 j if arg[-2:] == '_j' else i)
159
~/anaconda3/envs/py38/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py in __lift__(self, src, edge_index, dim)
125 if isinstance(edge_index, Tensor):
126 index = edge_index[dim]
--> 127 return src.index_select(self.node_dim, index)
128 elif isinstance(edge_index, SparseTensor):
129 if dim == 1:
RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 667946000000 bytes. Error code 12 (Cannot allocate memory)
```
Is there any reason why these two layer types use so much memory? Is there something I can do to solve this problem?
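For reference, the failing line in the traceback is `src.index_select(self.node_dim, index)` inside `__lift__`, which copies the source-node features once per edge, i.e. it materializes a tensor of shape `[num_edges, num_features]`. A quick sanity check of how big that copy would be, as a sketch (using the `data` object from the snippet above):

```python
# Estimate the size of the x_j tensor that propagate() materializes when
# given a dense edge_index. All names refer to the Data object from above.
num_edges = data.edge_index.size(1)
num_features = data.x.size(1)
lifted_bytes = num_edges * num_features * data.x.element_size()
print(f"x_j would need ~{lifted_bytes / 1e9:.1f} GB")
```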
EDIT: In the case of SAGEConv, I managed to run it by using the SparseTensor functionality. However, it seems GraphConv does not work with SparseTensors.
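A minimal sketch of that SparseTensor route, assuming torch-sparse is installed and using the `ToSparseTensor` transform:

```python
import torch_geometric.transforms as T
from torch_geometric.nn import SAGEConv

# Convert edge_index into a SparseTensor adjacency (stored as data.adj_t).
# With a SparseTensor, propagate() can use fused sparse-dense matmul instead
# of materializing a [num_edges, num_features] copy of the node features.
data = T.ToSparseTensor()(data)

conv = SAGEConv(dataset_num_features, 16)
out = conv(data.x, data.adj_t)  # pass adj_t where edge_index went before
```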
Top GitHub Comments
Yes, I understand `project` is `False` by default. We can conduct dimensionality reduction in `project`, and the users can decide to use it or not. Then, the OOM issue is solved. You can close this issue now. Thanks for your nice work on this wonderful GNN library.
Note that `project` is `False` by default. However, you are right, the first transformation will not do a dimensionality reduction.
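The traceback above also hints at why GCNConv and GATConv stay cheap while GraphConv and SAGEConv blow up: `graph_conv.py` applies `self.lin_l(out)` only after `self.propagate(...)`, so the full-width input features are copied per edge, whereas GCNConv and GATConv project the features down to the 16 output dimensions before propagation. Until `project` offers an explicit output size, one workaround is to shrink the features yourself before the conv. A minimal sketch (the 64-dimensional bottleneck and the class name are illustrative choices, not part of the library):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GraphConv

class ProjectedNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Reduce the feature width *before* message passing, so the lifted
        # x_j tensor is [num_edges, 64] instead of
        # [num_edges, dataset_num_features].
        self.project = nn.Linear(dataset_num_features, 64)
        self.conv1 = GraphConv(64, 16, aggr='mean')
        self.lin = nn.Linear(16, dataset_num_classes)

    def forward(self, data):
        x = F.relu(self.project(data.x))
        x = F.relu(self.conv1(x, data.edge_index))
        x = F.dropout(x, p=0.8, training=self.training)
        return self.lin(x)
```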