failed to reduce memory cost by using Dataloader
See original GitHub issue

❓ Questions & Help
I have a big graph whose x is (153364, 67), and I create the dataset like this:
    import torch
    from torch_geometric.data import Dataset

    class MyOwnDataset(Dataset):
        def __init__(self, root='./', transform=None, pre_transform=None):
            super(MyOwnDataset, self).__init__(root, transform, pre_transform)

        @property
        def raw_file_names(self):
            return ['some_file_1', 'some_file_2']

        @property
        def processed_file_names(self):
            return ['data.npz']

        def __len__(self):
            return len(self.processed_file_names)

        def _download(self):
            pass

        def _process(self):
            pass

        def get(self, idx=None):
            # Loads the entire graph as a single Data object.
            data = torch.load('data.npz')
            return data
and use the DataLoader like this:

    loader = DataLoader([data], batch_size=16)
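For context, PyTorch Geometric's DataLoader collates whole Data objects into a single Batch; with a one-element list there is only one graph to collate, so batch_size cannot split it. A minimal sketch illustrating this (the shapes are taken from the question and the edge_index is a dummy; DataLoader is imported from torch_geometric.loader in 2.x and from torch_geometric.data in older releases):

    import torch
    from torch_geometric.data import Data
    from torch_geometric.loader import DataLoader

    # One Data object holding the entire graph.
    data = Data(x=torch.randn(153364, 67),
                edge_index=torch.empty(2, 0, dtype=torch.long))

    loader = DataLoader([data], batch_size=16)
    for batch in loader:
        print(batch.num_graphs)  # 1 -- only one graph exists, so nothing is split
        print(batch.x.shape)     # torch.Size([153364, 67]) -- the full feature matrix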
But when I run the training loop:

    for batch in loader:
        optimizer.zero_grad()
        batch = batch.to(dev)
        loss = loss_op(model(batch.x, batch.train_edge_index)[batch.train_mask],
                       batch.y[batch.train_mask])
        loss.backward()
        optimizer.step()

batch.x is still (153364, 67), which causes the OOM error. I'm very confused by this and am looking forward to your reply.
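The usual way to cut memory for a single large graph is to sample subgraphs around mini-batches of nodes instead of batching whole graphs. Below is a minimal sketch, assuming a recent PyTorch Geometric (2.x) where NeighborLoader is available (NeighborSampler plays the same role in older releases). Here data, model, loss_op, optimizer and dev are the objects from the question, the num_neighbors values are illustrative, and the model is assumed to consume the sampled subgraph's edge_index:

    from torch_geometric.loader import NeighborLoader

    loader = NeighborLoader(
        data,                         # the single large Data object
        num_neighbors=[25, 10],       # neighbors sampled per node for each of the 2 hops
        batch_size=1024,              # number of seed nodes per mini-batch
        input_nodes=data.train_mask,  # only build mini-batches around training nodes
        shuffle=True,
    )

    for batch in loader:
        batch = batch.to(dev)
        optimizer.zero_grad()
        out = model(batch.x, batch.edge_index)
        # The first `batch.batch_size` rows of the output correspond to the seed nodes.
        loss = loss_op(out[:batch.batch_size], batch.y[:batch.batch_size])
        loss.backward()
        optimizer.step()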
Issue Analytics
- Created: 4 years ago
- Comments: 6 (3 by maintainers)
Top Results From Across the Web

Common error messages in Data Loader - Salesforce Help
Cause 2: Batch size is set to higher number. Try reducing the batch size to 1 so that records are processed one by...
Read more >

Bug in Dataloader · Issue #73153 · pytorch/pytorch - GitHub
I face a stochastic bug with the Dataloader. The traceback is as shown below. The error was observed at Epoch 18 of my...
Read more >

Increase Heap Size for Data Loader Started from dataloader ...
-Xmsn Specify the initial size, in bytes, of the memory allocation pool. This value must be a multiple of 1024 greater than 1MB....
Read more >

Pytorch. How does pin_memory work in Dataloader?
1 Answer 1 · You're not using non_blocking=True in your .cuda() calls. · You're not using multiprocessing in your DataLoader, which means...
Read more >

DataLoaders Explained: Building a Multi-Process Data Loader ...
Dataset for Tensorflow. These structures leverage parallel processing and pre-fetching in order reduce data loading time as much as possible. In ...
Read more >
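On the pin_memory / non_blocking point touched on in one of the results above, a minimal, generic PyTorch sketch (plain torch.utils.data rather than PyTorch Geometric, with illustrative tensor sizes): pinned host memory plus non_blocking=True lets host-to-device copies overlap with computation.

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Dummy dataset with 67 features per sample, matching the question's feature width.
    dataset = TensorDataset(torch.randn(10000, 67), torch.randint(0, 2, (10000,)))
    loader = DataLoader(dataset, batch_size=256, num_workers=4, pin_memory=True)

    for x, y in loader:
        # With pin_memory=True the batches live in page-locked memory, so these
        # copies can run asynchronously with respect to the host.
        x = x.to('cuda', non_blocking=True)
        y = y.to('cuda', non_blocking=True)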
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Awesome, you really are the most enthusiastic man I have ever seen on GitHub!
Yes, you are right. We still need to convert some existing operators to be able to automatically support bipartite graph message passing (this is why we reimplement SAGEConv in the example). You can simply use the SAGEConv operator defined in the example.

Edit: The SAGEConv in nn.conv now also uses the MessagePassing interface.
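For reference, a minimal sketch of a two-layer model built on the SAGEConv from torch_geometric.nn mentioned above; the hidden size and number of classes are placeholders, and the forward signature assumes a mini-batch setting where the sampled subgraph's edge_index is passed in:

    import torch
    import torch.nn.functional as F
    from torch_geometric.nn import SAGEConv

    class SAGE(torch.nn.Module):
        def __init__(self, in_channels, hidden_channels, out_channels):
            super().__init__()
            self.conv1 = SAGEConv(in_channels, hidden_channels)
            self.conv2 = SAGEConv(hidden_channels, out_channels)

        def forward(self, x, edge_index):
            x = F.relu(self.conv1(x, edge_index))
            return self.conv2(x, edge_index)

    # 67 input features as in the question; hidden size and class count are placeholders.
    model = SAGE(in_channels=67, hidden_channels=128, out_channels=2)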