Global Accelerator.rng_types is passed and modified when preparing dataloaders
Hi,
Correct me if I’m wrong but:
The seed for the sampler generator is currently set with generator.manual_seed(int(torch.empty((), dtype=torch.int64).random_().item())) (ref).
To my understanding, without setting an initial seed, the sampler generator is different across GPUs and should be synchronized at every step (or at least the first time iter() is called). This synchronization only happens if rng_types contains ['generator'] at that point.
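A minimal sketch of why the synchronization matters (single process, simulating two ranks; the seeding line mirrors the one quoted above, the rest is illustrative):

import torch

def make_sampler_generator():
    # Seed the generator the way the quoted line does: with a fresh random
    # 64-bit value, so every process ends up with a different seed.
    generator = torch.Generator()
    generator.manual_seed(int(torch.empty((), dtype=torch.int64).random_().item()))
    return generator

# Two "processes" build their own generators: the shuffle orders diverge
# unless they are synchronized before iteration.
print(torch.randperm(10, generator=make_sampler_generator()).tolist())
print(torch.randperm(10, generator=make_sampler_generator()).tolist())  # almost surely a different permutation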
However, when preparing a dataloader, the global accelerator.rng_types list is passed by reference and can then be modified in place ('generator' is removed from it) (ref). This happens when preparing a dataloader that doesn't shuffle, such as the eval loader.
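The mechanism is plain Python list aliasing. A hypothetical sketch (the function and its logic are made up for illustration, not Accelerate's actual internals):

def prepare_one_loader(shuffles, rng_types):
    # Stand-in for the prepare step: a non-shuffling loader has no generator
    # to synchronize, so the entry is dropped from the list it was handed.
    if not shuffles and "generator" in rng_types:
        rng_types.remove("generator")  # mutates the caller's list in place

global_rng_types = ["generator"]              # the accelerator-level list
prepare_one_loader(True, global_rng_types)    # train loader: list untouched
prepare_one_loader(False, global_rng_types)   # eval loader: 'generator' removed
print(global_rng_types)                       # [] -> later synchronization is skipped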
So I thought that after calling
train_loader, eval_loader = accelerator.prepare(train_loader, eval_loader)
the rng_types list is now empty, and the train_loader sampler generator is not synchronized.
Should this be an issue? If not, what is the logic?
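One way to check whether this is what actually happens (assuming accelerator.rng_types is readable as an attribute, as referred to above):

import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()
dataset = TensorDataset(torch.arange(10, dtype=torch.float32).unsqueeze(1))
train_loader = DataLoader(dataset, batch_size=5, shuffle=True)
eval_loader = DataLoader(dataset, batch_size=5, shuffle=False)

print("before prepare:", accelerator.rng_types)
train_loader, eval_loader = accelerator.prepare(train_loader, eval_loader)
print("after prepare: ", accelerator.rng_types)  # is 'generator' still in the list?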
Note: if the same seed were required for all processes, then dropout wouldn't operate independently across GPUs (ref).
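To illustrate that note, a small sketch: with an identical seed on every process, dropout draws the same mask everywhere.

import torch
import torch.nn.functional as F

def dropout_mask(seed):
    # Simulate one process: reseed the global RNG, then draw a dropout mask.
    torch.manual_seed(seed)
    return F.dropout(torch.ones(8), p=0.5, training=True)

print(torch.equal(dropout_mask(66), dropout_mask(66)))  # True: same seed, same mask
print(torch.equal(dropout_mask(66), dropout_mask(67)))  # almost surely False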
Example code:
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
from torch.utils.data import Dataset, DataLoader
import random
from accelerate import Accelerator


def set_seed(seed):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)


class CustomDataset(Dataset):
    def __init__(self, size=10) -> None:
        super().__init__()
        self.size = size

    def __getitem__(self, i):
        return torch.Tensor([i]), torch.Tensor([1, i, 2])

    def __len__(self):
        return self.size


class CustomModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(1, 3)
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, x):
        return self.dropout(self.linear1(x))


def main():
    accelerator = Accelerator()
    # Different seed per process, so dropout behaves independently across GPUs.
    set_seed(66 + accelerator.process_index)
    device = torch.device('cuda:7')
    dataset = CustomDataset(size=10)
    dataloader = DataLoader(dataset, batch_size=5, num_workers=0, shuffle=True)
    eval_dataloader = DataLoader(dataset, batch_size=5, num_workers=0, shuffle=False)
    model = CustomModel()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.5)
    model, dataloader, optimizer, eval_dataloader = accelerator.prepare(
        model, dataloader, optimizer, eval_dataloader
    )
    model.train()
    for input_, output_ in dataloader:
        res = model(input_)
        print(f'{accelerator.process_index}, {input_}')


if __name__ == '__main__':
    main()
-----
Output:
1, tensor([[2.],
[8.],
[4.],
[7.],
[6.]], device='cuda:1')
0, tensor([[9.],
[8.],
[0.],
[2.],
[3.]], device='cuda:0')
Thanks.
Top GitHub Comments
No, it is, as the name indicates, a BatchSampler. It only computes indices (specifically the indices of the elements we want in each batch); the access to the Dataset is done later, inside the DataLoader.
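For illustration, a minimal sketch of that separation (the BatchSampler only yields lists of indices; data access happens later in the DataLoader):

import torch
from torch.utils.data import BatchSampler, RandomSampler

generator = torch.Generator()
generator.manual_seed(0)
sampler = RandomSampler(range(10), generator=generator)
batch_sampler = BatchSampler(sampler, batch_size=5, drop_last=False)

# No dataset item is touched here: only index batches are produced.
for batch_indices in batch_sampler:
    print(batch_indices)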
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.