AttributeError: 'DistributedDataParallel' object has no attribute 'generate'
🐛 Describe the bug
When I ran `accelerate launch examples/ppo_sentiments.py`, the error below occurred. Am I supposed to unwrap the DDP model? (See the workaround sketch after the traceback.)
AttributeError: 'DistributedDataParallel' object has no attribute 'generate'
╭─────────────────────────────── Traceback (most recent call last) ───────────────────────────────╮
│ /home/user/bob_workspace/code/trlx/examples/ppo_sentiments.py:38 in <module>
│
│   35 │   orch: PPOOrchestrator = get_orchestrator(cfg.train.orchestrator)(
│   36 │   │   model, pipeline, reward_fn=reward_fn, chunk_size=cfg.method.chunk_size
│   37 │   )
│ ❱ 38 │   orch.make_experience(cfg.method.num_rollouts)
│   39 │   model.learn()
│   40 │
│   41 │   print("DONE!")
│
│ /home/user/bob_workspace/code/trlx/trlx/orchestrator/ppo_orchestrator.py:64 in make_experience
│
│   63 │   │   │
│ ❱ 64 │   │   │   query_tensors, response_tensors, response_text = self.rl_model.act(batch)
│   65 │   │   │   texts = [q + r for q, r in zip(batch.text, response_text)]
│   66 │   │   │   scores = self.score(texts)
│
│ /home/user/bob_workspace/code/trlx/trlx/model/accelerate_base_model.py:121 in act
│
│   118 │   │   │   │   self.dummy_input.to(self.accelerator.device)
│   119 │   │   │   )  # Dummy pass to make things play nice with accelerate
│   120 │   │   │   # Removed synced gpus
│ ❱ 121 │   │   │   response = self.model.generate(
│   122 │   │   │   │   query_tensors,
│   123 │   │   │   │   pad_token_id=self.tokenizer.eos_token_id,
│   124 │   │   │   │   **self.config.method.gen_kwargs
│
│ /opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py:1185 in __getattr__
│
│   1182 │   │   │   modules = self.__dict__['_modules']
│   1183 │   │   │   if name in modules:
│   1184 │   │   │   │   return modules[name]
│ ❱ 1185 │   │   raise AttributeError("'{}' object has no attribute '{}'".format(
│   1186 │   │   │   type(self).__name__, name))
│   1187 │
│   1188 │   def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
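For context, `DistributedDataParallel` only proxies `forward()`; any other method, `generate()` included, lives on the wrapped module, which is why `Module.__getattr__` raises here. Below is a minimal sketch of the usual workaround using Accelerate's `unwrap_model`; the `safe_generate` helper and its signature are my own illustration, not the actual patch the maintainers merged:

```python
import torch
from accelerate import Accelerator

def safe_generate(accelerator: Accelerator, model, query_tensors, tokenizer, **gen_kwargs):
    # DDP only forwards __call__/forward() to the inner module, so generate()
    # has to be called on the unwrapped model. unwrap_model() is a no-op when
    # the model isn't wrapped, so the same code works on a single GPU too.
    unwrapped = accelerator.unwrap_model(model)
    with torch.no_grad():
        return unwrapped.generate(
            query_tensors,
            pad_token_id=tokenizer.eos_token_id,
            **gen_kwargs,
        )
```

Equivalently, `self.model.module.generate(...)` works when the model is known to be DDP-wrapped, but `unwrap_model` also covers the single-process case where no wrapper exists.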
My `accelerate` config:
- `Accelerate` version: 0.13.2
- Platform: Linux-5.4.0-107-generic-x86_64-with-glibc2.31
- Python version: 3.9.5
- Numpy version: 1.23.4
- PyTorch version (GPU?): 1.11.0 (True)
- `Accelerate` default config:
- compute_environment: LOCAL_MACHINE
- distributed_type: MULTI_GPU
- mixed_precision: no
- use_cpu: False
- num_processes: 8
- machine_rank: 0
- num_machines: 1
- gpu_ids: all
- main_process_ip: None
- main_process_port: None
- rdzv_backend: static
- same_network: True
- main_training_function: main
- deepspeed_config: {}
- fsdp_config: {}
- downcast_bf16: no
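For what it's worth, with `distributed_type: MULTI_GPU` Accelerate's `prepare()` wraps the model in `DistributedDataParallel`, which is exactly the wrapper named in the traceback; the error should not reproduce in a single process. A quick check (the `Linear` stand-in model is illustrative):

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()
model = torch.nn.Linear(4, 4)  # stand-in for the trlX policy model
model = accelerator.prepare(model)

# Under `accelerate launch` with MULTI_GPU this prints
# "DistributedDataParallel"; run as a plain script it stays "Linear".
print(type(model).__name__)
```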
Which trlX version are you using?
trlx==1.0.0
Additional system and package information
No response
Top GitHub Comments
We're about to merge something that fixes this, give us a few hours. Thanks
Yes!