
'BertEncoder' object has no attribute 'gradient_checkpointing'

See original GitHub issue

Who can help

@LysandreJik

Information

The model I am using is BERT.

I get an error when I call the function test().

The function test() is defined as follows:

def test():
    bert.eval()  # inference mode: disables dropout
    bert_outputs = []

    with torch.no_grad():
        for unw, data in enumerate(test_loader, 0):
            # move the batch tensors to the target device
            ids = data['ids'].to(device, dtype = torch.long)
            mask = data['mask'].to(device, dtype = torch.long)
            token_type_ids = data['token_type_ids'].to(device, dtype = torch.long)
            targets = data['targets'].to(device, dtype = torch.float)
            # forward pass through the custom BERT wrapper
            outputs = bert(ids, mask, token_type_ids)

            # collect sigmoid probabilities as plain Python lists
            bert_outputs.extend(torch.sigmoid(outputs).cpu().detach().numpy().tolist())

    return bert_outputs

The full traceback is as follows:

AttributeError                            Traceback (most recent call last)
<ipython-input-51-833f1f639ea7> in <module>()
     18   test_loader = DataLoader(test_dataset, **bert_test_params)
     19 
---> 20   test_outputs = test()
     21 
     22   test_outputs = np.array(test_outputs)

<ipython-input-50-050b63b5247c> in test()
      9             token_type_ids = data['token_type_ids'].to(device, dtype = torch.long)
     10             targets = data['targets'].to(device, dtype = torch.float)
---> 11             outputs = bert(ids, mask, token_type_ids)
     12 
     13             bert_outputs.extend(torch.sigmoid(outputs).cpu().detach().numpy().tolist())

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

<ipython-input-45-1b7273ac2d08> in forward(self, ids, mask, token_type_ids, return_dict)
      7 
      8     def forward(self, ids, mask, token_type_ids, return_dict = False):
----> 9         unw, out_1 = self.layer1(ids, attention_mask = mask, token_type_ids = token_type_ids)[0], self.layer1(ids, attention_mask = mask, token_type_ids = token_type_ids)[1]
     10         out_2 = self.layer2(out_1)
     11         out_final = self.layer3(out_2)

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/transformers/models/bert/modeling_bert.py in forward(self, input_ids, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict)
   1003             output_attentions=output_attentions,
   1004             output_hidden_states=output_hidden_states,
-> 1005             return_dict=return_dict,
   1006         )
   1007         sequence_output = encoder_outputs[0]

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/transformers/models/bert/modeling_bert.py in forward(self, hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask, past_key_values, use_cache, output_attentions, output_hidden_states, return_dict)
    557             past_key_value = past_key_values[i] if past_key_values is not None else None
    558 
--> 559             if self.gradient_checkpointing and self.training:
    560 
    561                 if use_cache:

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
   1129                 return modules[name]
   1130         raise AttributeError("'{}' object has no attribute '{}'".format(
-> 1131             type(self).__name__, name))
   1132 
   1133     def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:

AttributeError: 'BertEncoder' object has no attribute 'gradient_checkpointing'

I am trying to perform sentiment analysis on tweets. The same code ran fine a few weeks ago without any errors, but now it fails with the error above. I searched online for possible fixes but could not find any for this specific problem.
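
As context for the traceback above (this note is not part of the original report): the final frame is nn.Module.__getattr__, which only resolves registered parameters, buffers, and sub-modules, so a plain Python attribute that was never assigned in __init__ fails in exactly this way. A minimal, self-contained illustration of the mechanism, using a made-up TinyEncoder in place of the real BertEncoder:

import torch
from torch import nn

class TinyEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)
        # note: gradient_checkpointing is deliberately never set here

    def forward(self, x):
        # reading self.gradient_checkpointing falls through to nn.Module.__getattr__,
        # which raises because the attribute was never assigned in __init__
        if self.gradient_checkpointing and self.training:
            pass
        return self.linear(x)

enc = TinyEncoder().eval()
try:
    enc(torch.zeros(1, 4))
except AttributeError as e:
    print(e)  # 'TinyEncoder' object has no attribute 'gradient_checkpointing'

In this report the error disappears after the transformers upgrade described in the comments below.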

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 12 (5 by maintainers)

Top GitHub Comments

1 reaction
venkatesh-kulkarni commented, Oct 8, 2021

Updating the transformers version to 4.10.1 fixed it. Thanks a lot for your help, @sgugger!
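
A minimal sketch of applying that fix (an assumption on my part: a pip-managed environment such as the notebook runtime shown in the traceback; restart the runtime afterwards if transformers was already imported):

import subprocess, sys

# upgrade transformers to at least the release reported to fix this issue
subprocess.check_call(
    [sys.executable, "-m", "pip", "install", "--upgrade", "transformers>=4.10.1"]
)

import transformers
print(transformers.__version__)  # expect 4.10.1 or newer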

0 reactions
github-actions[bot] commented, Nov 12, 2021

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Read more comments on GitHub >

Top Results From Across the Web

'BertEncoder' object has no attribute 'gradient_checkpointing'
I'm getting a strange error that previously worked OK. I'm only trying to use a previously trained NLP model to predict a label....

'GPT2Model' object has no attribute 'gradient_checkpointing ...
This issue is found to be occurring only if the framework is run using venv or deployment frameworks like uWSGI or gunicorn.

Getting error while using transform · Issue #271 - GitHub
AttributeError: 'BertEncoder' object has no attribute 'gradient_checkpointing'.

[Notes] Gradient Checkpointing with BERT - Veritable Tech Blog
Overview. Gradient checkpointing is a technique that reduces the memory footprint during model training (from O(n) to O(sqrt(n)) in the OpenAI ...

Explore Gradient-Checkpointing in PyTorch - Qingyang's Log
This is a practical analysis of how Gradient-Checkpointing is implemented in PyTorch, and how to use it in Transformer models like BERT and... (see the sketch below)
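
The last two results describe the gradient-checkpointing technique that the missing attribute controls. As background only (not code from the issue above), here is a minimal, self-contained sketch using PyTorch's built-in torch.utils.checkpoint: activations inside the checkpointed block are not stored during the forward pass and are recomputed during backward, trading extra compute for lower memory.

import torch
from torch import nn
from torch.utils.checkpoint import checkpoint

# a small stand-in for one transformer layer
block = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))
x = torch.randn(8, 128, requires_grad=True)

out = checkpoint(block, x)   # forward pass without storing intermediate activations
out.sum().backward()         # intermediates are recomputed here to compute gradients
print(x.grad.shape)          # torch.Size([8, 128])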
