
tokenization_utils doesn't work with PyTorch Lightning on transformers 2.10.0

See original GitHub issue

🐛 Bug

Information

Model I am using (Bert, XLNet …): BERT with PyTorch Lightning

Language I am using the model on (English, Chinese …): English

The problem arises when using

  • my own modified scripts: a PyTorch Dataset with the tokenizer inside

The task I am working on is:

  • my own task or dataset

To reproduce

Take a look at this colab link.

I’ve copied the method from pytorch-lightning, which raises an error on transformers 2.10.0 when processing a batch. Doing the same with transformers 2.8.0 causes no error.

/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py in __getattr__(self, item)
    201 
    202     def __getattr__(self, item: str):
--> 203         return self.data[item]
    204 
    205     def keys(self):

KeyError: 'cuda'
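The traceback makes the cause visible: in transformers 2.10.0, BatchEncoding forwards any unknown attribute lookup to the underlying dict via __getattr__, so when PyTorch Lightning calls batch.cuda() to move the batch to the GPU, the lookup becomes self.data['cuda'] and raises KeyError. A minimal sketch of the mechanism, using a simplified stand-in class rather than the real transformers code:

```python
from collections import UserDict

class BatchEncoding(UserDict):
    """Simplified stand-in for transformers.BatchEncoding (as of 2.10.0)."""
    def __getattr__(self, item: str):
        # Any attribute that isn't a real attribute falls through to the
        # underlying dict, so batch.cuda() becomes self.data['cuda'].
        return self.data[item]

batch = BatchEncoding({"input_ids": [[101, 2023, 102]]})

try:
    batch.cuda()  # what PyTorch Lightning does when moving a batch to GPU
except KeyError as e:
    print(f"KeyError: {e!r}")  # the same failure as in the traceback above
```

Note that __getattr__ is only consulted after normal attribute lookup fails, which is why dict-style access like batch["input_ids"] keeps working while method-style calls such as .cuda() do not.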

Expected behavior

No error

  • transformers version: 2.10.0
  • Platform: Linux
  • Python version: 3.6.9
  • PyTorch version (GPU?): 1.5.0+cu101
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
mfuntowicz commented, May 29, 2020

BatchEncoding is indeed a UserDict; if you want to access the actual dict, you can use the data attribute:

be = tokenizer.batch_encode_plus(...)
be.data
0 reactions
sirily commented, May 29, 2020

Thank you! So it’s not a bug, but an expected behaviour
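For anyone hitting the same error, the workaround the maintainer points at can be sketched as follows. This again uses a simplified stand-in for BatchEncoding; with the real library you would build the batch with tokenizer.batch_encode_plus(..., return_tensors="pt") and move each tensor with v.to(device):

```python
from collections import UserDict

class BatchEncoding(UserDict):
    """Simplified stand-in for transformers.BatchEncoding."""
    def __getattr__(self, item: str):
        return self.data[item]

be = BatchEncoding({
    "input_ids": [[101, 2023, 102]],
    "attention_mask": [[1, 1, 1]],
})

# Grab the plain dict via .data instead of calling .cuda() on the object.
# With real torch tensors you would then do:
#   batch = {k: v.to(device) for k, v in be.data.items()}
plain = be.data
print(sorted(plain.keys()))  # ['attention_mask', 'input_ids']
```

Later transformers releases added a BatchEncoding.to(device) method, so on recent versions this manual unpacking is no longer necessary.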

Read more comments on GitHub >

Top Results From Across the Web

PyTorch Lightning Support? - Opacus
I’m trying to utilise Opacus with the PyTorch Lightning framework ... We have not worked yet on integrating PyTorch Lightning with Opacus.

Introducing the Initial Release of PyTorch Lightning for ...
The new PyTorch Lightning integration enables developers to run any PyTorch models on the IPU with minimal code changes and optimal performance.

train Swin_T[pytorch lightning] - Kaggle
Explore and run machine learning code with Kaggle Notebooks | Using data from PetFinder.my - Pawpularity Contest.

PyTorch Lightning - Production
This release has a major new package inside lightning, a multi-GPU metrics package! ... There are two key facts about the metrics package...

Pytorch lightning logger doesn't work as expected
I am a beginner in Pytorch lightning and I am trying to implement a NN and plot the graph (loss and accuracy) on...
