Tokenization_utils doesn't work with Pytorch-Lightning on 2.10.0 version
🐛 Bug
Information
Model I am using (Bert, XLNet …): Bert with pytorch-lightning
Language I am using the model on (English, Chinese …): English
The problem arises when using
- my own modified scripts: Pytorch Dataset with tokenizer inside
The task I am working on is:
- my own task or dataset
To reproduce
Take a look at this colab link.
I’ve copied the method from pytorch-lightning that raises an error with transformers 2.10.0 when it processes a batch. Doing the same with transformers 2.8.0 causes no error.
/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py in __getattr__(self, item)
201
202 def __getattr__(self, item: str):
--> 203 return self.data[item]
204
205 def keys(self):
KeyError: 'cuda'
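The traceback can be reproduced without either library. The sketch below is a minimal illustration, not the actual transformers or pytorch-lightning code: EncodingLike is a hypothetical stand-in that copies only the __getattr__ shown in the traceback, and probe_cuda assumes the batch is probed with getattr(batch, "cuda", None), which is an assumption about how the copied method checks for a cuda method. getattr with a default swallows only AttributeError, so the KeyError raised by the dictionary lookup escapes.

```python
from collections import UserDict


class EncodingLike(UserDict):
    """Hypothetical stand-in that copies only the __getattr__ from the traceback."""

    def __getattr__(self, item: str):
        # Unknown attributes are looked up as dictionary keys, so a miss
        # raises KeyError instead of AttributeError.
        return self.data[item]


def probe_cuda(batch):
    # Assumed probing pattern: the default value is returned only when
    # the attribute lookup raises AttributeError.
    return callable(getattr(batch, "cuda", None))


batch = EncodingLike({"input_ids": [101, 2023, 102]})

try:
    probe_cuda(batch)
    outcome = "no error"
except KeyError as err:
    outcome = f"KeyError: {err}"

print(outcome)                 # KeyError: 'cuda'
print(probe_cuda(batch.data))  # False: a plain dict raises AttributeError, which getattr handles
```

Under these assumptions, a plain dict passes the probe cleanly while the UserDict wrapper does not, which matches the difference seen between the two transformers versions.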
Expected behavior
No error
- transformers version: 2.10.0
- Platform: Linux
- Python version: 3.6.9
- PyTorch version (GPU?): 1.5.0+cu101
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No
Top GitHub Comments
BatchEncoding is indeed a UserDict. If you want to access the actual dict, you can use its data attribute.
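One way to apply this in a Dataset that tokenizes inside __getitem__ is to return the data attribute rather than the wrapper itself. This is a rough sketch under stated assumptions: EncodingLike, TextDataset and fake_tokenize are hypothetical names used only for illustration, and in real code the encoding would come from the tokenizer.

```python
from collections import UserDict


class EncodingLike(UserDict):
    """Hypothetical stand-in for the object returned by the tokenizer."""

    def __getattr__(self, item: str):
        return self.data[item]


def fake_tokenize(text):
    # Hypothetical tokenizer: one "id" per whitespace-separated word.
    return EncodingLike({"input_ids": [len(word) for word in text.split()]})


class TextDataset:
    """Hypothetical dataset with the tokenizer inside."""

    def __init__(self, texts, tokenize):
        self.texts = texts
        self.tokenize = tokenize

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        encoding = self.tokenize(self.texts[idx])
        # Hand back the underlying plain dict, not the UserDict wrapper.
        return encoding.data


dataset = TextDataset(["hello world"], fake_tokenize)
item = dataset[0]
print(type(item) is dict)  # True
```

Because each item is then an ordinary dict, code that checks for a cuda attribute on it gets a normal AttributeError, and getattr with a default behaves as expected.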
Thank you! So it’s not a bug, but an expected behaviour