🐛 Trainer on TPU: KeyError '__getstate__'
🐛 Bug
Information
Model I am using : ELECTRA base
Language I am using the model on : English
The problem arises when using:
- the official example scripts
- my own modified scripts
The task I am working on is:
- an official GLUE/SQuAD task
- my own task or dataset
To reproduce
I’m trying to fine-tune a model on a Colab TPU using the new Trainer API, but I’m struggling.
Here is a self-contained Colab notebook to reproduce the error (it’s a dummy example).
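For reference, here is a rough sketch of the shape of that notebook (hedged: the checkpoint name, dummy data, hyperparameters, and the sequence-classification head are placeholders, not necessarily what the linked notebook uses):

```python
# Hedged sketch of a Trainer fine-tuning setup on a Colab TPU runtime.
# Assumptions: the ELECTRA checkpoint, the dummy texts/labels, and the
# sequence-classification head are placeholders; the linked notebook may differ.
import torch
from torch.utils.data import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

checkpoint = "google/electra-base-discriminator"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)


class DummyDataset(Dataset):
    """Tiny in-memory dataset built from the tokenizer's output."""

    def __init__(self, texts, labels):
        self.encodings = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item


train_dataset = DummyDataset(["a good example", "a bad example"] * 8, [1, 0] * 8)

# With torch_xla installed (as on a Colab TPU runtime), Trainer places the model
# on the XLA device and feeds batches through torch_xla's parallel loader;
# that TPU data-loading path is where the error below is raised.
args = TrainingArguments(output_dir="electra-tpu-dummy", num_train_epochs=1)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```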
When running the notebook, I get the following error:
File "/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py", line 199, in __getattr__
return self.data[item]
KeyError: '__getstate__'
Full stack trace:
Exception in thread Thread-3:
Traceback (most recent call last):
File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/usr/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch_xla/distributed/parallel_loader.py", line 172, in _worker
batch = xm.send_cpu_data_to_device(batch, device)
File "/usr/local/lib/python3.6/dist-packages/torch_xla/core/xla_model.py", line 624, in send_cpu_data_to_device
return ToXlaTensorArena(convert_fn, select_fn).transform(data)
File "/usr/local/lib/python3.6/dist-packages/torch_xla/core/xla_model.py", line 307, in transform
return self._replace_tensors(inputs)
File "/usr/local/lib/python3.6/dist-packages/torch_xla/core/xla_model.py", line 301, in _replace_tensors
convert_fn)
File "/usr/local/lib/python3.6/dist-packages/torch_xla/utils/utils.py", line 199, in for_each_instance_rewrite
return _for_each_instance_rewrite(value, select_fn, fn, rwmap)
File "/usr/local/lib/python3.6/dist-packages/torch_xla/utils/utils.py", line 179, in _for_each_instance_rewrite
result.append(_for_each_instance_rewrite(x, select_fn, fn, rwmap))
File "/usr/local/lib/python3.6/dist-packages/torch_xla/utils/utils.py", line 187, in _for_each_instance_rewrite
result = copy.copy(value)
File "/usr/lib/python3.6/copy.py", line 96, in copy
rv = reductor(4)
File "/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py", line 199, in __getattr__
return self.data[item]
KeyError: '__getstate__'
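For what it's worth, the traceback suggests the batch handed to torch_xla's ParallelLoader is a BatchEncoding, whose __getattr__ (transformers 2.9.0, tokenization_utils.py line 199) forwards unknown attribute lookups to its underlying data dict. When copy.copy probes the object for __getstate__, that lookup raises KeyError instead of the AttributeError the copy machinery expects. A minimal, self-contained sketch of that pattern (FakeBatchEncoding is illustrative, not the real class):

```python
import copy
from collections import UserDict


class FakeBatchEncoding(UserDict):
    """Illustrative stand-in for the 2.9.0-era BatchEncoding (not the real class)."""

    def __getattr__(self, item):
        # Problematic pattern: every missing attribute lookup is routed to the
        # underlying data dict, so it raises KeyError where callers such as
        # copy.copy expect AttributeError.
        return self.data[item]


enc = FakeBatchEncoding({"input_ids": [101, 2023, 102]})

# getattr() with a default only swallows AttributeError, so the KeyError leaks out:
try:
    getattr(enc, "some_missing_attribute", None)
except KeyError as err:
    print("KeyError instead of AttributeError:", err)

# copy.copy() makes the same kind of probe (for __getstate__ on Python 3.6),
# which is what torch_xla's send_cpu_data_to_device hits in the trace above.
# (Behaviour can differ on newer Pythons, which define object.__getstate__.)
try:
    copy.copy(enc)
except KeyError as err:
    print("copy.copy failed with KeyError:", err)
```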
Any hint on how to make this dummy example work would be welcome.
Environment info
- transformers version: 2.9.0
- Platform: Linux-4.19.104+-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.6.9
- PyTorch version (GPU?): 1.6.0a0+cf82011 (False)
- Tensorflow version (GPU?): 2.2.0 (False)
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Top GitHub Comments
Cc @mfuntowicz
Indeed, this should have been fixed in the v3+ versions. Thanks for opening an issue @Colanim.
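For anyone hitting this on transformers 2.x, a hedged sketch of the possible mitigations (the unwrap-to-a-plain-dict trick below is an assumption based on the failure mode above, not an officially documented workaround):

```python
# Preferred: upgrade to a transformers release containing the fix, e.g.
#   pip install --upgrade "transformers>=3.0.0"
#
# Hypothetical workaround while staying on 2.9.0: make sure the examples the
# Trainer collates are plain dicts of tensors rather than BatchEncoding
# objects, so torch_xla's ParallelLoader never tries to copy a BatchEncoding.
def unwrap_batch_encoding(encoding):
    # BatchEncoding is a mapping, so dict() copies its contents into a plain dict.
    return dict(encoding)
```

The unwrap step would go wherever the tokenizer output is returned (e.g. the dataset's __getitem__ or a custom data collator), so that the batch reaching the TPU loader is an ordinary dict.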