🐛 Trainer on TPU : KeyError '__getstate__'

🐛 Bug

Information

Model I am using: ELECTRA base

Language I am using the model on: English

The problem arises when using:

  • the official example scripts
  • my own modified scripts

The task I am working on is:

  • an official GLUE/SQuAD task
  • my own task or dataset

To reproduce

I’m trying to fine-tune a model on a Colab TPU using the new Trainer API, but I’m struggling.

Here is a self-contained Colab notebook to reproduce the error (it’s a dummy example).
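
As a point of reference, here is a minimal sketch of the kind of setup involved. It is a hypothetical reconstruction written against the current transformers API, not the contents of the actual notebook; the checkpoint, head, dataset, and hyperparameters are placeholders.

# Hypothetical sketch of a dummy fine-tuning setup (not the actual notebook).
# Written against the current transformers API; exact class names and kwargs
# in the 2.9-era notebook may differ.
import torch
from torch.utils.data import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

class DummyDataset(Dataset):
    """Tiny dataset whose items are the tokenizer's BatchEncoding objects."""
    def __init__(self, tokenizer, texts, labels):
        self.encodings = [
            tokenizer(t, padding="max_length", truncation=True, max_length=32)
            for t in texts
        ]
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = self.encodings[idx]   # a BatchEncoding, not a plain dict
        item["labels"] = self.labels[idx]
        return item

tokenizer = AutoTokenizer.from_pretrained("google/electra-base-discriminator")
model = AutoModelForSequenceClassification.from_pretrained(
    "google/electra-base-discriminator", num_labels=2
)
train_dataset = DummyDataset(tokenizer, ["a dummy sentence", "another one"], [0, 1])

args = TrainingArguments(output_dir="out", num_train_epochs=1)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()  # on a Colab TPU the batches go through torch_xla's parallel loader (see trace below)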

When running the notebook, I get the following error:

File "/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py", line 199, in __getattr__
    return self.data[item]
KeyError: '__getstate__'

Full stack trace:

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/distributed/parallel_loader.py", line 172, in _worker
    batch = xm.send_cpu_data_to_device(batch, device)
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/core/xla_model.py", line 624, in send_cpu_data_to_device
    return ToXlaTensorArena(convert_fn, select_fn).transform(data)
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/core/xla_model.py", line 307, in transform
    return self._replace_tensors(inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/core/xla_model.py", line 301, in _replace_tensors
    convert_fn)
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/utils/utils.py", line 199, in for_each_instance_rewrite
    return _for_each_instance_rewrite(value, select_fn, fn, rwmap)
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/utils/utils.py", line 179, in _for_each_instance_rewrite
    result.append(_for_each_instance_rewrite(x, select_fn, fn, rwmap))
  File "/usr/local/lib/python3.6/dist-packages/torch_xla/utils/utils.py", line 187, in _for_each_instance_rewrite
    result = copy.copy(value)
  File "/usr/lib/python3.6/copy.py", line 96, in copy
    rv = reductor(4)
  File "/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils.py", line 199, in __getattr__
    return self.data[item]
KeyError: '__getstate__'
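
The KeyError itself is a known side effect of an object whose __getattr__ falls through to a data dict: when copy.copy (called from torch_xla's parallel loader, per the frames above) probes the object for __getstate__, the lookup raises KeyError instead of the AttributeError that copy/pickle expect for a missing hook. Below is a minimal, library-free sketch of the same failure mode; the Batch class is hypothetical, not the transformers implementation.

import copy

class Batch:
    """Hypothetical stand-in for a dict-like container whose attribute access
    falls through to its data dict, mimicking the __getattr__ in the trace above."""

    def __init__(self, data):
        self.data = data

    def __getattr__(self, item):
        # A missing attribute raises KeyError here, not the AttributeError
        # that copy/pickle expect when probing optional hooks.
        return self.data[item]

b = Batch({"input_ids": [1, 2, 3]})
print(b.input_ids)  # [1, 2, 3] -- attribute access works for real keys

try:
    copy.copy(b)  # probes b.__getstate__ -> KeyError: '__getstate__' on Python 3.6-3.10
except Exception as exc:
    # Python 3.11+ added object.__getstate__ and behaves differently here.
    print(type(exc).__name__, exc)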

Any hint on how to make this dummy example work is welcome.

Environment info

  • transformers version: 2.9.0
  • Platform: Linux-4.19.104+-x86_64-with-Ubuntu-18.04-bionic
  • Python version: 3.6.9
  • PyTorch version (GPU?): 1.6.0a0+cf82011 (False)
  • Tensorflow version (GPU?): 2.2.0 (False)
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

@jysohn23 @julien-c

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction · julien-c commented, May 14, 2020

0 reactions · LysandreJik commented, Jul 28, 2020

Indeed, this should have been fixed in the versions v3+. Thanks for opening an issue @Colanim.
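
The direct fix, as the comment above says, is to upgrade to transformers v3 or later. For anyone pinned to a 2.x release, one common workaround (a sketch under assumptions, not verified against the original notebook) is to hand the Trainer plain dicts of tensors instead of BatchEncoding objects, so nothing with the problematic __getattr__ ever reaches torch_xla's parallel loader:

# Hypothetical workaround sketch for transformers 2.x; names are placeholders.
# Unpack the tokenizer output into a plain dict so copy.copy in torch_xla's
# parallel loader never touches BatchEncoding.__getattr__.
import torch
from torch.utils.data import Dataset

class PlainDictDataset(Dataset):
    def __init__(self, tokenizer, texts, labels):
        self.tokenizer, self.texts, self.labels = tokenizer, texts, labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        enc = self.tokenizer.encode_plus(
            self.texts[idx], max_length=32, pad_to_max_length=True
        )
        item = {k: torch.tensor(v) for k, v in enc.items()}  # plain dict, no BatchEncoding
        item["labels"] = torch.tensor(self.labels[idx])
        return item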
