While fine-tuning BERT with the new script, I am hitting the following error:

Traceback (most recent call last):
  File "run_mlm.py", line 310, in <module>
    main()
  File "run_mlm.py", line 259, in main
    load_from_cache_file=not data_args.overwrite_cache,
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/datasets/dataset_dict.py", line 300, in map
    for k, dataset in self.items()
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/datasets/dataset_dict.py", line 300, in <dictcomp>
    for k, dataset in self.items()
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/datasets/arrow_dataset.py", line 1256, in map
    update_data=update_data,
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/datasets/arrow_dataset.py", line 156, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/datasets/fingerprint.py", line 158, in wrapper
    self._fingerprint, transform, kwargs_for_fingerprint
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/datasets/fingerprint.py", line 105, in update_fingerprint
    hasher.update(transform_args[key])
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/datasets/fingerprint.py", line 57, in update
    self.m.update(self.hash(value).encode("utf-8"))
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/datasets/fingerprint.py", line 53, in hash
    return cls.hash_default(value)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/datasets/fingerprint.py", line 46, in hash_default
    return cls.hash_bytes(dumps(value))
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/datasets/utils/py_utils.py", line 367, in dumps
    dump(obj, file)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/datasets/utils/py_utils.py", line 339, in dump
    Pickler(file, recurse=True).dump(obj)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/dill/_dill.py", line 446, in dump
    StockPickler.dump(self, obj)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 409, in dump
    self.save(obj)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/dill/_dill.py", line 1438, in save_function
    obj.__dict__, fkwdefaults), obj=obj)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 610, in save_reduce
    save(args)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 751, in save_tuple
    save(element)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 736, in save_tuple
    save(element)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/dill/_dill.py", line 1170, in save_cell
    pickler.save_reduce(_create_cell, (f,), obj=obj)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 610, in save_reduce
    save(args)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 736, in save_tuple
    save(element)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 521, in save
    self.save_reduce(obj=obj, *rv)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 605, in save_reduce
    save(cls)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/dill/_dill.py", line 1365, in save_type
    obj.__bases__, _dict), obj=obj)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 610, in save_reduce
    save(args)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 751, in save_tuple
    save(element)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/dill/_dill.py", line 933, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 847, in _batch_setitems
    save(v)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/site-packages/dill/_dill.py", line 933, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 847, in _batch_setitems
    save(v)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 507, in save
    self.save_global(obj, rv)
  File "/home/ai-students/anaconda3/envs/env_nesara/lib/python3.6/pickle.py", line 927, in save_global
    (obj, module_name, name))
_pickle.PicklingError: Can't pickle typing.Union[str, NoneType]: it's not the same object as typing.Union


I am running the same script with the wikitext dataset mentioned earlier, but it fails with the error above.

@sgugger Could you please help me resolve this error?
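
For readers unfamiliar with the wording of the final line of the traceback: pickle stores classes and similar objects by name, and raises this kind of error when looking the name up again yields a different object. The sketch below reproduces that general mechanism with the standard `pickle` module only. It is an illustration under stated assumptions, not the code path from the traceback: the module name `shadow_demo`, the class name `Settings`, and the helper function are all hypothetical, and it does not involve `datasets`, `dill`, or `typing.Union`.

```python
import pickle
import sys
import types


def pickle_error_for_shadowed_name():
    """Return the PicklingError raised when a name resolves to another object."""
    # Hypothetical module, registered only for the duration of this demo.
    module = types.ModuleType("shadow_demo")
    sys.modules["shadow_demo"] = module
    try:
        # The class we try to pickle claims to live at shadow_demo.Settings.
        original = type("Settings", (), {})
        original.__module__ = "shadow_demo"
        # The name shadow_demo.Settings is bound to a *different* class object.
        module.Settings = type("Settings", (), {})
        try:
            pickle.dumps(original)
        except pickle.PicklingError as exc:
            return exc
        return None
    finally:
        # Clean up the temporary module registration.
        del sys.modules["shadow_demo"]


if __name__ == "__main__":
    print(pickle_error_for_shadowed_name())
```

Running it prints a message of the same "not the same object" family as the one in the traceback, which suggests the failure is about name lookup and object identity rather than about the data being processed.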

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 17 (8 by maintainers)

Top GitHub Comments

sgugger commented, Nov 6, 2020 (6 reactions)

Narrowing it down further: the bug appears in all Python versions up to and including 3.6.12, but disappears in Python 3.7.0.
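
A minimal sketch, not taken from the thread, of how one might check whether the current interpreter falls in the reported range. The helper names `is_affected` and `can_pickle` are hypothetical. Note that the probe uses the standard `pickle` module on `Optional[str]`, whereas the traceback goes through `dill` inside `datasets`, so the probe is only a rough indicator and may not reproduce the exact failure.

```python
import pickle
import sys
from typing import Optional


def is_affected(version_info):
    """True for interpreters older than 3.7.0, the range reported above."""
    return tuple(version_info[:3]) < (3, 7, 0)


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    print("In the reported range:", is_affected(sys.version_info))
    # Assumption: plain pickle is only a stand-in for the dill-based hashing.
    print("Optional[str] picklable:", can_pickle(Optional[str]))
```

If the first check reports that the interpreter is in the affected range, upgrading to Python 3.7 or later is the option the comment above points to.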

sgugger commented, Nov 6, 2020 (2 reactions)

For future reference, here is how I create an env reproducing the bug, and the command that shows it (self-contained to the repo):

pyenv install 3.6.7
pyenv virtualenv 3.6.7 picklebug
pyenv activate picklebug
pip install --upgrade pip
pip install transformers[torch]
pip install datasets
cd git/transformers # Adapt to your local path to the cloned repo
pip install -e .
python examples/language-modeling/run_mlm.py \
--model_name_or_path roberta-base \
--train_file ./tests/fixtures/sample_text.txt \
--validation_file ./tests/fixtures/sample_text.txt \
--do_train \
--do_eval \
--output_dir /tmp/test=clm \
--line_by_line