
Import error when fine-tuning mbart from master branch

See original GitHub issue

Environment info

  • transformers version: 3.3.1
  • Platform: Linux-4.19.112+-x86_64-with-Ubuntu-18.04-bionic
  • Python version: 3.6.9
  • PyTorch version (GPU?): 1.6.0+cu101 (True)
  • Tensorflow version (GPU?): 2.3.0 (True)
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

Who can help

@mfuntowicz @sshleifer

Information

Model I am using: mBART

The problem arises when using:

  • the official example scripts
  • my own modified scripts: my script fine-tunes mBART for multilingual translation; the problem arises when importing transformers from the master branch (see the sketch after this list)
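
For context, here is a hypothetical minimal sketch of the kind of script involved (not the reporter's actual code): it loads an mBART checkpoint for Hindi-English translation with the generic transformers Auto classes and omits the IITB data handling and training loop.

    # Hypothetical sketch, not the reporter's script: load mBART for
    # Hindi-English translation via the generic Auto* classes.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    model_name = "facebook/mbart-large-cc25"  # public multilingual BART checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # The ImportError reported below is raised by the import line itself,
    # before any fine-tuning code runs.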

The task I am working on is:

  • an official GLUE/SQuAD task
  • my own task or dataset: the IITB Hindi-English dataset

To reproduce

Steps to reproduce the behavior:

  1. Import transformers from the master branch. The import fails with the following traceback:
 from transformers import (
  File "/content/drive/My Drive/hin-eng/transformers/__init__.py", line 68, in <module>
    from .data import (
  File "/content/drive/My Drive/hin-eng/transformers/data/__init__.py", line 6, in <module>
    from .processors import (
  File "/content/drive/My Drive/hin-eng/transformers/data/processors/__init__.py", line 6, in <module>
    from .squad import SquadExample, SquadFeatures, SquadV1Processor, SquadV2Processor, squad_convert_examples_to_features
  File "/content/drive/My Drive/hin-eng/transformers/data/processors/squad.py", line 10, in <module>
    from ...tokenization_bart import BartTokenizer
  File "/content/drive/My Drive/hin-eng/transformers/tokenization_bart.py", line 18, in <module>
    from .tokenization_roberta import RobertaTokenizer, RobertaTokenizerFast
  File "/content/drive/My Drive/hin-eng/transformers/tokenization_roberta.py", line 20, in <module>
    from .tokenization_gpt2 import GPT2Tokenizer, GPT2TokenizerFast
  File "/content/drive/My Drive/hin-eng/transformers/tokenization_gpt2.py", line 27, in <module>
    from .tokenization_utils_fast import PreTrainedTokenizerFast
  File "/content/drive/My Drive/hin-eng/transformers/tokenization_utils_fast.py", line 29, in <module>
    from .convert_slow_tokenizer import convert_slow_tokenizer
  File "/content/drive/My Drive/hin-eng/transformers/convert_slow_tokenizer.py", line 25, in <module>
    from tokenizers.models import BPE, Unigram, WordPiece
ImportError: cannot import name 'Unigram'
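
The chain of imports fails inside convert_slow_tokenizer.py, which points at the installed tokenizers package rather than at transformers itself: the Unigram model class only exists in newer tokenizers releases. A minimal diagnostic, assuming tokenizers is importable in the same environment:

    # Diagnostic sketch (the old-tokenizers diagnosis is confirmed by the
    # maintainer reply further down): reproduce the failing import directly.
    import tokenizers

    print(tokenizers.__version__)
    # The same import that fails inside transformers/convert_slow_tokenizer.py:
    from tokenizers.models import BPE, Unigram, WordPiece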

Expected behavior

transformers should import without errors.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

1 reaction
sshleifer commented on Oct 15, 2020

You need to upgrade tokenizers. It should happen for you if you run pip install -e ".[dev]" from the root of the repo.
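
As a rough sketch of how to verify the fix afterwards (the exact tokenizers version pinned by master depends on the commit, so treat 0.9 as an approximation):

    # After reinstalling, e.g. `pip install -e ".[dev]"` from the repo root,
    # or upgrading the tokenizers package directly, the failing import
    # should succeed.
    import tokenizers
    from tokenizers.models import Unigram  # no ImportError once tokenizers is recent enough

    print("tokenizers version:", tokenizers.__version__)  # roughly 0.9+ is expected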

0 reactions
fulsi commented on Dec 4, 2020

Hello, which directory are you talking about? The transformers site-packages directory, or the project directory?

Read more comments on GitHub >

