KeyError: 'bigbird_pegasus'

Environment info

  • transformers version: 4.6.0.dev0
  • Platform: Linux-5.10.25-linuxkit-x86_64-with-debian-10.1
  • Python version: 3.7.4
  • PyTorch version (GPU?): 1.8.1+cu102 (False)
  • Tensorflow version (GPU?): 2.4.1 (False)
  • Using GPU in script?: (False)
  • Using distributed or parallel set-up in script?: none

Who can help

@patrickvonplaten

Information

Model I am using (Bert, XLNet …): google/bigbird-pegasus-large-arxiv

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The task I am working on is:

  • an official GLUE/SQUaD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

import os
import torch
from datasets import load_dataset
from transformers import pipeline
from transformers import AutoTokenizer, AutoModel

dataset = load_dataset("patrickvonplaten/scientific_papers_dummy", "arxiv",
    cache_dir=os.getenv("cache_dir", "../../models"))
paper = dataset["validation"]["article"][1]

tokenizer = AutoTokenizer.from_pretrained(
    'google/bigbird-pegasus-large-arxiv',
    cache_dir=os.getenv("cache_dir", "../../models"))
model = AutoModel.from_pretrained(
    'google/bigbird-pegasus-large-arxiv',
    cache_dir=os.getenv("cache_dir", "../../models"))

summarizer = pipeline(
    'summarization',
    model=model,
    tokenizer=tokenizer)

Steps to reproduce the behavior:

  1. Run the provided script
  2. output:
2021-05-10 17:11:53.523744: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-05-10 17:11:53.523858: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Reusing dataset scientific_papers (models/scientific_papers/arxiv/1.1.1/051d70b9811c81480cbf2a238b499f7713ba4e19acdaeeb92320007d68b6d098)
Traceback (most recent call last):
  File "src/bigbird/run.py", line 17, in <module>
    cache_dir=os.getenv("cache_dir", "../../models"))
  File "/usr/local/lib/python3.7/site-packages/transformers/models/auto/tokenization_auto.py", line 398, in from_pretrained
    config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/transformers/models/auto/configuration_auto.py", line 421, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
KeyError: 'bigbird_pegasus'

I have also tried this import

from transformers import BigBirdPegasusForConditionalGeneration, BigBirdPegasusTokenizer

as described in the docs here, but in this case I get another error:

    from transformers import BigBirdPegasusForConditionalGeneration, BigBirdPegasusTokenizer
ImportError: cannot import name 'BigBirdPegasusForConditionalGeneration' from 'transformers' (unknown location)
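
Both errors are consistent with an installed transformers build that does not yet register the bigbird_pegasus model type: the traceback shows the lookup failing in CONFIG_MAPPING. A minimal sketch for checking this before loading a checkpoint follows. The helper name model_type_supported and its optional mapping parameter are hypothetical and only for illustration; the module path and CONFIG_MAPPING are taken from the traceback above.

```python
import importlib
from typing import Mapping, Optional


def model_type_supported(model_type: str,
                         mapping: Optional[Mapping] = None) -> Optional[bool]:
    """Report whether `model_type` is a key of the config mapping.

    If no mapping is passed, try to read CONFIG_MAPPING from the installed
    transformers package (module path as shown in the traceback).
    Returns None when transformers cannot be imported at all.
    """
    if mapping is None:
        try:
            auto_cfg = importlib.import_module(
                "transformers.models.auto.configuration_auto")
        except ImportError:
            return None
        mapping = auto_cfg.CONFIG_MAPPING
    return model_type in mapping


if __name__ == "__main__":
    # False here means AutoConfig would raise KeyError: 'bigbird_pegasus'
    print(model_type_supported("bigbird_pegasus"))
```

If the check prints False, upgrading transformers to a build that includes BigBirdPegasus is the likely next step.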

Expected behavior

no error

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

1 reaction
JiXiang-Zhou commented, May 7, 2022

Hey @loretoparisi,

It’s working perfectly for me when running this:

pip3 uninstall transformers
pip3 install git+https://github.com/huggingface/transformers@master

from transformers import BigBirdPegasusForConditionalGeneration, AutoTokenizer, AutoModelForSeq2SeqLM
model = AutoModelForSeq2SeqLM.from_pretrained("google/bigbird-pegasus-large-arxiv")
# or
model = BigBirdPegasusForConditionalGeneration.from_pretrained("google/bigbird-pegasus-large-arxiv")

tokenizer = AutoTokenizer.from_pretrained("google/bigbird-pegasus-large-arxiv")

BigBirdPegasus does not have a BigBirdPegasusTokenizer class, so use AutoTokenizer instead.

I have the same problem. I tried the commands pip3 uninstall transformers and pip3 install git+https://github.com/huggingface/transformers@master, and then got:

WARNING: Did not find branch or tag 'master', assuming revision or ref.
Running command git checkout -q master
error: pathspec 'master' did not match any file(s) known to git
error: subprocess-exited-with-error

× git checkout -q master did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git checkout -q master did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

After running these commands, the problem is still there.

1 reaction
patrickvonplaten commented, May 11, 2021

@loretoparisi, you are using a BigBird Roberta model as the model and BigBird Pegasus as the tokenizer; those are two different checkpoints.

Also, it would be very nice if you could use the forum for “How to do …” questions as we try to keep the github issues for actual issues with the models. Thank you 😃
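
One way to avoid a model/tokenizer mismatch is to load both from a single checkpoint name. The sketch below assumes a transformers release that includes BigBirdPegasus and the summarization pipeline; the constant CHECKPOINT and the helper build_summarizer are hypothetical names used for illustration. It loads the model with AutoModelForSeq2SeqLM, as in the comment above, rather than the bare AutoModel from the original script.

```python
CHECKPOINT = "google/bigbird-pegasus-large-arxiv"


def build_summarizer(checkpoint: str = CHECKPOINT):
    """Build a summarization pipeline whose model and tokenizer share one checkpoint."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # A sequence-to-sequence head is loaded here, not the bare AutoModel.
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    return pipeline("summarization", model=model, tokenizer=tokenizer)
```

Calling build_summarizer() downloads the checkpoint on first use, so it is left uncalled here.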
