KeyError: 'bigbird_pegasus'

Environment info

transformers version: 4.6.0.dev0
Platform: Linux-5.10.25-linuxkit-x86_64-with-debian-10.1
Python version: 3.7.4
PyTorch version (GPU?): 1.8.1+cu102 (False)
Tensorflow version (GPU?): 2.4.1 (False)
Using GPU in script?: (False)
Using distributed or parallel set-up in script?: none

Who can help

@patrickvonplaten

Information

Model I am using (Bert, XLNet …): google/bigbird-pegasus-large-arxiv

The problem arises when using:

the official example scripts: (give details below)
my own modified scripts: (give details below)

The task I am working on is:

an official GLUE/SQUaD task: (give the name)
my own task or dataset: (give details below)

To reproduce

import os
import torch
from datasets import load_dataset
from transformers import pipeline
from transformers import AutoTokenizer, AutoModel

dataset = load_dataset("patrickvonplaten/scientific_papers_dummy", "arxiv",
    cache_dir=os.getenv("cache_dir", "../../models"))
paper = dataset["validation"]["article"][1]

tokenizer = AutoTokenizer.from_pretrained(
    'google/bigbird-pegasus-large-arxiv',
    cache_dir=os.getenv("cache_dir", "../../models"))
model = AutoModel.from_pretrained(
    'google/bigbird-pegasus-large-arxiv',
    cache_dir=os.getenv("cache_dir", "../../models"))

summarizer = pipeline(
    'summarization',
    model=model,
    tokenizer=tokenizer)

Steps to reproduce the behavior:

Run the provided script
output:

2021-05-10 17:11:53.523744: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-05-10 17:11:53.523858: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Reusing dataset scientific_papers (models/scientific_papers/arxiv/1.1.1/051d70b9811c81480cbf2a238b499f7713ba4e19acdaeeb92320007d68b6d098)
Traceback (most recent call last):
  File "src/bigbird/run.py", line 17, in <module>
    cache_dir=os.getenv("cache_dir", "../../models"))
  File "/usr/local/lib/python3.7/site-packages/transformers/models/auto/tokenization_auto.py", line 398, in from_pretrained
    config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/transformers/models/auto/configuration_auto.py", line 421, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
KeyError: 'bigbird_pegasus'
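
The last frame of the traceback is a plain dictionary lookup: the `model_type` string read from the checkpoint's config is used as a key into the library's registry of known architectures. A minimal sketch of that failure mode follows. `KNOWN_MODEL_TYPES` and `resolve_config_class` are hypothetical stand-ins, not the library's actual code; the point is only that a checkpoint newer than the installed build can name a type the registry does not contain.

```python
# Hypothetical, tiny stand-in for the registry consulted in the traceback.
# The real mapping in transformers is much larger and maps to config classes.
KNOWN_MODEL_TYPES = {
    "bert": "BertConfig",
    "pegasus": "PegasusConfig",
}


def resolve_config_class(config_dict, mapping=KNOWN_MODEL_TYPES):
    """Look up the config class name for a checkpoint's declared model type."""
    model_type = config_dict["model_type"]
    try:
        return mapping[model_type]
    except KeyError:
        # Re-raise with a hint instead of the bare KeyError seen above.
        raise ValueError(
            f"Model type {model_type!r} is not registered in this install; "
            "the installed library version may predate the model."
        ) from None


print(resolve_config_class({"model_type": "pegasus"}))
try:
    resolve_config_class({"model_type": "bigbird_pegasus"})
except ValueError as err:
    print(err)
```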

I have also tried this import

from transformers import BigBirdPegasusForConditionalGeneration, BigBirdPegasusTokenizer

as described in the docs here, but in this case I get another error:

    from transformers import BigBirdPegasusForConditionalGeneration, BigBirdPegasusTokenizer
ImportError: cannot import name 'BigBirdPegasusForConditionalGeneration' from 'transformers' (unknown location)
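
Both errors are consistent with an installed build that does not yet contain the BigBirdPegasus classes. One way to check this before loading anything is to probe the installed package. The helper below is a sketch (the name `module_exposes` is made up for illustration) that returns False when the module is missing, fails to import, or lacks the attribute.

```python
import importlib


def module_exposes(module_name, attr_name):
    """Return True if the module imports and the named attribute resolves."""
    try:
        module = importlib.import_module(module_name)
        getattr(module, attr_name)
    except Exception:
        # Any failure (missing module, missing name, broken optional
        # dependency) is treated as "not available" for this probe.
        return False
    return True


if module_exposes("transformers", "BigBirdPegasusForConditionalGeneration"):
    print("BigBirdPegasus classes are available in this install")
else:
    print("BigBirdPegasus classes not found; a newer transformers build may be needed")
```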

Expected behavior

no error

Issue Analytics

Created 2 years ago
Comments: 7 (5 by maintainers)

Top GitHub Comments

JiXiang-Zhou commented, May 7, 2022

Hey @loretoparisi,

It’s working perfectly for me when running this:

pip3 uninstall transformers
pip3 install git+https://github.com/huggingface/transformers@master

from transformers import BigBirdPegasusForConditionalGeneration, AutoTokenizer, AutoModelForSeq2SeqLM
model = AutoModelForSeq2SeqLM.from_pretrained("google/bigbird-pegasus-large-arxiv")
# or
model = BigBirdPegasusForConditionalGeneration.from_pretrained("google/bigbird-pegasus-large-arxiv")

tokenizer = AutoTokenizer.from_pretrained("google/bigbird-pegasus-large-arxiv")

BigBird Pegasus does not have a BigBirdPegasusTokenizer class, so use AutoTokenizer instead.
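
Putting these pieces together with the summarization pipeline from the original script might look like the sketch below. It assumes a transformers build that already includes BigBirdPegasus. It loads the model with `AutoModelForSeq2SeqLM` rather than the `AutoModel` used in the original script, because summarization needs a model that can generate text. `build_summarizer` is a hypothetical helper name.

```python
MODEL_NAME = "google/bigbird-pegasus-large-arxiv"


def build_summarizer(model_name=MODEL_NAME):
    """Build a summarization pipeline from a single checkpoint name."""
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

    # The same checkpoint name is used for both, so tokenizer and model match.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    return pipeline("summarization", model=model, tokenizer=tokenizer)


if __name__ == "__main__":
    # Downloads the checkpoint on first use.
    summarizer = build_summarizer()
    text = "Replace this string with the article text to summarize."
    print(summarizer(text))
```

Passing one name to both loaders also avoids pairing a tokenizer and a model from different checkpoints.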

I have the same problem. I tried the commands above (pip3 uninstall transformers, then pip3 install git+https://github.com/huggingface/transformers@master) and got:

WARNING: Did not find branch or tag 'master', assuming revision or ref.
Running command git checkout -q master
error: pathspec 'master' did not match any file(s) known to git
error: subprocess-exited-with-error

× git checkout -q master did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git checkout -q master did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

After running the code, the problem is still there
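
The pip output says the ref `master` does not exist in the repository, which suggests the default branch now has a different name. The sketch below builds the install command with the ref as a parameter. Using `main` as the default is an assumption not stated in this thread, and `pip_install_command` is a hypothetical helper. Installing the latest release from PyPI with `pip3 install --upgrade transformers` is another option, assuming the needed classes have shipped in a release.

```python
import sys

REPO_URL = "https://github.com/huggingface/transformers"


def pip_install_command(ref="main", repo_url=REPO_URL):
    """Return the pip command (as an argument list) that installs a git ref."""
    # Assumption: "main" is the current default branch; pass another ref if not.
    requirement = f"git+{repo_url}@{ref}"
    return [sys.executable, "-m", "pip", "install", "--upgrade", requirement]


print(" ".join(pip_install_command()))
# To actually run it: subprocess.run(pip_install_command(), check=True)
```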

patrickvonplaten commented, May 11, 2021

@loretoparisi, you are using a BigBird RoBERTa model as the model and BigBird Pegasus as the tokenizer -> those are two different checkpoints.

Also, it would be very nice if you could use the forum for “How to do …” questions as we try to keep the github issues for actual issues with the models. Thank you 😃