Attributes explicitly defined in model configurations are now overridden by the defaults.
Environment info
- `transformers` version: 4.11.0.dev0
- Platform: Linux-5.14.11-arch1-1-x86_64-with-glibc2.33
- Python version: 3.9.7
- PyTorch version (GPU?): 1.9.1+cu102 (True)
- Tensorflow version (GPU?): 2.6.0 (False)
- Flax version (CPU?/GPU?/TPU?): 0.3.5 (cpu)
- Jax version: 0.2.21
- JaxLib version: 0.1.71
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
The issue
The issue became visible with the introduction of parameter setters in https://github.com/huggingface/transformers/pull/13026.
That PR moved the initialization of the parent object to be the last statement of configuration creation. While this could be benign, it isn't, because some arguments are defined both in the model configuration and in the upstream configuration.
Such an example is the FSMT configuration: it defines the generate arguments (such as `num_beams`) in its own `__init__`.
At the end of the method, it initializes the parent configuration without passing that parameter along.
Finally, in the parent configuration, `num_beams` is set once again.
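A minimal, self-contained stand-in for this pattern (toy classes for illustration only, not the actual transformers source):

```python
class ToyPretrainedConfig:
    """Stand-in for PretrainedConfig: pops generate arguments with defaults."""

    def __init__(self, **kwargs):
        # ... many other attributes with defaults ...
        self.num_beams = kwargs.pop("num_beams", 1)


class ToyFSMTConfig(ToyPretrainedConfig):
    """Stand-in for FSMTConfig: ships its own default for num_beams."""

    def __init__(self, num_beams=5, **common_kwargs):
        # the generate argument is assigned directly on the config ...
        self.num_beams = num_beams
        # ... and, since #13026, the parent is initialized last,
        # without num_beams being forwarded
        super().__init__(**common_kwargs)


print(ToyFSMTConfig().num_beams)  # prints 1, not the intended 5
```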
This is an issue because it overrides the previously set `num_beams`, resetting it to the default of 1. The problem wasn't caught before because the superclass initialization used to happen at the beginning and was then overridden by the parameters set afterwards; this is no longer the case.
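With the pre-#13026 ordering, the same toy classes behave as intended, which is why the redundant assignment went unnoticed:

```python
class ToyFSMTConfigOldOrder(ToyPretrainedConfig):
    """Same toy subclass, but calling super().__init__ first, as before #13026."""

    def __init__(self, num_beams=5, **common_kwargs):
        super().__init__(**common_kwargs)  # sets the default num_beams = 1 ...
        self.num_beams = num_beams         # ... which is then overwritten with 5


print(ToyFSMTConfigOldOrder().num_beams)  # prints 5
```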
This makes the following test fail: `tests/test_modeling_fsmt.py -k test_translation_direct_2_en_de`.
IMO the issue comes from the redefinition of arguments in the FSMT configuration, which should not be done since the superclass already defines these arguments correctly from the kwargs. The simplest patch (apart from making sure that each parameter is only set once) would be to have the superclass take all previously set attributes into account, by adding the following statement to the `__init__` of the `PretrainedConfig` superclass:
```diff
     [...]
     def __init__(self, **kwargs):
+        kwargs = {**kwargs, **self.__dict__}
         # Attributes with defaults
         self.return_dict = kwargs.pop("return_dict", True)
     [...]
```
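With that merge, attributes the subclass has already set on `self` take precedence over the incoming kwargs (and therefore over the `kwargs.pop(..., default)` fallbacks), because later entries win when dicts are merged. A quick standalone illustration of the precedence (the variable names are only for the example):

```python
incoming_kwargs = {"length_penalty": 1.1}  # kwargs actually forwarded to super().__init__
already_set = {"num_beams": 5}             # stand-in for self.__dict__ after the subclass ran

kwargs = {**incoming_kwargs, **already_set}
print(kwargs.pop("num_beams", 1))         # 5 -> the value the subclass set is kept
print(kwargs.pop("length_penalty", 1.0))  # 1.1 -> regular kwargs still behave as before
```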
WDYT? cc @sgugger @stas00 @nreimers @patrickvonplaten
The cleanest solution, however, would be to make sure that all parameters are only set once, which is slightly harder to test.
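In terms of the toy classes above, setting a parameter only once would mean forwarding it to the superclass instead of assigning it locally (an illustrative sketch, not an actual FSMT patch):

```python
class ToyFSMTConfigSetOnce(ToyPretrainedConfig):
    """The attribute is only ever written by the superclass."""

    def __init__(self, num_beams=5, **common_kwargs):
        # forward the generate argument instead of setting self.num_beams here
        super().__init__(num_beams=num_beams, **common_kwargs)


print(ToyFSMTConfigSetOnce().num_beams)  # prints 5
```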
Reproducible code sample:
```python
from transformers import AutoTokenizer, FSMTForConditionalGeneration

pair = "en-de"
text = {
    "en": "Machine learning is great, isn't it?",
    "ru": "Машинное обучение - это здорово, не так ли?",
    "de": "Maschinelles Lernen ist großartig, oder?",
}

src, tgt = pair.split("-")
print(f"Testing {src} -> {tgt}")

mname = f"facebook/wmt19-{pair}"
src_text = text[src]
tgt_text = text[tgt]

tokenizer = AutoTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)
print(model.config)

input_ids = tokenizer.encode(src_text, return_tensors="pt")
outputs = model.generate(input_ids)
decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
assert decoded == tgt_text, f"\n\ngot: {decoded}\nexp: {tgt_text}\n"
```
Comments: 7 (7 by maintainers)
I highly doubt the bug is only in FSMT, @stas00. The fact that #13026 moved all the super calls to the end of the configuration init has probably created multiple instances of it. It's just that FSMT had good tests that showed us the bug 😃
I like this proposition too, @nreimers!