
Attributes explicitly defined in model configurations are now overridden by the default value.


Environment info

  • transformers version: 4.11.0.dev0
  • Platform: Linux-5.14.11-arch1-1-x86_64-with-glibc2.33
  • Python version: 3.9.7
  • PyTorch version (GPU?): 1.9.1+cu102 (True)
  • Tensorflow version (GPU?): 2.6.0 (False)
  • Flax version (CPU?/GPU?/TPU?): 0.3.5 (cpu)
  • Jax version: 0.2.21
  • JaxLib version: 0.1.71
  • Using GPU in script?: <fill in>
  • Using distributed or parallel set-up in script?: <fill in>

The issue

The issue became visible with the introduction of parameter setters in https://github.com/huggingface/transformers/pull/13026.

That PR moved the initialization of the parent object to be the last statement of the configuration's creation. While this would usually be benign, it isn't here, because some arguments are defined both in the model configuration and in the parent configuration.

One such example is the FSMT configuration, which defines the generation arguments here:

https://github.com/huggingface/transformers/blob/408b2d2bd08f667cf4154730cc323c4e49657eed/src/transformers/models/fsmt/configuration_fsmt.py#L183-L185
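For readers without the link at hand, those lines assign the generation defaults directly on the instance; at that commit they look roughly like this:

self.num_beams = num_beams
self.length_penalty = length_penalty
self.early_stopping = early_stopping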

At the end of the method, it initializes the parent configuration without forwarding those parameters:

https://github.com/huggingface/transformers/blob/408b2d2bd08f667cf4154730cc323c4e49657eed/src/transformers/models/fsmt/configuration_fsmt.py#L199-L208
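The call has roughly the following shape (argument list abridged here); the key point is that the generation parameters set above, such as num_beams, are not among the forwarded arguments:

super().__init__(
    is_encoder_decoder=is_encoder_decoder,
    # ... token ids and other structural arguments ...
    **common_kwargs,
)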

Finally, in the parent configuration, num_beams is set once again:

https://github.com/huggingface/transformers/blob/408b2d2bd08f667cf4154730cc323c4e49657eed/src/transformers/configuration_utils.py#L264
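That line is the usual pop-with-default in PretrainedConfig.__init__:

self.num_beams = kwargs.pop("num_beams", 1)

Since num_beams was not forwarded by the subclass, the pop falls back to 1 and overwrites the attribute set earlier.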

This is now a problem, as the parent overrides the previously set num_beams with the default of 1. The issue wasn't caught before because the superclass initialization used to happen at the beginning, so its defaults were subsequently overridden by the explicit assignments. That is no longer the case.
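Below is a minimal standalone sketch (hypothetical classes, not the actual transformers code) of how the changed ordering flips which value wins:

class Parent:
    def __init__(self, **kwargs):
        # Like PretrainedConfig: apply a default for any argument not received.
        self.num_beams = kwargs.pop("num_beams", 1)


class ChildSuperFirst(Parent):
    def __init__(self, num_beams=5, **kwargs):
        super().__init__(**kwargs)  # old order: default applied first ...
        self.num_beams = num_beams  # ... then overridden explicitly: fine


class ChildSuperLast(Parent):
    def __init__(self, num_beams=5, **kwargs):
        self.num_beams = num_beams  # new order: explicit value set first ...
        super().__init__(**kwargs)  # ... then clobbered by the default of 1


print(ChildSuperFirst().num_beams)  # 5
print(ChildSuperLast().num_beams)   # 1 -- the regression described above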

This makes the following test fail: tests/test_modeling_fsmt.py -k test_translation_direct_2_en_de.

IMO the root cause is the redefinition of these arguments in the FSMT configuration, which should not be done: the superclass already defines them correctly from the kwargs. The simplest patch (apart from making sure that each parameter is only set once) would be to have the superclass take all previously set attributes into account, by adding the following statement to the initialization of the PretrainedConfig superclass:

[...]

    def __init__(self, **kwargs):
+        kwargs = {**kwargs, **self.__dict__}

        # Attributes with defaults
        self.return_dict = kwargs.pop("return_dict", True)
[...]
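Applied to a toy hierarchy like the one sketched above, the merge makes the parent's pop() pick up any attribute the subclass already set, instead of falling back to the default:

class PatchedParent:
    def __init__(self, **kwargs):
        # Attributes already set on the instance take precedence over
        # the defaults used by the pop() calls below.
        kwargs = {**kwargs, **self.__dict__}
        self.num_beams = kwargs.pop("num_beams", 1)


class ChildSuperLast(PatchedParent):
    def __init__(self, num_beams=5, **kwargs):
        self.num_beams = num_beams
        super().__init__(**kwargs)


print(ChildSuperLast().num_beams)  # 5 -- the explicit value survives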

WDYT? cc @sgugger @stas00 @nreimers @patrickvonplaten

However, the cleanest solution would be to make sure that each parameter is only set once, which is slightly harder to test.
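For illustration, a hypothetical, heavily simplified sketch of that approach (the real FSMTConfig takes many more arguments): the subclass forwards the generation parameters instead of assigning them itself, so each one is set exactly once, by the parent.

from transformers import PretrainedConfig


class FSMTConfigSketch(PretrainedConfig):
    # Hypothetical stand-in for FSMTConfig.__init__.
    def __init__(self, num_beams=5, **kwargs):
        # Model-specific attributes would be set here; generation parameters
        # are forwarded so that the parent sets them exactly once.
        super().__init__(num_beams=num_beams, **kwargs)


print(FSMTConfigSketch().num_beams)  # 5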

Reproducible code sample:

from transformers import AutoTokenizer, FSMTForConditionalGeneration

pair = "en-de"


text = {
    "en": "Machine learning is great, isn't it?",
    "ru": "Машинное обучение - это здорово, не так ли?",
    "de": "Maschinelles Lernen ist großartig, oder?",
}

src, tgt = pair.split("-")
print(f"Testing {src} -> {tgt}")
mname = f"facebook/wmt19-{pair}"

src_text = text[src]
tgt_text = text[tgt]

tokenizer = AutoTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)
print(model.config)
input_ids = tokenizer.encode(src_text, return_tensors="pt")

# generate() reads num_beams from the config; with the regression the
# explicitly configured value has been replaced by the default of 1,
# so the decoded translation no longer matches the expected one.
outputs = model.generate(input_ids)
decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
assert decoded == tgt_text, f"\n\ngot: {decoded}\nexp: {tgt_text}\n"

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

1 reaction
sgugger commented, Oct 13, 2021

I highly doubt the bug is only in FSMT, @stas00. The fact that #13026 moved all the super() calls to the end of the configuration init has probably created multiple instances of it. It's just that FSMT had good tests that showed us the bug 😃

0 reactions
LysandreJik commented, Oct 14, 2021

I like this proposition too @nreimers!

