Generate: deprecate the use of model `config` as a source of defaults
EDIT: Updated with the discussion up to 2022/08/20
Why?
A confusing part of `generate` is how the defaults are set. When a certain argument is not specified, we attempt to fetch it from the model config file. This makes `generate` unpredictable and hard to fully document (the default values change for each model), as well as a major source of issues 🔪
How?
We have the following requirements:
1️⃣ The existing behavior can’t be removed, i.e., we must be able to use the model `config.json` as a source of generation parameters by default;
2️⃣ We do need per-model defaults – some models are designed to do a certain thing (e.g. summarization), which requires a specific generation configuration.
3️⃣ Users must have full control over generate, with minimal hidden behavior.
Ideally, we also want to:
4️⃣ Have separation of concerns and use a new `generate_config.json` to parameterize generation;
A TL;DR of the plan consists in changing the paradigm from “non-specified `generate` arguments are overridden by the [model] configuration file” to “`generate` arguments will override the [generate] configuration file, which is always used”. With proper documentation changes and logging/warnings, the user will be aware of what’s being set for `generate`.
Step 1: Define a new generate config file and class
Similar to the model config, we want a `.json` file to store the generation defaults. The class itself can be a very simplified version of `PretrainedConfig`, also with functionality to load/store from the hub.
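A minimal sketch of what such a class could look like, assuming a plain dataclass and local files only. The name `GenerateConfig`, its fields, and the `save`/`load` methods are hypothetical illustrations, not an existing API, and Hub upload/download is left out:

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class GenerateConfig:
    """Hypothetical, heavily simplified stand-in for a generation config class."""

    # Illustrative fields only; the real set of generation parameters is larger.
    max_length: Optional[int] = None
    num_beams: Optional[int] = None
    do_sample: Optional[bool] = None
    top_k: Optional[int] = None
    temperature: Optional[float] = None

    def save(self, directory, filename="generate_config.json"):
        """Write only the explicitly set values to `directory/filename`."""
        path = Path(directory) / filename
        set_values = {k: v for k, v in asdict(self).items() if v is not None}
        path.write_text(json.dumps(set_values, indent=2, sort_keys=True))
        return path

    @classmethod
    def load(cls, directory, filename="generate_config.json"):
        """Read a config back from `directory/filename`."""
        data = json.loads((Path(directory) / filename).read_text())
        return cls(**data)


# Round trip through a temporary directory (stands in for a local model folder).
with tempfile.TemporaryDirectory() as tmp:
    GenerateConfig(max_length=20, num_beams=4).save(tmp)
    restored = GenerateConfig.load(tmp)
```

Storing only the values that were explicitly set keeps the file small and makes it obvious which defaults a model author actually chose.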
Step 2: Integrate loading generate config file in .from_pretrained()
The generation configuration file should be loaded when initializing the model with a `from_pretrained()` method. A couple of things to keep in mind:
- There will be a new kwarg in `from_pretrained`, `generate_config` (or `generation_config`? Leaning toward the former as it has the same name as the function);
- It will default to `generate_config.json` (contrary to the model `config`, which defaults to `None`). This will allow users to set this argument to `None`, to load a model with an empty generate config. Some users have requested a feature like this;
- Because the argument can take a path, it means that users can store/load multiple generate configs if they wish to do so (e.g. to use the same model for summarization, creative generation, factual question-answering, etc) 🚀
- Only models that can run `generate` will attempt to load it;
- If there is no `generate_config.json` in the repo, it will attempt to initialize the generate configuration from the model `config.json`. This means that this solution will not change any `generate` behavior and will NOT need a major release 👼
- To keep the user in the loop, log ALL parameters set when loading the generation config file. Something like the snippet below.
- Because this happens at `from_pretrained()` time, logging will only happen at most once and will not be verbose.
`facebook/opt-1.3b` generate configuration loaded from `generate_config.json`. The following generation defaults were set:
- max_length: 20
- foo: bar
- baz: qux
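The loading rules above can be sketched as follows, assuming a local model directory rather than a Hub repo. The helper `load_generate_config` and the `GENERATION_KEYS` subset are hypothetical and exist only for illustration; the log message mirrors the snippet above:

```python
import json
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger(__name__)

# Assumption: an illustrative subset of generation-related keys.
GENERATION_KEYS = ("max_length", "num_beams", "do_sample", "top_k", "temperature")


def load_generate_config(model_dir, generate_config="generate_config.json"):
    """Return (defaults, source) for a local model directory.

    `generate_config=None` yields an empty config; a missing file falls back
    to the generation-related keys found in the model's `config.json`.
    """
    model_dir = Path(model_dir)
    if generate_config is None:
        return {}, None

    path = model_dir / generate_config
    if path.is_file():
        defaults, source = json.loads(path.read_text()), generate_config
    else:
        model_config = json.loads((model_dir / "config.json").read_text())
        defaults = {k: model_config[k] for k in GENERATION_KEYS if k in model_config}
        source = "config.json"

    # Log every default once, at load time.
    lines = [f"- {key}: {value}" for key, value in defaults.items()]
    logger.info(
        "Generate configuration loaded from `%s`. "
        "The following generation defaults were set:\n%s",
        source,
        "\n".join(lines),
    )
    return defaults, source


with tempfile.TemporaryDirectory() as tmp:
    # A repository that only has the legacy model config.
    Path(tmp, "config.json").write_text(json.dumps({"hidden_size": 8, "max_length": 20}))
    legacy_defaults, legacy_source = load_generate_config(tmp)

    # The same repository after a dedicated generate config is added.
    Path(tmp, "generate_config.json").write_text(json.dumps({"max_length": 64}))
    new_defaults, new_source = load_generate_config(tmp)

    # Explicitly opting out of any stored defaults.
    empty_defaults, empty_source = load_generate_config(tmp, generate_config=None)
```

Because the fallback reads the same values the model config already provides, a repository without the new file would behave as it does today.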
Step 3: Generate uses the generate config class internally
Instead of using the configuration to override arguments when they are not set, overwrite a copy of the generation config at `generate` time. I.e. instead of:
arg = arg if arg is not None else self.config.arg
...
do
generate_config = self.generate_config.copy()
if arg is not None:
    generate_config.arg = arg
...
This change has three main benefits:
- We can improve the readability of the code, as we gain the ability to pass configs around. E.g. this function won’t need to take a large list of arguments nor to bother with their initialization.
- `generate` argument validation for each type of generation can be built in simple functions that don’t need ~30 arguments as input 🙃
- The three frameworks (PT/TF/FLAX) can share functionality like argument validation, decreasing maintenance burden.
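As a rough illustration of the copy-and-override pattern and of validation that takes a single config object, here is a sketch under stated assumptions: `GenerateConfig`, `validate`, and `ToyModel` are hypothetical names, and `generate` only resolves its arguments instead of decoding anything.

```python
import copy
from dataclasses import dataclass


@dataclass
class GenerateConfig:
    """Hypothetical config holding a few illustrative generation defaults."""

    max_length: int = 20
    do_sample: bool = False
    top_k: int = 50
    temperature: float = 1.0


def validate(config):
    """Example check that takes one config object instead of many arguments."""
    if config.top_k <= 0:
        raise ValueError("`top_k` must be a positive integer")
    if config.temperature <= 0:
        raise ValueError("`temperature` must be strictly positive")


class ToyModel:
    """Stand-in for a model class; `generate` only resolves its arguments."""

    def __init__(self, generate_config):
        self.generate_config = generate_config

    def generate(self, **kwargs):
        # Work on a copy so per-call overrides never leak into later calls.
        config = copy.deepcopy(self.generate_config)
        for name, value in kwargs.items():
            if not hasattr(config, name):
                raise TypeError(f"Unknown generation argument: {name}")
            if value is not None:
                setattr(config, name, value)
        validate(config)
        return config  # a real implementation would run decoding here


model = ToyModel(GenerateConfig())
first = model.generate(top_k=5)
second = model.generate(temperature=0.7)
```

The second call sees the stored `top_k` again, since the first call only modified its own copy of the config.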
Step 4: Document and open PRs with the generation config file
Rewrite part of the documentation to explain that a generation config is ALWAYS used (regardless of having defaults loaded from the hub or not). Open Hub PRs to pull generate-specific parameters from `config.json` to `generate_config.json`.
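The split that such a PR would perform might look like the sketch below. The helper `split_config` and the `GENERATION_KEYS` subset are hypothetical, and whether the generation keys would also be removed from `config.json` is not settled here, so the function simply returns both parts:

```python
# Assumption: an illustrative subset of generation-related keys.
GENERATION_KEYS = frozenset({"max_length", "num_beams", "do_sample", "top_k", "temperature"})


def split_config(model_config):
    """Split a model config dict into (model-only part, generation part)."""
    generate_part = {k: v for k, v in model_config.items() if k in GENERATION_KEYS}
    model_part = {k: v for k, v in model_config.items() if k not in GENERATION_KEYS}
    return model_part, generate_part


# Example legacy config mixing architecture and generation settings.
legacy = {"hidden_size": 768, "num_beams": 4, "max_length": 142}
model_part, generate_part = split_config(legacy)
```

The generation part is what would be written to `generate_config.json`; the input dict is left unmodified.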
Pros/Cons
Pros:
- Better awareness – any `generate` default will be logged to the screen when loading a generate-compatible model;
- Full control – the users can choose NOT to load generation parameters or easily load a set of options from an arbitrary file;
- Enables more readable `generate` code;
- Enables sharing `generate`-related code across frameworks;
- Doesn’t need a major release.
Cons:
- Pulling the generate parameters into their own files won’t happen everywhere, as merging the changes described in step 4 is not feasible for all models (e.g. due to unresponsive model owners);
- Logging loaded defaults may not be enough to stop issues related to the default values, as the logs can be ignored;
- Another config file (and related code) to maintain.
Top GitHub Comments
The plan looks good to me, but the devil will be in the details 😉 Looking forward to the PRs actioning this!
@patrickvonplaten Agreed, the argument name is a bit too long 😅 However, if we decide to go the `GenerationMixin.__init__` route, we can’t pick `config` – `PreTrainedModel`, which inherits from `GenerationMixin`, uses a `config` argument for the model config. Perhaps `generation_config`? We could then do `.from_pretrained(foo, generation_config=bar)`.
I love the ideas you gave around the config:
- […] `__init__` and if we always attempt to load the new file format before falling back to the original config, it actually means we don’t need to do a major release to build the final version of this updated configuration handling! No need to change defaults with a new release at all ❤️ ;
- […] `model.generate(top_k=top_k)` and then `model.generate(temperature=temperature)`, `top_k` should be the original config’s `top_k`. Copies of objects are needed;
Regarding `dict` vs `class` – I’d go with `class` (or perhaps a simpler `dataclass`). Much easier to document and enforce correctness, e.g. check if the right arguments are being used with a certain generation type.
It seems like we are in agreement. Are there more issues we can anticipate?