Trainer's create_model_card creates an invalid yaml metadata `datasets: - null`

Environment info

any env

Who can help

discussed with @julien-c @sgugger and @LysandreJik

Information

The hub will soon reject pushes with invalid model card metadata.
Validation only applies when datasets, model-index or license are present: their content then needs to follow the specification, cf. https://github.com/huggingface/huggingface_hub/pull/342

To reproduce

Steps to reproduce the behavior:

Train a model
Do not associate any datasets
The trained model and the model card are rejected by the server

Expected behavior

The git push from trainer.py should be successful, even with the coming patch https://github.com/huggingface/transformers/pull/13514

Issue Analytics

Created 2 years ago
Comments: 12 (9 by maintainers)

Top GitHub Comments

sgugger commented, Sep 13, 2021

In the meantime I’ve suggested a fix for the problems for which this issue was created and for the incomplete results I mentioned. Both have their origin in the code of the TrainingSummary, so fixing them is not duplicate code 😃

We can think more about what validation we want to do where. Personally I would see this more on the hf_hub side, in the function that adds metadata (which we will use in the Trainer once it’s merged and in a release of hf_hub).
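As a rough illustration of the producer-side fix discussed here, the sketch below drops `None` entries before the metadata is written and omits the `datasets` key entirely when nothing valid remains. The names `clean_string_list` and `build_metadata` are hypothetical and are not taken from transformers or huggingface_hub; the actual change in TrainingSummary may look different.

```python
from typing import Any, Dict, List, Optional


def clean_string_list(values: Optional[List[Any]]) -> List[str]:
    """Keep only non-empty strings; drop None and any other non-string entry."""
    if not values:
        return []
    return [v for v in values if isinstance(v, str) and v.strip()]


def build_metadata(
    datasets: Optional[List[Any]] = None,
    license: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble model card metadata, omitting empty fields."""
    metadata: Dict[str, Any] = {}
    if license:
        metadata["license"] = license
    cleaned = clean_string_list(datasets)
    if cleaned:
        # Only emit the key when there is at least one valid dataset name,
        # so the card never ends up with `datasets: - null`.
        metadata["datasets"] = cleaned
    return metadata
```

With this approach, a run that has no associated dataset produces metadata without a `datasets` field at all, instead of a list containing a null entry.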

julien-c commented, Sep 13, 2021

Hmm, the datasets: - null issue is not about missing datasets, it’s about invalid data.

In YAML, this is parsed as datasets = [None] (Python syntax), whereas it should be an array of strings.

In my opinion, we will not enforce rejections for missing data any time soon (especially for automatically generated model cards).
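The distinction between missing and invalid data can be sketched as a small check on the parsed metadata. This is only an illustration under stated assumptions: `find_invalid_datasets` is a hypothetical name, and the rule it applies (a list whose items are all strings) is inferred from the comment above, not from the hub's actual validation code.

```python
from typing import Any, Dict, List


def find_invalid_datasets(metadata: Dict[str, Any]) -> List[str]:
    """Return a list of problems with the `datasets` field; empty means OK."""
    if "datasets" not in metadata:
        # A missing field is acceptable: only present-but-invalid data is flagged.
        return []
    value = metadata["datasets"]
    if not isinstance(value, list):
        return [f"datasets must be a list, got {type(value).__name__}"]
    return [
        f"datasets[{i}] must be a string, got {item!r}"
        for i, item in enumerate(value)
        if not isinstance(item, str)
    ]


# `datasets: - null` corresponds to this parsed structure, as noted above.
print(find_invalid_datasets({"datasets": [None]}))
print(find_invalid_datasets({}))
```

The first call reports the null entry, while the second returns an empty list because the field is simply absent.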