How to report errors for container fields?
Generally, a field name is associated with a list of errors as strings. The exception to this is container fields. Those may return errors for the container and the contained field(s). In this case, rather than a list, a dict is used where the keys are the field names or the list index (as string or as integer):
errors = {
    '_schema': [...],
    'field': [...],
    'nested': {
        # What about errors in the Nested field itself?
        '_schema': [...],
        'field': [...],
    },
    'nested_many': {
        # What about errors in the Nested field itself?
        '0': {
            '_schema': [...],
            'field': [...],
        },
    },
    'list': {
        # What about errors in the List field itself?
        '0': [...],
    },
    'list_nested': {
        # What about errors in the List field itself?
        '0': {
            # What about errors in the Nested field itself?
            '_schema': [...],
            'field': [...],
        },
    },
    'dict': {
        # What about errors in the Dict field itself?
        '0': {
            'key': [...],
            'value': [...],
        },
    },
    'dict_nested': {
        # What about errors in the Dict field itself?
        '0': {
            'key': [...],
            'value': {
                # What about errors in the Nested field itself?
                '_schema': [...],
                'field': [...],
            },
        },
    },
}
This has been working relatively fine for a while. It leaves a hole, as you can’t return errors relative to the Nested field itself, for instance, while returning errors for the fields in the nested schema.
It is generally not blocking, as the most frequent Nested field level error is the missing error, and when the field is missing, there is generally no nested error, so we return
nested: ['Missing data for required field.']
It is a shame that the structure (dict or list) depends on the error, but we could live with this.
Except we sometimes need to return errors for both the field and the nested schema, so we came up with this (https://github.com/marshmallow-code/marshmallow/pull/299):
'nested': {
    '_field': [...],
    '_schema': [...],
    'field': [...],
},
where _field stores errors about the field itself.
This is inconsistent, as I said in https://github.com/marshmallow-code/marshmallow/pull/299#issuecomment-401607834 and showed in https://github.com/marshmallow-code/marshmallow/pull/863, because the errors on the nested field itself may be reported in different places depending on whether there are errors in the nested schema. In other words, a dict with a _field key is used only if there are also errors in the nested schema. Otherwise, the errors are returned as a list of strings, like for any field.
The same considerations apply to all container fields (List, Dict).
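To make the inconsistency concrete, here is a minimal sketch of what client code has to do today to read the errors of a container field itself. The helper name `field_level_errors` and the sample messages are hypothetical, not part of marshmallow; only the `_field` key and the two shapes (list or dict) come from the description above.

```python
FIELD_KEY = "_field"  # key used for field-level errors when nested errors exist


def field_level_errors(value):
    """Return the errors attached to a container field itself.

    `value` is whatever is stored under the container's name in the
    error dict: either a plain list of messages, or a dict of nested
    errors that may carry a `_field` entry.
    """
    if isinstance(value, dict):
        # Nested errors are present: field-level errors hide under `_field`.
        return list(value.get(FIELD_KEY, []))
    # No nested errors: the value is directly the list of messages.
    return list(value)


# Shape 1: only a field-level error, reported as a list.
only_field_error = {"nested": ["Missing data for required field."]}

# Shape 2: field-level and nested errors together, reported as a dict.
both_kinds = {
    "nested": {
        "_field": ["Some field-level error."],
        "field": ["Some error."],
    }
}

print(field_level_errors(only_field_error["nested"]))
print(field_level_errors(both_kinds["nested"]))
```

The type check on every access is exactly the burden that the varying structure puts on callers.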
I can’t think of an ideal solution to allow reporting errors from both the container field and its nested schema/fields.
We’d like to keep the use of dict for containers because it makes browsing the error structure convenient:
assert 'Some error.' in errors['nested']['field']
This rules out putting nested errors under a dedicated _nested key.
On the other hand, putting errors on fields themselves in a dedicated _field key for all fields seems a bit overkill as well.
A trade-off would be to use the _field key for field errors for all container fields and only for them, but do it consistently.
I suppose that was the intent of https://github.com/marshmallow-code/marshmallow/pull/299.
errors = {
    '_schema': [...],
    'field': ['Missing data for required field.', ...],
    'nested': {
        '_field': ['Missing data for required field.', 'Other error from schema level validator.'],
    },
}
This introduces a distinction between containers and other fields.
It looks bad in the usual case where fields are missing, as shown above, because containers have their “missing data” message embedded in a dict.
Also, the implementation might be challenging: how do you know a field is a container when registering the error?
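A rough sketch of that trade-off, assuming the caller can somehow tell whether the field is a container. The function `store_error` and its `is_container` flag are hypothetical and do not reflect how marshmallow registers errors; the flag stands in for precisely the piece of knowledge that may be hard to obtain.

```python
def store_error(errors, field_name, messages, is_container):
    """Register field-level `messages` for `field_name` in `errors`.

    Container fields always get a dict with a `_field` list, whether or
    not nested errors exist. Other fields keep a plain list.
    """
    if is_container:
        slot = errors.setdefault(field_name, {})
        slot.setdefault("_field", []).extend(messages)
    else:
        errors.setdefault(field_name, []).extend(messages)
    return errors


errors = {}
store_error(errors, "field", ["Missing data for required field."], is_container=False)
store_error(errors, "nested", ["Missing data for required field."], is_container=True)
store_error(errors, "nested", ["Other error from schema level validator."], is_container=True)
print(errors)
```

With this rule the shape of `errors['nested']` no longer depends on which errors occurred, at the cost of wrapping the common “missing data” message in a dict.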
I’d like to keep it simple, but I’m out of ideas, so I’m appealing to collective wisdom.
I have the feeling that a good decision on this would avoid many corner case issues, so this should be settled before piling up patches.
Issue Analytics
- Created 5 years ago
- Comments:11 (11 by maintainers)
Top GitHub Comments
I’ve been thinking about this and I might have come up with something.
schema and field are more or less marshmallow-specific. They don’t necessarily make sense from client perspective.
Also, I don’t think there needs to be a difference between errors in a nested field (e.g. ‘Missing data for required field’) and errors in a nested field’s schema (‘Invalid data type’). So nested _schema errors can be merged with errors about the Nested field.
My proposal is to keep an error dict structure that follows the hierarchy of the Schema, and for each field with errors, store the errors list under a special _errors key.
This is what I excluded at first, but I see no alternative, and thinking again, it makes things more consistent: every value is a subdict, unless the key is _errors, in which case it is a list of errors, so the dict is easier to parse as the type of each value is well defined.
Here’s what it would look like:
I didn’t assess the implementation difficulty.
Thoughts anyone?
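As an illustration only (this is an assumed shape built from the description in this comment, not the commenter's own example), the `_errors` proposal could be sketched like this. The helpers `add_error` and `iter_errors`, the field names and the messages are all hypothetical.

```python
ERRORS_KEY = "_errors"  # the only key whose value is a list of messages


def add_error(errors, path, message):
    """Store `message` under `_errors` at the node reached by `path`."""
    node = errors
    for key in path:
        # Every intermediate value is a subdict mirroring the schema hierarchy.
        node = node.setdefault(key, {})
    node.setdefault(ERRORS_KEY, []).append(message)


def iter_errors(errors, path=()):
    """Yield (path, message) pairs; no type checks beyond the key name."""
    for key, value in errors.items():
        if key == ERRORS_KEY:
            for message in value:
                yield path, message
        else:
            yield from iter_errors(value, path + (key,))


errors = {}
add_error(errors, ("nested",), "Some error about the nested data as a whole.")
add_error(errors, ("nested", "field"), "Some error.")
add_error(errors, ("list", 0), "Some error on the first item.")

for path, message in iter_errors(errors):
    print(path, message)
```

Browsing stays close to the current style, with one extra level at the leaf: `errors['nested']['field']['_errors']`.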
Well… almost, but not exactly.
Currently, @validates can only be called once per field (see https://github.com/marshmallow-code/marshmallow/issues/653). But this is a limitation in the decoration feature that we might want to overcome one day. And “several schema validators reporting multiple errors on the same field” is possible.
I’ve been spending hours on this and I don’t see any way to fix the “inconsistency” (field returning errors either as list or as dict, same error being sometimes in the list, sometimes in the dict (under the _field or _schema key), depending on other errors).
However, I think #1026 is a great step towards a clearer interface and it ensures we’re not losing error messages in the process.
I believe this issue can be postponed to after 3.0. Most probably 4.0 as changing this would be a breaking change.