Wrong message on type validation errors
Before going too deep
I’m using marshmallow==2.19.5 and I noticed that there’s a release candidate for marshmallow 3 so if this issue has somehow been solved, please ignore the rest. However, the only issue I noticed which seemed like it might have some relevance is this: https://github.com/marshmallow-code/marshmallow/issues/852, but the PR for this doesn’t seem to address the problem that I’m having. (PR: https://github.com/marshmallow-code/marshmallow/pull/1067/files).
Reproducing the issue
Let’s consider the following code with 3 schema definitions:
from marshmallow import Schema, fields


class TwoLevelsNestedResource(Schema):
    some_irrelevant_field = fields.String(required=True)


class OneLevelNestedResource(Schema):
    field_a = fields.String(required=True)
    field_b = fields.Integer(required=True)
    field_c = fields.Nested(TwoLevelsNestedResource, required=True)


class MainResource(Schema):
    main_field_a = fields.String(required=True)
    main_field_b = fields.Integer(required=True)
    nested_field = fields.Nested(OneLevelNestedResource, required=True, many=False)
Running the following function validates the data without any errors, which is what we want:
def load_correct_data():
    print(MainResource().load({
        'main_field_a': '',
        'main_field_b': 0,
        'nested_field': {
            'field_a': '',
            'field_b': 0,
            'field_c': {
                'some_irrelevant_field': ''
            }
        }
    }).errors)


load_correct_data()
Running the next function should print a validation error because the input data is invalid.
def load_wrong_data_a():
    print(MainResource().load('not what we want').errors)


load_wrong_data_a()
The root-level object should be a dictionary, not a string, so we should get a type validation error. Specifically, we should get the default error message defined for fields.Nested, which is "Invalid type.". However, either of the following two error messages may be returned on different calls of the same function:
{'_schema': ['Invalid input type.']}
{'_schema': ['Invalid type.']}
Furthermore, the following code has the same issue, one level deeper in the schema:
def load_wrong_data_b():
    print(MainResource().load(dict(
        main_field_a='',
        main_field_b=0,
        nested_field='again, not what we want',
    )).errors)


load_wrong_data_b()
The two possible outputs, which alternate between calls, are:
{'nested_field': {'_schema': ['Invalid input type.']}}
{'nested_field': {'_schema': ['Invalid type.']}}
Again, just like in the previous case, since nested_field expects a nested object (a dict), the error we’re expecting here is "Invalid type.". Unfortunately, "Invalid input type." is returned from time to time.
Possible causes
As far as I could tell from debugging the code in marshmallow.marshalling, for each schema definition we’re going through the list of defined fields (https://github.com/marshmallow-code/marshmallow/blob/2.x-line/src/marshmallow/marshalling.py#L248) and then attempt to fetch the value of that field (https://github.com/marshmallow-code/marshmallow/blob/2.x-line/src/marshmallow/marshalling.py#L252).
If data is a string instead of a dict, as in the situation I presented above, an AttributeError is raised and an error for the type validation is added (https://github.com/marshmallow-code/marshmallow/blob/2.x-line/src/marshmallow/marshalling.py#L255-L257).
The problem seems to be that .error_messages['type'] is read from field_obj, which is simply the first field the iterator picks from the fields defined in the schema. This means we’re not looking at the error messages defined for the field (or schema root) that should be a dict, as in the case of fields.Nested, but at the error messages of whichever child field the iterator happens to pick up first.
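The mechanism described above can be modelled without marshmallow. The following is only a simplified sketch of the loop as I understand it, not the library's actual code: FakeField and deserialize are hypothetical stand-ins, and the two dicts simulate the field iteration order differing between runs.

```python
class FakeField:
    """Hypothetical stand-in for a marshmallow field; it only carries error messages."""

    def __init__(self, type_message):
        self.error_messages = {"type": type_message}


def deserialize(data, fields):
    """Simplified model of the loop described above (not marshmallow's code)."""
    errors = {}
    for attr_name, field_obj in fields.items():
        try:
            data.get(attr_name)
        except AttributeError:
            # Input is not a dict: the message comes from whichever child
            # field the iterator reached first, not from the schema itself.
            errors.setdefault("_schema", []).append(field_obj.error_messages["type"])
            break
    return errors


# Same fields, two different iteration orders.
fields_in_one_order = {
    "main_field_a": FakeField("main_field_a must be string"),
    "nested_field": FakeField("nested_field must be dict"),
}
fields_in_other_order = dict(reversed(list(fields_in_one_order.items())))

print(deserialize("not what we want", fields_in_one_order))
# {'_schema': ['main_field_a must be string']}
print(deserialize("not what we want", fields_in_other_order))
# {'_schema': ['nested_field must be dict']}
```

In this model the reported message depends only on iteration order, which matches the alternating outputs shown earlier.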
This can be easily observed if we alter the initial schema definitions like this:
class TwoLevelsNestedResource(Schema):
    some_irrelevant_field = fields.String(required=True)


class OneLevelNestedResource(Schema):
    field_a = fields.String(required=True, error_messages={
        'type': 'field_a must be string'
    })
    field_b = fields.Integer(required=True, error_messages={
        'type': 'field_b must be int'
    })
    field_c = fields.Nested(TwoLevelsNestedResource, required=True, error_messages={
        'type': 'field_c must be dict'
    })


class MainResource(Schema):
    main_field_a = fields.String(required=True, error_messages={
        'type': 'main_field_a must be string'
    })
    main_field_b = fields.Integer(required=True, error_messages={
        'type': 'main_field_b must be int'
    })
    nested_field = fields.Nested(OneLevelNestedResource, required=True, many=False, error_messages={
        'type': 'nested_field must be dict'
    })
Then, when running load_wrong_data_a(), we can get any one of the following:
{'_schema': ['nested_field must be dict']}
{'_schema': ['main_field_b must be int']}
{'_schema': ['main_field_a must be string']}
As we can see, all of these are actually custom error messages for the fields of the root schema definition.
Running load_wrong_data_b() confirms this behaviour:
{'nested_field': {'_schema': ['field_a must be string']}}
{'nested_field': {'_schema': ['field_b must be int']}}
{'nested_field': {'_schema': ['field_c must be dict']}}
Workaround
Until this issue is fixed, configuring the schema to preserve the order of the fields helps with getting a consistent error message, but the underlying issue of the message being wrong is still there.
from marshmallow import Schema, fields


class TwoLevelsNestedResource(Schema):
    some_irrelevant_field = fields.String(required=True)


class OneLevelNestedResource(Schema):
    class Meta:
        ordered = True

    field_a = fields.String(required=True, error_messages={
        'type': 'field_a must be string'
    })
    field_b = fields.Integer(required=True, error_messages={
        'type': 'field_b must be int'
    })
    field_c = fields.Nested(TwoLevelsNestedResource, required=True, error_messages={
        'type': 'field_c must be dict'
    })


class MainResource(Schema):
    class Meta:
        ordered = True

    main_field_a = fields.String(required=True, error_messages={
        'type': 'main_field_a must be string'
    })
    main_field_b = fields.Integer(required=True, error_messages={
        'type': 'main_field_b must be int'
    })
    nested_field = fields.Nested(OneLevelNestedResource, required=True, many=False, error_messages={
        'type': 'nested_field must be dict'
    })
Because Meta.ordered = True makes marshmallow use ordered dicts (redundant on Python 3.7+, where plain dicts already preserve insertion order) and ordered sets for manipulating the fields, we then get only one possible error when running load_wrong_data_b():
{'nested_field': {'_schema': ['field_a must be string']}}
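For the root-level case only, a complementary guard is to reject non-dict input before it reaches the schema. This is a minimal sketch under stated assumptions: load_checked is a hypothetical helper, not part of marshmallow, and it assumes schema.load() returns an object with data and errors attributes, as in marshmallow 2.x. It does not help with a wrong type one level deeper, as in load_wrong_data_b().

```python
from collections.abc import Mapping


def load_checked(schema, data, message="Invalid type."):
    """Hypothetical helper: fail early with a fixed message for non-dict input.

    Assumes schema.load() returns an object exposing .data and .errors
    (the marshmallow 2.x result shape).
    """
    if not isinstance(data, Mapping):
        # Never reaches the field loop, so no child field's message is used.
        return None, {"_schema": [message]}
    result = schema.load(data)
    return result.data, result.errors
```

With the schemas above, load_checked(MainResource(), 'not what we want') would return the same errors dict on every call, because the field iteration is skipped entirely for invalid root input.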
Top GitHub Comments
Ok. I will try to see if I can adapt the existing fix you made from v3 to v2.x and then create a PR. Hopefully I will be able to get it done by the end of the week.
Thanks for the detailed bug report.
2.x is still maintained and should be fixed. A PR would be absolutely welcome.