Reduce confusion around fields.Pluck behavior
Follow-up to discussions in https://github.com/marshmallow-code/marshmallow/pull/1298#issuecomment-513354502 and https://github.com/marshmallow-code/marshmallow/issues/1311.
@deckar01 wrote:
My confusion with the behavior of Pluck concerns me a little. Currently Pluck = flat -> nested -> flat, but it is easily confused with this use case nested -> flat -> nested. Pluck describes the operation during serialization which is similar to Function and Method, but it is really the opposite operation during deserialization. Would it make sense to provide a field that performs the opposite operation? Nest?
@sloria wrote:
I agree that Pluck is easily misunderstood. I improved the documentation a bit in #1314. Perhaps we could improve the naming to make it more clear, as well as provide a field for the reverse operation (Nest? Deep?)
One possible solution might be to rename (or alias, to not break compat) Pluck to Flat and provide a Deep field which does the opposite (what I called Reach in https://github.com/marshmallow-code/marshmallow/issues/1311#issuecomment-513478342).
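To make the two directions concrete, here is a minimal plain-Python sketch of the data shapes involved. The helper names (pluck_load, pluck_dump, deep_load, deep_dump) are hypothetical and only mimic the shapes. They are not marshmallow's implementation of Pluck, and the Deep/Nest field is only a proposal in this thread.

```python
# Illustration only: hypothetical helpers that mimic the data shapes.
# This is not how marshmallow implements Pluck, and no Deep/Nest field
# is assumed to exist in the library.

def pluck_load(value, field_name):
    """Deserialize, Pluck-style: flat external value -> nested internal object."""
    return {field_name: value}


def pluck_dump(obj, field_name):
    """Serialize, Pluck-style: nested internal object -> flat external value."""
    return obj[field_name]


def deep_load(data, key):
    """Deserialize, reverse direction: nested external data -> flat internal value."""
    return data[key]


def deep_dump(value, key):
    """Serialize, reverse direction: flat internal value -> nested external data."""
    return {key: value}


# Pluck-style round trip: flat -> nested -> flat
flat_external = 1
nested_internal = pluck_load(flat_external, "id")
flat_again = pluck_dump(nested_internal, "id")

# Reverse round trip: nested -> flat -> nested
nested_external = {"id": 1}
flat_internal = deep_load(nested_external, "id")
nested_again = deep_dump(flat_internal, "id")
```

The external (serialized) form is flat in the first round trip and nested in the second, which is the distinction the proposed Flat and Deep names are trying to capture.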
Feedback welcome!
Issue Analytics
- Created 4 years ago
- Reactions: 2
- Comments: 6 (3 by maintainers)
Top GitHub Comments
I’ve been revisiting the use of Reach, which happened to be required in the second use case of my project using Marshmallow, and have spent a bit of time trying to sketch a “back of the napkin” solution for this.
Before I begin, another issue I ran into with using Reach with identical data keys was that when I got validation errors they all read {'data_key': ['Not a valid integer.']}. Since almost all my fields had the same data_key, and the validation error also did not print the value that failed to validate, the validation errors may as well have been printing nothing.
Alright, back to the “dotted path issue”.
Personally I feel that the use case of reading some value out of a nested location on source data should be supported at the base Field level. I don’t think the user should have to write Reach fields, or Pluck fields, or any variation of field that requires one of its arguments to define the actual data type of the field.
With your point in mind that some JSON keys themselves may have '.'s in them, I have tried to lay out a test that has some worst-case fields and then figure out how I’d like to be able to use fields.Integer() (for example) to read the varying pieces of data. Here it is:
I liked the idea of the keyword argument path from the Reach recipe and thought it could be possible to simply incorporate it into a base field. You’d then be able to choose to use path (for deep JSON lookup) or data_key for the simple use case of pulling it from the top level.
I also realised with my above example that path could not simply be a dotted string, since we’d have no way to know what to return with path=nested.bar.baz in the above example.
I propose something along the lines of passing the following kwargs to fields.Integer(). Even in this fairly extreme situation, there is no uncertainty about where the data should be read from.
I’d also be happy with data_key being sent as either a single string object or a list of strings; however, this adds some annoying type checking all over the place when initialising fields.
I’d be keen to belt out an MR for this (or something similar that lets me use base Field types), which at the same time solves the duplicate data_key issue and the validation data_key issue, and means I don’t have to use the ReachField() workaround. Thoughts?
I would prefer having both path and data_key, for the reason that the data_key value is sometimes derived from the field name, i.e. when it’s converted from snake_case to camelCase.
If path were an independent parameter, it would be easy to keep the case conversion automatic while still being able to set up nesting.
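As a rough sketch of how an independent path could sit alongside an automatically derived data_key, the plain-Python helper below walks a list of keys and then reads the final key. Everything here is an assumption for illustration: read_value and to_camel_case are hypothetical names, marshmallow's Field does not accept a path argument, and the payload is made up rather than taken from the thread.

```python
# Hypothetical sketch: `read_value`, `to_camel_case` and the `path` keyword
# are illustrative names, not part of marshmallow's API.

def to_camel_case(name):
    """Convert a snake_case field name to camelCase."""
    first, *rest = name.split("_")
    return first + "".join(part.title() for part in rest)


def read_value(data, field_name, path=(), data_key=None):
    """Follow `path` one key at a time, then read the final key.

    The final key is `data_key` if given, otherwise it is derived from
    `field_name` by case conversion.
    """
    key = data_key if data_key is not None else to_camel_case(field_name)
    current = data
    for step in path:
        current = current[step]
    return current[key]


# Made-up payload in which one key itself contains a dot, so a dotted
# string such as "nested.bar" would be ambiguous.
payload = {
    "nested.bar": {"bazQux": 1},
    "nested": {"bar": {"bazQux": 2}},
}

from_dotted_key = read_value(payload, "baz_qux", path=["nested.bar"])
from_nested_keys = read_value(payload, "baz_qux", path=["nested", "bar"])
from_top_level = read_value({"bazQux": 3}, "baz_qux")
```

Because path is a list of keys rather than a dotted string, "nested.bar" and ["nested", "bar"] stay distinct, and the final key can still come from the field name without being spelled out.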