Support Json-like polymorphism in Cbor

This is more a question than a feature request…

I’m working in an environment where JSON messages are being published from an Android App, written in Kotlin, via MQTT, to a Python-based backend where these messages are being decoded and processed. Serialization in the Android App is done with kotlinx.serialization, of course… 😉

The messages are being serialized from a wrapper class which is implemented as follows:

@Serializable
class ExternalMessage(val timestamp: Long = System.currentTimeMillis(), val msg: ExternalMessageBase)

The ‘real’ message content is stored in the msg property of this wrapper and is derived from the ExternalMessageBase class:

@Serializable
sealed class ExternalMessageBase(@Transient var messageDirection: Direction = Direction.OUT)

Now I do have one specific message type, which contains image data, that I don’t want to encode in JSON but in CBOR to keep the message size minimal and to get rid of encoding the image data to a base64 string in the App. The class for this message is implemented as follows:

@Serializable
class ImageDataMessage @OptIn(ExperimentalSerializationApi::class) constructor(
    @SerialName("image_data") @Contextual @ByteString val imageData: Mat,
    @SerialName("camera_position") val cameraPosition: CameraPositionData,
    @SerialName("pixel_per_meter") val pixelPerMeter: Float = 0.0f
) : ExternalMessageBase()

The encoding of this message in JSON results in a slightly different structure than encoding in CBOR (this output has been generated using json.loads/cbor2.loads on the Python side):

(JSON)

{u'msg': {u'type': u'ImageDataMessage', u'image_data': u'', u'pixel_per_meter': 2221.0, u'camera_position': {u'position': {u'y': 0.0, u'x': 0.0, u'z': 0.0}, u'orientation': {u'y': 0.0, u'x': 0.0, u'z': 0.0, u'w': 0.0}}}, u'timestamp': 1665569404785}

(CBOR)

{u'msg': [u'ImageDataMessage', {u'image_data': '', u'pixel_per_meter': 2221.0, u'camera_position': {u'position': {u'y': 0.0, u'x': 0.0, u'z': 0.0}, u'orientation': {u'y': 0.0, u'x': 0.0, u'z': 0.0, u'w': 0.0}}}], u'timestamp': 1665567516213}

As one can see, in the JSON output, the ImageDataMessage is encoded into one map which also contains the type attribute whereas in CBOR the ImageDataMessage is encoded into a list which contains the type attribute and a map which contains the remainder of the ImageDataMessage object.
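
To make the difference concrete, here is a minimal Python sketch of how the list-based CBOR shape maps onto the JSON shape after decoding. The helper name flatten_polymorphic is hypothetical, and the rule it applies (a two-element list made of a string followed by a map is treated as a polymorphic pair) is an assumption that would misfire on genuine lists of that form. It works on already-decoded data, so it does not depend on cbor2 itself.

```python
def flatten_polymorphic(value, type_key="type"):
    """Recursively rewrite [type_name, {fields}] pairs as {type_key: type_name, **fields}.

    Assumption: any two-element list of (str, dict) is a polymorphic pair.
    """
    if isinstance(value, dict):
        return {k: flatten_polymorphic(v, type_key) for k, v in value.items()}
    if isinstance(value, list):
        if len(value) == 2 and isinstance(value[0], str) and isinstance(value[1], dict):
            # Put the type discriminator into the same map as the fields.
            merged = {type_key: value[0]}
            merged.update(flatten_polymorphic(value[1], type_key))
            return merged
        return [flatten_polymorphic(v, type_key) for v in value]
    return value


# Shape taken from the CBOR output above (camera_position omitted for brevity).
cbor_decoded = {
    "msg": ["ImageDataMessage", {"image_data": b"", "pixel_per_meter": 2221.0}],
    "timestamp": 1665567516213,
}
json_like = flatten_polymorphic(cbor_decoded)
```

Applied to the result of cbor2.loads, this would give the backend the same map layout it already handles for JSON, at the cost of one extra pass over the decoded message.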

What I would like to achieve is that serialization to CBOR produces the same structure as serialization to JSON, because that would avoid a large number of changes to the processing logic in the Python-based backend. Ideally, I would just replace

import json

with

import cbor2

in my Python code and the processing logic works the same regardless of the format the message was encoded to.

Is this achievable somehow using kotlinx.serialization, e.g. by changing the CBOR configuration or the implementation of the message classes?

Thanks in advance!

Issue Analytics

  • State:open
  • Created a year ago
  • Comments:5 (3 by maintainers)

Top GitHub Comments

sandwwraith commented, Oct 17, 2022

@jsiebert You can take a look at StreamingJsonDecoder.decodeSerializableValue https://github.com/Kotlin/kotlinx.serialization/blob/0a1b6d856da3bc9e6a19f77ad66c1b241533a20b/formats/json/commonMain/src/kotlinx/serialization/json/internal/StreamingJsonDecoder.kt#L53

There are several takeaways:

  1. To intercept the default array-based polymorphism, one needs to check if (deserializer is AbstractPolymorphicSerializer<*>) and then add custom behavior.
  2. Cbor objects, like Json objects, are maps. That means the type key for polymorphism may be at an arbitrary position inside the map.
  3. The current Json implementation optimistically checks whether the first key in the object is the type key and, if so, uses it to load the serializer. If it is not there, it falls back to the default behavior. This optimization is probably not necessary for CBOR.
  4. To search for the type key at an arbitrary position in an arbitrary object, one probably needs an intermediate data structure. For Json, this is JsonElement: first the object string is parsed to a JsonElement, then the type key is extracted, then the rest of the JsonElement is parsed to an actual Kotlin object using a separate JsonTreeDecoder (the decoder is separate because its input is a JsonElement, not a String).
  5. That’s why this feature requires much work in the first place: there’s no analog of JsonElement and JsonTreeDecoder in Cbor (yet).
  6. Alternatively, it is probably possible to just store the nested CBOR object’s bytes in some intermediate place and read them twice: first to find the type key, then to deserialize to a Kotlin object with the actual deserializer. It will probably be much simpler, but this is for you to find out.
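
The idea in points 2 to 4 can be illustrated outside Kotlin. The following Python sketch is a language-neutral illustration, not kotlinx.serialization code: the message is first held as a generic map (standing in for the intermediate structure), the type key is looked up wherever it sits, and only then is the concrete object built. The registry, the function decode_polymorphic, and the trimmed-down ImageDataMessage class are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class ImageDataMessage:
    # Trimmed-down stand-in for the Kotlin class from the question.
    image_data: bytes
    pixel_per_meter: float


# Hypothetical registry mapping type names to concrete classes.
REGISTRY = {"ImageDataMessage": ImageDataMessage}


def decode_polymorphic(tree, type_key="type"):
    """Build a concrete object from a generic map whose type key may sit anywhere."""
    if type_key not in tree:
        raise ValueError("no %r key found in map" % type_key)
    cls = REGISTRY.get(tree[type_key])
    if cls is None:
        raise ValueError("unknown type %r" % tree[type_key])
    # Second pass: everything except the discriminator becomes a field.
    fields = {k: v for k, v in tree.items() if k != type_key}
    return cls(**fields)


# The type key is deliberately not the first entry of the map.
tree = {"image_data": b"", "type": "ImageDataMessage", "pixel_per_meter": 2221.0}
msg = decode_polymorphic(tree)
```

Because the whole map is available before the concrete type is chosen, the position of the type key does not matter. This is the property that the array-based encoding avoids needing by always writing the type name first.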

Hope this helps. Good luck!

jsiebert commented, Oct 15, 2022

Hello @qwwdfsad,

thanks for the hint! I took a look at the code and tried to implement something, but indeed this seems to be a bigger task. I’d like to keep trying to implement this, but any pointers on where to start or what to look for would be greatly appreciated!
