Serialization layer as serious bottleneck
We have been investigating the performance of our Corda node. Among a great many things we managed to optimize, and achieved some “ok-ish” numbers. A closer look now revealed that about 80% of the time is spent in the serialization layer. This is rather unexpected, as I would have expected the database, hashing and asymmetric crypto to be the main bottlenecks. The situation is aggravated by the fact that every transaction has a transaction id, which is computed as the hash of all its elements (input states, notaries, output states, time windows, etc.), each of which triggers a serialization again.
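For illustration, a minimal Kotlin sketch of why this hurts (hypothetical; Corda actually builds a Merkle tree over component groups, and serialize(...) is the helper shown below):

    import net.corda.core.crypto.SecureHash
    import net.corda.core.crypto.sha256

    // Illustrative only: computing the transaction id requires one
    // serialization pass per component before anything can be hashed.
    fun transactionId(components: List<Any>): SecureHash =
        components
            .map { serialize(it).bytes.sha256() }                    // one AMQP serialization each
            .reduce { a, b -> SecureHash.sha256(a.bytes + b.bytes) } // simplified hash fold, not the real Merkle tree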
For testing purposes, to get a closer look, we made use of:
    import net.corda.core.serialization.SerializationContext;
    import net.corda.core.serialization.SerializationFactory;
    import net.corda.core.serialization.SerializedBytes;

    private static SerializedBytes<Object> serialize(Object object) {
        SerializationFactory defaultFactory = SerializationFactory.Companion.getDefaultFactory();
        SerializationContext defaultContext = defaultFactory.getDefaultContext();
        return defaultFactory.serialize(object, defaultContext);
    }
and serialized a single state with about two dozen fields. The resulting byte array was 3869 bytes long. One CPU core managed to serialize 2800 of those objects per second. If one assumes that a great many objects are part of a transaction, the picture becomes clearer as to why it takes this amount of time.
To give a reference point, we serialized the same object with Jackson's ObjectMapper, first constructing a writer for the desired state type and then measuring the performance of serializing that state object. Jackson managed to serialize 99500 objects per second, a factor of about 40 compared to AMQP. The JSON output was 1065 bytes long. I consider JSON rather inefficient, yet it managed to be about 75% smaller than AMQP while still being “standalone”, not requiring an external model to deserialize. Protocol Buffers and friends would be another order of magnitude, but at the cost of an external model.
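For reference, a rough sketch of how such a comparison can be set up (the state object and loop count are placeholders; serialize(...) is the helper above):

    import com.fasterxml.jackson.databind.ObjectMapper

    // Rough micro-benchmark: Corda AMQP vs. a reusable, type-specific Jackson writer.
    fun compare(state: Any, n: Int = 10_000) {
        var t = System.nanoTime()
        repeat(n) { serialize(state) }                          // Corda AMQP
        println("AMQP:    ${n * 1e9 / (System.nanoTime() - t)} obj/s")

        val writer = ObjectMapper().writerFor(state.javaClass)  // built once, reused
        t = System.nanoTime()
        repeat(n) { writer.writeValueAsBytes(state) }           // Jackson JSON
        println("Jackson: ${n * 1e9 / (System.nanoTime() - t)} obj/s")
    }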
Looking at it with a profiler, one sees that heavy work is needed to serialize a great number of DescribedTypeElement instances. A closer look at the implementation shows, for example:
    val data = Data.Factory.create()
    data.withDescribed(Envelope.DESCRIPTOR_OBJECT) {
        withList {
            writeObject(obj, this, context)
            val schema = Schema(schemaHistory.toList())
            writeSchema(schema, this)
            writeTransformSchema(TransformsSchema.build(schema, serializerFactory), this)
        }
    }
....
A first measure might be to cache the serialization of the schema part, so that the byte array for a given schema history can be obtained directly from the cache, which may provide a decent speed-up. From a database perspective it may also prove worthwhile to store the data and the model separately, avoiding the redundant storage of the model part.
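A minimal sketch of that caching idea, assuming a cache keyed by the Schema and reusing the Corda-internal writeSchema(...) from the snippet above (names are illustrative):

    import java.util.concurrent.ConcurrentHashMap
    import org.apache.qpid.proton.codec.Data

    // Hypothetical cache: encode each distinct schema once and reuse the raw bytes.
    private val schemaCache = ConcurrentHashMap<Schema, ByteArray>()

    fun cachedSchemaBytes(schema: Schema): ByteArray =
        schemaCache.computeIfAbsent(schema) {
            val data = Data.Factory.create()
            writeSchema(it, data)   // Corda-internal helper from the snippet above
            data.encode().array     // raw bytes, appendable without re-walking the schema
        }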
For Corda to move towards more high-volume applications, this ticket feels rather important. Alternatively, it would also be nice to see plain JSON support (or something similar): JSON has widespread support across all devices, is easy to read and write, has standards for computing signatures, and has very performant implementations.
first draft in the commit above, 8x the performance
I took a deeper look at it in order to fix it. We expect a few tens of millions of records, so performance is critical.
There is a “quick” solution to make things faster by caching the schema. Instead of serializing the schema, I only put a small placeholder into the data structure:
SerializationOutput:
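The original snippet is elided here, so the following is only a minimal sketch of the idea, reusing the internals quoted earlier; PLACEHOLDER and the function name are illustrative, not the actual patch:

    // Write a short, recognisable marker where the schema would normally go;
    // it is swapped for the real schema bytes in a later post-pass.
    private val PLACEHOLDER = ByteArray(8) { 0x7F.toByte() }

    fun serializeWithPlaceholder(obj: Any, context: SerializationContext): Data {
        val data = Data.Factory.create()
        data.withDescribed(Envelope.DESCRIPTOR_OBJECT) {
            withList {
                writeObject(obj, this, context)
                putBinary(PLACEHOLDER)  // instead of writeSchema(...) / writeTransformSchema(...)
            }
        }
        return data
    }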
And in a second step I can patch the AMQP structure with the real schema. There are four main elements:
It is rather straightforward to find the right place and do the replacement. In our first use case the schema makes up 90% of the serialized bytes, so this saves roughly a factor of 10 in serialization work (at the cost of some new, simple array manipulations). As a minor catch, one further has to patch the AMQP ListElement, which holds the total size of all its data and therefore changes due to the placeholder. A further minor complication is that AMQP stores the length of the schema in the serialized output using a variable-length encoding, depending on whether it is larger or smaller than 255 bytes. Compared to Jackson it will still be a bit slow even if I gain that factor of 10, but that is not so surprising: this size encoding is part of what makes AMQP expensive, as the serializer has to traverse the complete object graph to compute the sizes of sub-elements, making that part almost as expensive as the “real” serialization.
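A simplified sketch of that post-pass, splicing the cached schema bytes over the placeholder (the list-header fix-up for the variable-length size is only noted, not implemented):

    // Replace the placeholder with the real schema bytes via plain array copies.
    fun patchSchema(serialized: ByteArray, placeholder: ByteArray, schemaBytes: ByteArray): ByteArray {
        val idx = indexOf(serialized, placeholder)
        require(idx >= 0) { "placeholder not found" }
        val out = ByteArray(serialized.size - placeholder.size + schemaBytes.size)
        System.arraycopy(serialized, 0, out, 0, idx)
        System.arraycopy(schemaBytes, 0, out, idx, schemaBytes.size)
        System.arraycopy(serialized, idx + placeholder.size,
                         out, idx + schemaBytes.size,
                         serialized.size - idx - placeholder.size)
        // NOTE: the enclosing list header (1-byte size for list8 vs. 4-byte size
        // for list32) must still be adjusted by the size delta; omitted here.
        return out
    }

    private fun indexOf(haystack: ByteArray, needle: ByteArray): Int {
        outer@ for (i in 0..haystack.size - needle.size) {
            for (j in needle.indices) if (haystack[i + j] != needle[j]) continue@outer
            return i
        }
        return -1
    }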
If there is interest in a PR, I could do that; I'm close to finishing it up for our use case. I'm cautiously optimistic about achieving around 1000 tps on an older eight-core machine (with further optimizations), rivaling the official 32/64-core numbers.
But the general question is where to move in this area. The serialization mechanism is inefficient and takes up a huge amount of space in the database, so a split of model and data would be desirable, at least for storage. Maybe the manipulation above could be a starting point for that as well. Or, of course, the possibility to support something else like JSON (maybe even https://www.w3.org/TR/vc-data-model/ to allow interaction with other systems). Since all this impacts both long-term storage and the interaction with non-Corda systems/clients, IMHO simplicity could be an important characteristic: understanding and replicating all the AMQP machinery is rather challenging, and support beyond Java is very limited. If there is interest in a JSON prototype PR, I may find time for that as well. I'm not quite sure if this will ever be an option or more of a “hell will first freeze over” scenario 😃, since there has been quite some investment in AMQP for serialization. For sure it would have to complement the existing serialization rather than replace it.