question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

TS Bebop views are shared which leads to difficult to diagnose bugs

See original GitHub issue

Describe the bug Okay, this might technically be a feature, but it was undocumented and so I spent a good bit of time assuming my code was wrong only to find out it had to do with serialization logic assumptions not holding up.

To Reproduce This test passes

it("correctly encodes and decodes datagrams", () => {
    const data = new Uint8Array([7, 0, 0, 0, 75, 86, 83, 116, 111, 114, 101]);
    const dgram1: IRpcDatagram = {
        discriminator: RpcResponseOk.discriminator,
        value: {
            header: {id: 1},
            data,
        }
    };
    // this is the current state at runtime
    // {"discriminator":2,"value":{"header":{"id":1},"data":{"0":7,"1":0,"2":0,"3":0,"4":75,"5":86,"6":83,"7":116,"8":111,"9":114,"10":101}}}
    // 17,0,0,0,2,1,0,11,0,0,0,7,0,0,0,75,86,83,116,111,114,101
    const raw = RpcDatagram.encode(dgram1);
    const dgram2 = RpcDatagram.decode(raw);
    expect(dgram2.discriminator).toBe(RpcResponseOk.discriminator);
    if (dgram2.discriminator == RpcResponseOk.discriminator)
        expect(dgram2.value.data).toEqual(data)

    expect(dgram2).toEqual(dgram1);
    expect(dgram2).not.toBe(dgram1);
})

And this test fails

it("correctly encodes and decodes datagrams and data", () => {
    const name1: I_HelloServiceNameReturn = {value: "KVStore"};
    // this line would fix it
    // const data = new Uint8Array(_HelloServiceNameReturn.encode(name1));
    const data = _HelloServiceNameReturn.encode(name1);
    expect(data).toEqual(new Uint8Array([7, 0, 0, 0, 75, 86, 83, 116, 111, 114, 101]));

    const dgram1: IRpcDatagram = {
        discriminator: RpcResponseOk.discriminator,
        value: {
            header: { id: 1 },
            data
        }
    };
    // this is the current state at runtime
    // {"discriminator":2,"value":{"header":{"id":1},"data":{"0":7,"1":0,"2":0,"3":0,"4":75,"5":86,"6":83,"7":116,"8":111,"9":114,"10":101}}}
    // 17,0,0,0,2,1,0,11,0,0,0,7,0,0,0,2,1,0,11,0,0,0
    // As you can see the buffer ^ is not the same as when we used our own buffer even though we confirmed the data was the same
    const raw = RpcDatagram.encode(dgram1);
    const dgram2 = RpcDatagram.decode(raw);
    expect(dgram2.discriminator).toBe(RpcResponseOk.discriminator);
    if (dgram2.discriminator == RpcResponseOk.discriminator) {
        expect(dgram2.value.data).toEqual(data); // <-- Fails on this line
        const name3 = _HelloServiceNameReturn.decode(dgram2.value.data);
        expect(name3).toEqual(name1);
    }

    expect(dgram2).toEqual(dgram1);
    expect(dgram2).not.toBe(dgram1);
})

Basically just use the serialized data from one type as a byte[] value for another and serialize and then weird things happen.

There is another case where this can and will pop up, which is if you serialize data, and then for ANY reason await before using the buffer. This could happen in networking or subtle code logic, like if you were to have

async myFn() {
  const buf = MyType.encode(obj)
  await doThing(buf)
  await doOtherThing(buf)
}

because it is very possible, depending on what triggers myFn, to have buf change before doOtherThing is called.

This is also an issue where you call say a function for a database to do a thing with buf and, without you even knowing, it chooses to await something before it uses your provided buffer. At that point it would be almost impossible to debug without knowing about this behavior why the data in your database is not the data that was submitted to the function.

You will likely only run into this secondary case if you are writing network code which triggers sterilization as multiple concurrent operations may occur naturally. What is worse about this, is that it is effectively a race condition that is very subtle and almost impossible to reproduce and debug.

Expected behavior When I get the serialized data, I expect it to continue living until I drop the last reference to it unless otherwise documented.

IMO, in a language like C++, it makes some sense to do this and expect the user to know and proceed with caution, but you would also probably use a mutex on the buffer. In TS there is NO mutex which is fine as long as you don’t run into that secondary case of await-before-use because JS is single-threaded. At minimum this needs to be documented, but JS is not a language focused on performance so I think it would be better to make a new BebopView for every time encode is called by default as it is the least likely to cause weird results. We should allow the user to bring their own view and that would allow more efficient serialization where they know what they are doing.

Bebop info:

  • Version: 2.4.2-ish (rpc-ts branch)
  • Runtime: TS

Desktop (please complete the following information):

  • OS: Windows
  • Version: 10

Additional context Working on RPC and for this we have a datagram which contains a raw data field which is another bebop serialized type. This is how this behavior was discovered by I, @Eadword, a person who was not particularly familiar with the serialization implementation and made what I thought were reasonable assumptions about behavior.

Issue Analytics

  • State:open
  • Created a year ago
  • Comments:5 (5 by maintainers)

github_iconTop GitHub Comments

1reaction
lynncommented, Apr 7, 2023

I would lean towards making the change out of the “principle of least surprise”, and then exposing the old fast-but-dangerous behavior as encodeUnsafe(). 👍

0reactions
andrewmd5commented, Apr 7, 2023

I implemented a fix for this in #225 but I’m holding off on merging. Copying memory, even on small data structures, reduces performance substantially:

Bebop (subarray)   encode Song x 7,225,130 ops/sec ±0.08% (101 runs sampled)
Bebop (subarray)   decode Song x 2,861,336 ops/sec ±0.11% (100 runs sampled)
Bebop (slice)   encode Song x 2,259,026 ops/sec ±2.66% (80 runs sampled)
Bebop (slice)  decode Song x 2,823,515 ops/sec ±0.28% (94 runs sampled)

The encode method creates the illusion and promise that the buffer returned to you is unique and safe to access at anytime and that the data will remain the same; this is of course, not true, and it’s possible for Bebop to partially decode “corrupted data” (essentially the data of another type encoded prior to accessing the buffer) which could lead to all sorts of issues. If the decode function simply blew up, I’d be less worried, but given the circumstances this requires deeper discussion.

Do we prioritize safety over speed or more clearly document this runtime behavior to maintain speed over ergonomics and safety?

@lynn tagging you for comment since you wrote the original implementation and I’d appreciate your perspective

Read more comments on GitHub >

github_iconTop Results From Across the Web

Issues · betwixt-labs/bebop
betwixt-labs / bebop Public ... TS Bebop views are shared which leads to difficult to diagnose bugs bug Something isn't working typescript Issues...
Read more >
A General Approach to Inferring Errors in Systems Code
Bugs as Deviant Behavior: A General Approach. to Inferring Errors in Systems Code. Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and...
Read more >
Runtimes for zing and bebop on models from Figure 4
Model checking is a technique for finding bugs in systems by systematically exploring their state spaces. We wish to extract sound models from...
Read more >
github/git Checkout Returns 'error: invalid path' on Windows
After I opened an issue on the git-for-windows bug tracker ... -R -f -i p.patch patching file Writing-Bops:-The-Bebop-Schema-Language.md.
Read more >
Dji drone app. Details can be found in the user manual. ...
You can also create a profile and share interesting aerial photos with other users. The full potential of your DJI drone and filming....
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found