[schema registry] Avsc details leak through prototypes given to deserialized objects
Forking a distinct issue from #10950.
Decide if it is OK that deserialized objects are given a prototype from avsc, causing a slight observable difference between the plain data going in and the data coming back.
- This may not be exactly what’s happening here. Put together a repro and figure out exactly what causes deepEqual to fail currently. See https://github.com/Azure/azure-sdk-for-js/pull/10890#discussion_r482449414
I debugged through this and put a repro of the underlying cause below.
Summary: avsc gives objects it deserializes a prototype that causes the returned result to have 7 additional inherited enumerable properties: clone, compare, isValid, toBuffer, toString, wrap, wrapped. This is what causes chai to throw on the deepEqual comparison, as it considers all enumerable properties, including functions, AFAICT. JSON.stringify(left) === JSON.stringify(right), which was the workaround pending investigation, holds because JSON.stringify only serializes own properties and skips functions in any case.
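To make the mechanism concrete without pulling in avsc, here is a minimal sketch in plain TypeScript. `recordProto` and `decoded` are hypothetical stand-ins for whatever avsc builds internally (an assumption for illustration, not its real implementation); the point is only that methods placed on a prototype as ordinary properties are enumerable, so `for...in` sees them while `Object.keys` and `JSON.stringify` do not.

```typescript
// Hypothetical prototype carrying helper methods as ordinary (enumerable)
// properties. This models the observed behavior; it is not avsc's code.
const recordProto = {
  isValid(): boolean {
    return true;
  },
  toString(): string {
    return "[record]";
  },
};

const plain = { firstName: "Nick", lastName: "Guerrera" };

// Same own data as `plain`, but with `recordProto` as its prototype.
const decoded = Object.assign(Object.create(recordProto), plain);

// for...in walks own keys first, then inherited enumerable keys.
const allKeys: string[] = [];
for (const key in decoded) {
  allKeys.push(key);
}

// Object.keys only reports own enumerable keys.
const ownKeys: string[] = Object.keys(decoded);

// JSON.stringify only serializes own properties, so both strings match.
const sameJson: boolean = JSON.stringify(plain) === JSON.stringify(decoded);

console.log(allKeys); // own keys followed by isValid and toString
console.log(ownKeys); // firstName and lastName only
console.log(sameJson); // true
```

Any comparison that enumerates keys with `for...in` will therefore see the two objects as different, while one that relies on own keys or on JSON output will not.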
So now the question remains: is this OK? I am slightly concerned that consumers of schema-registry-avro would take a dependency on these functions, which might prevent us from replacing the avro serializer with another implementation due to the risk of breaking such uses. But maybe that is too paranoid. @xirzec Thoughts?
So far it appears to be non-trivial to strip these properties out (or make them non-enumerable if we prefer) as the return value can be a graph of objects where each of the objects would have them. I’m looking into whether the resolver arg to fromBuffer might be able to allow me to do this cleanly.
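One possible way to handle a whole graph of decoded objects is a recursive copy into plain objects. The sketch below is an assumption-laden illustration: `toPlain` is a hypothetical helper, not part of avsc or schema-registry-avro, and it assumes decoded values form a tree (no cycles) made of primitives, arrays, binary views, dates, and records.

```typescript
// Hypothetical helper: recursively copy a decoded value into plain objects so
// that no library-provided prototype is reachable from the result.
// Assumes the input is acyclic.
function toPlain(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => toPlain(item));
  }
  if (value === null || typeof value !== "object") {
    return value; // primitives pass through unchanged
  }
  if (ArrayBuffer.isView(value) || value instanceof Date) {
    return value; // leave binary data (e.g. Buffer) and dates untouched
  }
  const source = value as Record<string, unknown>;
  const result: Record<string, unknown> = {};
  // Object.keys yields own enumerable keys only, so inherited methods are dropped.
  for (const key of Object.keys(source)) {
    result[key] = toPlain(source[key]);
  }
  return result;
}

// Usage with hypothetical decoded records that share a method-bearing prototype.
const helperMethods = {
  isValid(): boolean {
    return true;
  },
};
const friend = Object.assign(Object.create(helperMethods), { firstName: "Ada" });
const person = Object.assign(Object.create(helperMethods), {
  firstName: "Nick",
  friends: [friend],
});

const cleaned = toPlain(person) as {
  firstName: string;
  friends: { firstName: string }[];
};

const cleanedKeys: string[] = [];
for (const key in cleaned) {
  cleanedKeys.push(key);
}
console.log(cleanedKeys); // firstName and friends, with no inherited methods
```

The copy costs an extra allocation per record, so whether it is acceptable depends on how hot the deserialization path is.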
Repro of underlying cause
import * as avro from "avsc";
import { assert } from "chai";

const schema: avro.schema.RecordType = {
  type: "record",
  name: "User",
  namespace: "com.azure.schemaregistry.samples",
  fields: [
    {
      name: "firstName",
      type: "string",
    },
    {
      name: "lastName",
      type: "string",
    },
  ],
};

const type = avro.Type.forSchema(schema);
const value = { firstName: "Nick", lastName: "Guerrera" };

// Round-trip the plain object through avsc.
const serialized = type.toBuffer(value);
const deserialized = type.fromBuffer(serialized);

// for...in visits own and inherited enumerable keys.
for (const key in value) {
  console.log(key);
}
console.log("");
for (const key in deserialized) {
  console.log(key);
}

// Throws: deserialized has extra inherited enumerable properties.
assert.deepEqual(value, deserialized);
> node .\index.js
firstName
lastName
firstName
lastName
clone
compare
isValid
toBuffer
toString
wrap
wrapped
C:\Temp\rp\node_modules\chai\lib\chai\assertion.js:141
      throw new AssertionError(msg, {
      ^
AssertionError: expected { Object (firstName, lastName) } to deeply equal { Object (firstName, lastName) }
    at Object.<anonymous> (C:\Temp\rp\index.js:50:15)
    at Module._compile (internal/modules/cjs/loader.js:1075:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1096:10)
    at Module.load (internal/modules/cjs/loader.js:940:32)
    at Function.Module._load (internal/modules/cjs/loader.js:781:14)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:72:12)
    at internal/main/run_main_module.js:17:47 {
  showDiff: true,
  actual: { firstName: 'Nick', lastName: 'Guerrera' },
  expected: User { firstName: 'Nick', lastName: 'Guerrera' }
}
Issue Analytics
- Created 3 years ago
- Comments: 8 (6 by maintainers)
Top GitHub Comments
Happy to help. Thanks for the kind words 😃.
Hi there. If you update avsc to 5.5.1, you can use an option to omit these methods:
Note that decoded record values will still have a named constructor for performance reasons. If you’d like to hide this constructor as well, you can do so with a type hook. For example, to copy the values into plain objects:
Sample usage:
You can tweak the logical type’s implementation above to match the API you settle on: _fromValue generates the decoded values exposed to users, and _toValue determines the data you’d like to accept when encoding. The logical type documentation has more information, including a couple of examples.