Non-determinism in JSON export

Description of issue or feature request:

I’ve been working to create a deterministic test fixture generator for PHP-TUF. I’ve rooted out the apparent sources of most meaningful non-determinism by fixing the clock and using a fixed well of keypairs. However, some of the JSON export appears to have different behavior on different systems.

Shown below is the diff I see when comparing generated data on GitHub Actions (on Python 3.9 with ubuntu-latest) versus on my laptop (also Python 3.9 but with Fedora 33). We’ve pinned all known dependencies using pipenv, so I don’t think it’s that.

This causes a cascading set of differences because other files use hashes of snapshot.json.

Could TUF canonicalize even the JSON data that isn’t directly signed?

Current behavior: (image: Screenshot from 2020-11-12 12-30-35)

Expected behavior:

Deterministic (ideally canonical) output of JSON that contains the same functional data.
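
As a rough illustration of what "deterministic output" could mean here, the sketch below serializes a dictionary with sorted keys and fixed separators using only Python's standard `json` module. The helper name `dump_deterministic` and the sample metadata are hypothetical, this is not python-TUF's serializer, and it does not implement a full canonical JSON scheme.

```python
import json


def dump_deterministic(data: dict) -> bytes:
    """Serialize data to JSON bytes that do not depend on key insertion order.

    Hypothetical helper for illustration only: sorted keys, no optional
    whitespace, UTF-8 output.
    """
    text = json.dumps(
        data,
        sort_keys=True,          # fixed key order at every nesting level
        separators=(",", ":"),   # no spaces after commas or colons
        ensure_ascii=False,      # emit non-ASCII characters as UTF-8
    )
    return text.encode("utf-8")


# The same functional data, built in two different key orders.
a = {"version": 1, "meta": {"targets.json": {"version": 1}}, "_type": "snapshot"}
b = {"_type": "snapshot", "meta": {"targets.json": {"version": 1}}, "version": 1}

# Both serialize to identical bytes.
assert dump_deterministic(a) == dump_deterministic(b)
```

Fixing the key order and separators removes two common sources of byte-level variation; whether they explain the diff reported in this issue is not established here.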

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

1 reaction
lukpueh commented, Nov 13, 2020

Could TUF canonicalize even the JSON data that isn’t directly signed?

Note that python-TUF currently does not canonicalize any JSON metadata on the wire, not even the payload (a.k.a. the “signed” part). There is, however, a proposal to change this at least for the “signed” part, so that no JSON parsing of untrusted metadata is required (see https://github.com/secure-systems-lab/signing-spec/pull/2).

Canonicalization of the entire metadata is not required by the spec, because file hashes of targets.json, $delegated-targets.json (in snapshot.json) and snapshot.json (in timestamp.json) are generated and then re-generated for client verification over the same file blob (without need for JSON parsing/canonicalization).
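
A minimal sketch of the point above, using only the standard library: two serializations of the same data hash differently, but a hash taken over the exact file bytes can be re-computed by a client without parsing any JSON. The sample `snapshot` dictionary and the helper `sha256_hex` are made up for illustration and are not python-TUF code.

```python
import hashlib
import json


def sha256_hex(blob: bytes) -> str:
    """Hash raw bytes exactly as stored or received, with no JSON parsing."""
    return hashlib.sha256(blob).hexdigest()


# Hypothetical metadata content, serialized in two different layouts.
snapshot = {"_type": "snapshot", "version": 1}
blob_a = json.dumps(snapshot, indent=1).encode("utf-8")
blob_b = json.dumps(snapshot, separators=(",", ":")).encode("utf-8")

# The two blobs carry the same functional data ...
assert json.loads(blob_a) == json.loads(blob_b)

# ... but their hashes differ, so any file that records the hash differs too.
assert sha256_hex(blob_a) != sha256_hex(blob_b)

# A client that hashes the bytes it downloaded reproduces the recorded hash.
recorded = sha256_hex(blob_a)
downloaded = bytes(blob_a)
assert sha256_hex(downloaded) == recorded
```

This is also why a formatting difference in snapshot.json cascades into the files that record its hash, as described in the issue.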

Regardless, @erickt has made a similar request in https://github.com/theupdateframework/tuf/issues/1154 (in a similar context, i.e. interoperability testing).

I’m fine with implementing his suggestion.

0 reactions
jku commented, Feb 11, 2022

State as I understand it:

  • We try to be deterministic in Metadata API: e.g. JSON dictionary content is sorted on output
  • We do not canonicalize the output JSON (signed content is canonicalized but that is not used as the output format)

I’m closing this because it concerns the legacy code, which we no longer maintain. If you see a similar issue with the Metadata API, please open a new issue.
