Non-deterministic ordering of printed component attributes

I am writing ARMI cases programmatically, and then as each case is created, using the writeInputs method to print the blueprints file. I happened to notice that each time the blueprints file is written, the ordering of written component attributes (i.e. material, temperature, dimensions) changes, even when nothing has changed between the cases. Even if I just write the same case twice by executing my scripts twice in a row, the ordering may change.

It would be nice for the ordering to stay the same so that the written input files can be easily diff-ed against each other.

A quick look into ARMI makes me think that this is actually an issue with yamlize, but confirmation of that would be helpful. This seems odd, considering Python now guarantees that dictionaries preserve insertion order. But if it is something within ARMI, my feeling is that it would be worth fixing, if possible.
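
To illustrate how ordering can change between runs even though dictionaries are insertion-ordered, here is a minimal sketch. It assumes the varying order comes from a hash-based container of strings (such as a set) somewhere in the write path, which the issue does not confirm. The attribute names beyond material, temperature, and dimensions are hypothetical, and `run_with_seed` is a helper invented for this example, not part of ARMI or yamlize.

```python
import os
import subprocess
import sys

# Child program: iterate over the same names stored as a dict and as a set.
# Names other than material/temperature/dimensions are made up for illustration.
CHILD = r"""
names = ["material", "temperature", "dimensions", "name", "shape", "mult"]
print(",".join(dict.fromkeys(names)))      # dict: insertion order
print(",".join(set(names)))                # set: depends on string hashes
print(",".join(sorted(set(names))))        # sorted: independent of hashes
"""


def run_with_seed(seed):
    """Run CHILD in a fresh interpreter with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    dict_order, set_order, sorted_order = result.stdout.splitlines()
    return dict_order, set_order, sorted_order


if __name__ == "__main__":
    for seed in (1, 2, 3):
        dict_order, set_order, sorted_order = run_with_seed(seed)
        print(f"seed={seed}")
        print(f"  dict:   {dict_order}")
        print(f"  set:    {set_order}")
        print(f"  sorted: {sorted_order}")
```

The dict and sorted lines are the same for every seed, while the set line may differ from one seed to the next. Without PYTHONHASHSEED set, CPython picks a fresh hash seed for each process, which would match the symptom of the order changing when the same script is run twice.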

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 13 (11 by maintainers)

Top GitHub Comments

1 reaction
john-science commented, Jan 26, 2022

Okay, so this is kind of our problem.

Essentially, Yamlize is really good about deterministically ordering elements before they get written out to the YAML file.

But we go out of our way to edit the items that go into the YAML file on the fly, after Yamlize does all of its setup and organization. For instance, in all the methods called addDegreeOfFreedom() in suiteBuilder.py:

https://github.com/terrapower/armi/blob/49ae3fd7b1a2934dbf0f642d9429bb4b1c75ab99/armi/cases/suiteBuilder.py#L274-L277

I can say that I have made a TON of YAML files with our code while testing this bug, and they are very largely deterministically ordered. You have to get 4 levels down into the data hierarchy inside them to start seeing non-deterministic ordering.

I am looking to see if there is a low-time-cost way of solving this.
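
One low-cost approach that is sometimes used for this kind of problem is to normalize the data just before it is written, so the output no longer depends on the order in which items were added or edited. The sketch below is only an illustration under stated assumptions: `canonicalize` is a hypothetical helper, not an ARMI or yamlize function; it assumes mapping keys and set members are mutually comparable (for example, all strings); and it uses the standard-library `json` module as a stand-in for a YAML writer so the example stays self-contained.

```python
import json


def canonicalize(value):
    """Return a copy of value with a deterministic ordering at every level.

    Assumes dict keys and set members are mutually comparable (e.g. all str).
    """
    if isinstance(value, dict):
        # Rebuild the dict with keys in sorted order, recursing into values.
        return {key: canonicalize(value[key]) for key in sorted(value)}
    if isinstance(value, (set, frozenset)):
        # Sets have no stable order across runs; emit a sorted list instead.
        return sorted(canonicalize(item) for item in value)
    if isinstance(value, (list, tuple)):
        # Sequences keep their order; only their contents are normalized.
        return [canonicalize(item) for item in value]
    return value


if __name__ == "__main__":
    # Two hypothetical components holding the same data, built in different orders.
    first = {"fuel": {"material": "UO2", "dimensions": {"od": 1.0, "id": 0.0}}}
    second = {"fuel": {"dimensions": {"id": 0.0, "od": 1.0}, "material": "UO2"}}

    text_first = json.dumps(canonicalize(first), indent=2)
    text_second = json.dumps(canonicalize(second), indent=2)
    print(text_first)
    print("identical output:", text_first == text_second)
```

Sorting keys alphabetically trades away any intentional, human-friendly ordering (such as listing material before dimensions), so a real fix might sort against a fixed list of known attribute names instead.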

0 reactions
john-science commented, Feb 3, 2022

It’s an interesting point.

The major reason I discounted this idea to begin with is that several downstream projects (that import ARMI) don’t want PYTHONHASHSEED to be set. For instance, if you are running Monte Carlo simulations, you probably want the full power of your pseudo-random-number generators.

Perhaps setting the seed in ARMI wouldn’t affect downstream projects. Or perhaps we could make it so. But:

  1. I would need proof we can make that safe.
  2. For those downstream projects that don’t want the seed set, the YAMLs would be non-deterministic again.
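
One way to start gathering evidence for point 1 is to check what PYTHONHASHSEED actually changes in a fresh interpreter. The sketch below is a small experiment, not a proof of safety: `probe` is a hypothetical helper, and it only looks at CPython's built-in `hash()` and the standard-library `random` module. It says nothing about how a downstream project seeds its own generators.

```python
import os
import subprocess
import sys

# Child program: print a string hash and one draw from the default generator.
CHILD = "import random; print(hash('material')); print(random.random())"


def probe(seed):
    """Run CHILD in a fresh interpreter with PYTHONHASHSEED set to seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    hash_line, random_line = result.stdout.splitlines()
    return int(hash_line), float(random_line)


if __name__ == "__main__":
    first_hash, first_draw = probe(0)
    second_hash, second_draw = probe(0)
    print("string hash repeats:", first_hash == second_hash)
    print("random draw repeats:", first_draw == second_draw)
```

With the hash seed fixed, the string hash repeats across processes, while the default `random` generator is still seeded from operating-system entropy and produces a different draw each run. Code that derives its own seeds from `hash()` of strings would be affected, so each downstream project would still need to be checked individually.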