Unsafe non-atomic config file creation.

https://github.com/emissary-ingress/emissary/blob/b2b54f10ee38ff8068454b91b2676d762b6ed570/python/ambassador_cli/ambassador.py#L503 generates envoy.json (or whatever path is given) non-atomically.

As a result, Envoy can read an empty or partially written config file. It then drops all incoming traffic as having no upstream for a few ms before recovering.

(In case the Python code path isn’t in use anymore, this is the error we were debugging:)

time="2022-09-05 17:52:03.3138" level=warning msg="/ambassador/envoy/envoy.json: proto: syntax error (line 1:1): unexpected token " func=github.com/emissary-ingress/emissary/v3/pkg/ambex.update file="/go/pkg/ambex/main.go:393" CMD=entrypoint PID=1 THREAD=/ambex/main-loop
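
To illustrate how a reader could catch an empty file, here is a minimal sketch. It assumes the generator rewrites the file in place with a truncating open (mode "w"), which the thread suspects but does not confirm. The function name sizes_seen_during_write is hypothetical and is not part of Emissary.

```python
import os


def sizes_seen_during_write(path, payload):
    """Rewrite `path` in place and record the sizes a concurrent reader could observe.

    Hypothetical helper for illustration only; it is not Emissary code.
    """
    sizes = [os.path.getsize(path)]  # size of the old config
    with open(path, "w") as f:  # mode "w" truncates the file as soon as it is opened
        sizes.append(os.path.getsize(path))  # 0: a reader arriving now sees an empty file
        f.write(payload)
    sizes.append(os.path.getsize(path))  # size of the new config after close
    return sizes
```

A reader that opens the file between the truncation and the end of the write sees either nothing or a prefix of the new contents, which would be consistent with a parse error at line 1:1.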

Issue Analytics

  • State: open
  • Created: a year ago
  • Reactions: 1
  • Comments: 6

Top GitHub Comments

rbtcollins commented, Sep 6, 2022 (1 reaction)

Some thoughts on solutions.

If the filesystem is POSIX / we’re on the top layer, we can write to {output_json_path}.tmp, then rename it to {output_json_path}.
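
The write-then-rename idea can be sketched as follows. This is a minimal sketch and not the project's actual fix: write_atomically is a hypothetical name, and it assumes the temp file lives in the same directory (and so on the same filesystem) as the target, which is what allows os.replace to swap it in atomically on POSIX.

```python
import os


def write_atomically(output_json_path, data):
    """Write `data` to a sibling .tmp file, then rename it over the target.

    Hypothetical helper; assumes a POSIX filesystem where rename is atomic.
    """
    tmp_path = output_json_path + ".tmp"
    try:
        with open(tmp_path, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        # Readers see either the complete old file or the complete new one.
        os.replace(tmp_path, output_json_path)
    except BaseException:
        # Do not leave a half-written temp file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Anything that watches the directory would still see the .tmp file appear briefly, so a watcher may need to ignore that naming pattern.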

If it is not POSIX, then we can take out a lock that excludes readers (an exclusive lock, since a shared read lock would not keep other readers out), write the new contents, then release the lock.
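
A locking variant might look like the sketch below. It uses fcntl.flock, which is Unix-only and advisory: it only helps if every reader also takes the lock, and nothing in this thread confirms that Envoy or ambex does so. The names write_locked and read_locked are hypothetical.

```python
import fcntl
import os


def write_locked(path, data):
    """Rewrite `path` in place while holding an exclusive advisory lock."""
    # Open without truncating: the truncation must happen only after the lock is held.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until no other lock holder remains
        try:
            f.seek(0)
            f.truncate()  # cut the file at the current position, i.e. offset 0
            f.write(data)
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def read_locked(path):
    """Read `path` while holding a shared advisory lock."""
    with open(path) as f:
        fcntl.flock(f, fcntl.LOCK_SH)  # waits while a writer holds LOCK_EX
        try:
            return f.read()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Opening the file with mode "w" before taking the lock would reintroduce the problem, because the truncation would happen outside the lock.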

rbtcollins commented, Sep 9, 2022 (0 reactions)

We’re not 100% sure that this analysis is correct. We have several smoking guns: the log message, observed 0-length envoy.json files (using a high-frequency watch, we see them go from size X, to 0, to X again), and the log-but-continue Go code.

However, it’s not clear what the constraints are: is a temp file that gets renamed atomically acceptable (combined with an ignore filter on the naming pattern for temp files), or must the write be done in place? We’re also not sure that the identified Python code is what’s actually doing the truncation: there are multiple unsafe-if-observed places in the code.

So, we’d love it if you fixed this, but we’re not really positioned to say more than we have so far. Sorry!
