CVE-2019-9843: The XML parser isn't respecting resolveExternalEntities as false
Original Comment: https://github.com/diffplug/spotless/issues/308#issuecomment-463297964
12:48:55.013 [DEBUG] [sun.net.www.protocol.http.HttpURLConnection] Redirected from http://java.sun.com/xml/ns/javaee/javaee_5.xsd to http://www.oracle.com/webfolder/technetwork/jsc/xml/ns/javaee/javaee_5.xsd
If we are seeing HTTP GET requests inside the XML parser, that means the parser is vulnerable to XXE.
We need to fix this so that the Spotless XML formatter does not make external entity requests.
We can’t have our linting infrastructure making web requests, especially web requests over HTTP, as those can be maliciously intercepted by a MITM.
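As a rough illustration of what "not making external entity requests" means for a plain JAXP parser, here is a minimal sketch. It assumes the JDK's built-in parser and follows the settings commonly recommended for XXE hardening; it is not the Eclipse WTP parser that Spotless wraps, and the class and method names (`HardenedXmlParser`, `newHardenedFactory`, `parses`) are hypothetical.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Spotless or Eclipse WTP.
public class HardenedXmlParser {

    /** Builds a factory that rejects DOCTYPEs and forbids external DTD/schema access. */
    public static DocumentBuilderFactory newHardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Refuse any document that carries a DOCTYPE declaration at all.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defence in depth: do not resolve external general or parameter entities.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        // An empty protocol list means no protocol (http, https, file, ...) is allowed.
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    /** Returns true if the hardened parser accepts the document, false if it rejects it. */
    public static boolean parses(String xml) {
        try {
            DocumentBuilder builder = newHardenedFactory().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false; // rejected (for example, a DOCTYPE was present) or malformed
        }
    }

    public static void main(String[] args) {
        String xxe = "<!DOCTYPE r [<!ENTITY x SYSTEM \"file:///etc/hostname\">]><r>&x;</r>";
        System.out.println("plain document accepted: " + parses("<root/>"));
        System.out.println("DOCTYPE document accepted: " + parses(xxe));
    }
}
```

Rejecting DOCTYPEs outright is the bluntest option; a formatter that must accept documents with a DOCTYPE would need to rely on the entity and access restrictions alone.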
Here’s an example where this has been a serious problem in the past.
https://research.checkpoint.com/parsedroid-targeting-android-development-research-community/
CC: @nedtwigg
This is a security vulnerability in spotless and should be treated as such.
Top GitHub Comments
Brief explanation why I don’t consider this urgent:
About the code:
It would be interesting to know what the Xerces/JRE defaults are… Anyhow, I know that the initial URI lookup for the XML source model is not done by Xerces, but by the XML catalogue/cache manager. And I am not sure whether Xerces plays a role for the XML model setup.
@JLLeitschuh consider this as an introduction. Let me know if you want to take over…
I am afraid we have quite a different view point. To clarify mine:
My background
I have no idea how DTD ENTITY statements are treated within Eclipse; I have neither tested nor seen it. I only work with schemas. The URI lookup for DTDs and XSDs in Eclipse is the same. I know that part of the Eclipse code.
My initial intention for #308:
My spring4-mvc-gradle-xml-hello-world example in #308, which started this entire discussion, was meant to show that missing catalogue entries may hamper your build.
To avoid further misunderstandings: I never said that a missing catalogue entry (for DTD or XSD) caused #308. Since I was never able to reproduce #308 I waited for further feedback.
I realized that many users may not expect a schema URI resolution from an XML formatter. Hence I provided the example to see whether there is a URI resolution in the projects of the people who reported #308, and whether network timeouts give the user the impression that the build hangs “forever”.
Why I don’t feel strongly about resolveExternalEntities:
The Spring schema location lookup I demonstrated in my example from #308 is due to the following lines in the XML under test:
The term for the URI used for schema location lookup is location hint. This name (hint) should be taken literally. It is a hint where the schema may or may not be. It is the last resort you try if you can’t find it. Location hints in old legacy code have a bad habit of becoming invalid, since the company behind it gets sold, re-branded, …
Schema location hints should never be misused by a project for downloads, neither during its build nor after installation on the customer side. You must ensure that you provide the schema yourself. If you want to know which catalogues are on your Linux OS, have a look at /etc/xml/catalog. The JRE comes with its own set of catalogues. Java applications can extend these catalogues. Eclipse also provides an additional catalogue, and all plugins can add more entries. The user can also add project-specific catalogues. Spotless WTP allows you to insert an additional userCatalog.
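The "look in the catalogue, never download" idea can be sketched with a SAX `EntityResolver` that serves only local copies. This is a simplified stand-in under stated assumptions, not the Eclipse WTP URI resolver or the Spotless `userCatalog` mechanism: the class name `LocalOnlyResolver` and its in-memory map are hypothetical, and a SAX `EntityResolver` covers DTD and entity lookups rather than XSD location hints.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical catalogue-like resolver: known system IDs map to local content,
// everything else resolves to an empty source instead of a network fetch.
public class LocalOnlyResolver implements EntityResolver {

    private final Map<String, String> localCopies;

    public LocalOnlyResolver(Map<String, String> localCopies) {
        this.localCopies = new HashMap<>(localCopies);
    }

    /** Returns the local copy for a system ID, or an empty string if there is none. */
    public String lookup(String systemId) {
        String content = localCopies.get(systemId);
        return content != null ? content : "";
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // Returning a non-null InputSource stops the parser from opening the URI itself.
        InputSource source = new InputSource(new StringReader(lookup(systemId)));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }
}
```

A resolver like this would be installed with `DocumentBuilder.setEntityResolver(...)` or `XMLReader.setEntityResolver(...)`. Treating a missing entry as empty content, rather than as an error, is a design choice that keeps the build offline at the cost of silently ignoring hints that have no local copy.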
So for me, the HTTP request which started this discussion was intended to demonstrate a broken project and to figure out whether this could really cause #308.
Proposal:
With respect to the many misunderstandings in this discussion, I think it is better to switch off the lookup by default.
To stress this point: I have no evidence, and do not think, that the Eclipse XMLSourceParser is vulnerable to the XXE scenario described in this issue, where a local file URI is used as an external entity. This proposal has nothing to do with security. But of course I encourage everybody to investigate further. Please have a look at the links I provided above. Just don’t report a potential security vulnerability to Eclipse and reference this proposal.
I once bypassed the Eclipse WTP URI resolver when developing the initial Spotless WTP wrapper without the underlying OSGi framework. I wanted to avoid such a big deviation from the original Eclipse behaviour, and therefore wanted more proof that it caused #308.
But I can understand that most users, who just want their XML formatted, do not expect the schemas to be parsed and do not want or need to care about details like catalogues. I admit that my WTP formatter is unnecessarily complex in that respect for 90% of the use cases. Eclipse continues its formatting regardless of whether the lookup of the URI was successful. Since the formatting may therefore be altered (for example once with and once without whitespace facets), the results differ. Hence the build results may become unstable due to small details in the XML structure model that most users don’t care about in the first place. This is no problem for the Eclipse IDE, but it is for build tools.