Unnecessary double encoding of url


The URL http://test.com/& is not handled correctly as of jsoup 1.10.2 (possibly already in 1.10.1).

Jsoup.connect("http://test.com/" + URLEncoder.encode("&", "UTF-8")).get();

The URL passed to Jsoup is http://test.com/%26, because the & has already been URL-encoded. In 1.9.2 this works fine: the encodeUrl(String url) method in the HttpConnection class only replaces spaces, and this URL contains none, so it is left unchanged. In 1.10.2 the same URL is encoded again in encodeUrl(), which produces http://test.com/%2526 (the percent sign of the already-encoded URL is itself encoded as %25).
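The effect described above can be reproduced with the JDK alone. The following is a minimal sketch, not jsoup code: the class name DoubleEncodingDemo and the helper encodeSegment are made up for illustration, and URLEncoder stands in for whatever encoding step runs a second time.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class; it does not use or reproduce jsoup internals.
class DoubleEncodingDemo {

    // Percent-encodes one value and wraps the checked exception so callers stay simple.
    static String encodeSegment(String segment) {
        try {
            return URLEncoder.encode(segment, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String once = encodeSegment("&");   // "%26"
        String twice = encodeSegment(once); // "%2526": the '%' itself became "%25"
        System.out.println("http://test.com/" + once);
        System.out.println("http://test.com/" + twice);
    }
}
```

A server receiving the second URL decodes %2526 to the literal text %26 rather than to &, so the request addresses a different resource.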

A workaround for this issue is to downgrade to 1.9.2, where the encodeUrl method was implemented differently (see below).

1.10.2:

private static String encodeUrl(String url) {
    try {
        URL u = new URL(url);
        return encodeUrl(u).toExternalForm();
    } catch (Exception e) {
        return url;
    }
}

1.9.2:

private static String encodeUrl(String url) {
	if(url == null)
		return null;
	return url.replaceAll(" ", "%20");
}
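The encodeUrl(URL) overload that the 1.10.2 version delegates to is not shown in this issue, so what follows is an assumption about how this kind of re-encoding can arise in plain java.net code, not a reproduction of the jsoup source. The multi-argument java.net.URI constructors always quote the percent character, while the single-string constructor keeps existing escapes. The class UriQuotingDemo and both helper names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; it illustrates documented java.net.URI behaviour only.
class UriQuotingDemo {

    // Rebuilds the URI from its parts; the multi-argument constructors always quote '%'.
    static String fromComponents(String scheme, String host, String path) {
        try {
            return new URI(scheme, null, host, -1, path, null, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Parses the whole string; existing escapes such as %26 are kept as they are.
    static String fromString(String url) {
        try {
            return new URI(url).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromComponents("http", "test.com", "/%26")); // http://test.com/%2526
        System.out.println(fromString("http://test.com/%26"));          // http://test.com/%26
    }
}
```

If an encoder takes a URL apart and rebuilds it through the component constructor, an already-escaped path is escaped a second time. Parsing the full string avoids that, but it rejects raw spaces, which would then need to be replaced with %20 first, as the 1.9.2 method does.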

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Reactions: 1
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

2 reactions
jhy commented, Jun 11, 2017

jsoup 1.10.3 is out now: https://jsoup.org/news/release-1.10.3

0 reactions
jhy commented, Jun 10, 2017

Sorry about the issues here. This is fixed in 1.10.3 (upcoming). I fixed it back in 56a728d482ce0c88cd9513eca1e7c7585f0718f9

Will close when 1.10.3 is released.
