HtmlUnit returns UnexpectedPage when Content-Type header is missing

We have a couple of system tests using Selenium/HtmlUnit in a legacy system running Java 7. Due to version restrictions, these tests use HtmlUnit 2.18 with Selenium 2.52.

We are now migrating to Java 11, so we upgraded to HtmlUnit 2.50 and Selenium 3.141.59, and one of the tests started failing.

The problem seems to be that the response from one external service we use doesn’t contain the Content-Type header. Here is the log with the response headers:

DEBUG [pool-1-thread-1] (Wire.java 73) - http-outgoing-1 << "HTTP/1.1 200 OK[\r][\n]"
DEBUG [pool-1-thread-1] (Wire.java 73) - http-outgoing-1 << "Date: Fri, 11 Jun 2021 12:46:05 GMT[\r][\n]"
DEBUG [pool-1-thread-1] (Wire.java 73) - http-outgoing-1 << "Server: Apache[\r][\n]"
DEBUG [pool-1-thread-1] (Wire.java 73) - http-outgoing-1 << "X-XSS-Protection: 1; mode=block[\r][\n]"
DEBUG [pool-1-thread-1] (Wire.java 73) - http-outgoing-1 << "X-FRAME-OPTIONS: ALLOW-FROM http://intranet.pre.cdti.es[\r][\n]"
DEBUG [pool-1-thread-1] (Wire.java 73) - http-outgoing-1 << "Keep-Alive: timeout=3600, max=800[\r][\n]"
DEBUG [pool-1-thread-1] (Wire.java 73) - http-outgoing-1 << "Connection: Keep-Alive[\r][\n]"
DEBUG [pool-1-thread-1] (Wire.java 73) - http-outgoing-1 << "Transfer-Encoding: chunked[\r][\n]"
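A quick way to confirm that the header is really absent is to scan the wire log itself. The following is only a minimal sketch, assuming the log format shown above (incoming lines marked with `<<`, the header block ending with an empty `[\r][\n]` line); the class and method names are hypothetical and not part of HtmlUnit or HttpClient.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects response header names from HttpClient wire-log lines.
final class WireLogHeaders {
    // Matches the header name in lines such as: ... << "Server: Apache[\r][\n]"
    // The status line ("HTTP/1.1 200 OK") does not match because of the '/'.
    private static final Pattern HEADER = Pattern.compile("<< \"([A-Za-z0-9-]+): ");

    // The empty line that terminates the header block, as it appears in the log.
    private static final String END_OF_HEADERS = "<< \"[\\r][\\n]\"";

    // Returns the lower-cased header names of the first response in the log.
    static Set<String> headerNames(List<String> logLines) {
        Set<String> names = new TreeSet<>();
        for (String line : logLines) {
            if (line.contains(END_OF_HEADERS)) {
                break; // body starts here; stop so body text is not mistaken for headers
            }
            Matcher m = HEADER.matcher(line);
            if (m.find()) {
                names.add(m.group(1).toLowerCase(Locale.ROOT));
            }
        }
        return names;
    }

    // Header names are case-insensitive, so compare in lower case.
    static boolean hasContentType(List<String> logLines) {
        return headerNames(logLines).contains("content-type");
    }
}
```

Feeding the log lines above into `WireLogHeaders.hasContentType(...)` should report that no Content-Type header was received. The sketch only looks at the first response in the log.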

This leads to HtmlUnit creating an UnexpectedPage and unloading the DOM:

DEBUG [pool-1-thread-1] (HtmlPage.java 1254) - Firing Event unload (Current Target: com.gargoylesoftware.htmlunit.javascript.host.html.HTMLDocument@120eb694);
DEBUG [pool-1-thread-1] (WebWindowImpl.java 131) - setEnclosedPage: com.gargoylesoftware.htmlunit.UnexpectedPage@2bf01f1
DEBUG [pool-1-thread-1] (WebWindowImpl.java 213) - destroyChildren

It seems that HtmlUnit 2.18 assumes a default of text/html when Content-Type is missing. The browsers I tested (Firefox, Chrome) behave the same way.

Is this intended behaviour? Is there any workaround to assume text/html when Content-Type is missing?
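One possible shape for such a workaround is to patch the response headers before the page type is decided. The sketch below shows only the header-defaulting step in plain Java, with no HtmlUnit dependency; `ContentTypeDefaults` and `withDefaultContentType` are hypothetical names. In HtmlUnit this logic would presumably be hooked in through a custom `WebConnection` (for example a `WebConnectionWrapper` subclass); that wiring is an assumption, is not shown here, and should be checked against the HtmlUnit version in use.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: fills in a default Content-Type when a response has none.
final class ContentTypeDefaults {
    // Assumed default, matching what the browsers mentioned above appear to do.
    static final String DEFAULT_CONTENT_TYPE = "text/html";

    // Returns a copy of the headers with Content-Type set if it was missing or blank.
    // The copy uses case-insensitive keys because HTTP header names are case-insensitive.
    static Map<String, String> withDefaultContentType(Map<String, String> headers) {
        Map<String, String> result = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        result.putAll(headers);
        String current = result.get("Content-Type");
        if (current == null || current.trim().isEmpty()) {
            result.put("Content-Type", DEFAULT_CONTENT_TYPE);
        }
        return result;
    }
}
```

An existing Content-Type value is left untouched, so the helper only changes responses like the one in the log above.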

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

1 reaction
rbri commented, Jun 14, 2021

As usual, I will announce new snapshots and releases via Twitter.

1 reaction
rbri commented, Jun 14, 2021

I have written a test case for this. Again, many thanks for pointing this out.
