Render HTML from String
Is there a convenient way to pass an HTML string to HtmlUnit and get back the rendered page source? I am currently working around this by passing the HTML string as a data URI in place of a URL, which works but is not really convenient.
Here is a small example test that illustrates the described workaround:
@Test
fun `can render html string`() {
val someHtmlIncludingEs6Script = """
<!DOCTYPE html>
<html lang="en">
<head>
<title>i'm the title</title>
</head>
<body>
i'm the body
<h1>i'm the headline</h1>
<p>i'm a paragraph</p>
<p>i'm a second paragraph</p>
</body>
<script>
const getNodesOf = (selector) => document.querySelectorAll(selector);
getNodesOf("p").forEach(p => p.innerHTML = "<span>dynamically added</span>")
</script>
</html>
""".trimIndent()
val dataUriMimeType = "data:text/html;charset=UTF-8;"
val base64encoded = Base64.getEncoder().encodeToString(someHtmlIncludingEs6Script.toByteArray())
val dataUri = "${dataUriMimeType}base64,$base64encoded"
val client = WebClient(BrowserVersion.BEST_SUPPORTED)
val page: Page = client.getPage(dataUri)
val httpResponse = page.webResponse
val document = when {
page.isHtmlPage -> (page as HtmlPage).asXml()
else -> httpResponse.contentAsString
}
expectThat(document).isEqualTo("""
|<?xml version="1.0" encoding="UTF-8"?>
|<html lang="en">
| <head>
| <title>
| i'm the title
| </title>
| </head>
| <body>
|
| i'm the body
|
| <h1>
| i'm the headline
| </h1>
| <p>
| <span>
| dynamically added
| </span>
| </p>
| <p>
| <span>
| dynamically added
| </span>
| </p>
| <script>
|//<![CDATA[
|
| const getNodesOf = (selector) => document.querySelectorAll(selector);
| getNodesOf("p").forEach(p => p.innerHTML = "<span>dynamically added</span>")
|
|//]]>
| </script>
| </body>
|</html>
|
""".trimMargin())
}
As you can see, the text of all p tags has been overwritten by JavaScript. Great, exactly what I want.
❓ So what's my issue with this? A URL has a maximum length, and a more complex HTML document converted to a base64 data URI can easily exceed this limit, so this solution only works for “simple” websites.
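To make the size concern concrete: base64 encodes every 3 bytes as 4 characters, so the data URI is roughly a third longer than the HTML it carries, plus the fixed prefix. A minimal sketch of the encoding step from the test above, where `toDataUri` is a hypothetical helper name and not part of HtmlUnit:

```kotlin
import java.util.Base64

// Hypothetical helper (not an HtmlUnit API): wraps an HTML string in a
// base64 data URI, the same way the test above builds its URL.
fun toDataUri(html: String): String {
    val encoded = Base64.getEncoder().encodeToString(html.toByteArray(Charsets.UTF_8))
    return "data:text/html;charset=UTF-8;base64,$encoded"
}

// Length of the resulting URI for a given HTML string, for comparison
// against whatever URL length limit applies in your setup.
fun dataUriLength(html: String): Int = toDataUri(html).length
```

For example, comparing `html.length` with `dataUriLength(html)` for a realistic page shows how quickly the URI grows; which limit it eventually hits depends on the environment.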
💡 Would you mind adding a feature that allows passing an HTML string to HtmlUnit and getting it rendered? Maybe it is already there and I just didn't find it?
Issue Analytics
- Created 2 years ago
- Comments: 8 (4 by maintainers)
Top GitHub Comments
Maybe these methods (new in 2.48.0) are your friends:
WebClient.loadHtmlCodeIntoCurrentWindow(String)
WebClient.loadXHtmlCodeIntoCurrentWindow(String)
Thanks, will close this.
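For readers landing here, a minimal sketch of how the suggested method might replace the data URI workaround. It assumes HtmlUnit 2.48.0 or later under the `com.gargoylesoftware.htmlunit` package; `renderHtml` is a hypothetical helper name, and whether inline scripts run exactly as for a page loaded by URL should be verified against your version.

```kotlin
import com.gargoylesoftware.htmlunit.BrowserVersion
import com.gargoylesoftware.htmlunit.WebClient

// Hypothetical helper: renders an HTML string and returns the resulting
// DOM as XML, without encoding the markup into a URL first.
fun renderHtml(html: String): String {
    val client = WebClient(BrowserVersion.BEST_SUPPORTED)
    try {
        // Method named in the maintainer's comment; loads the string
        // into the client's current window and returns the page.
        val page = client.loadHtmlCodeIntoCurrentWindow(html)
        return page.asXml()
    } finally {
        // Release the client's windows and background threads.
        client.close()
    }
}
```

With this, the test above could call `renderHtml(someHtmlIncludingEs6Script)` instead of building `dataUri`, which sidesteps the URL length concern entirely.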