[Sitemaps] Parser assumes file encoding is the same as the one used by default on JVM

SiteMapParser creates Readers from the byte[] content by assuming that the file
encoding is the same as the one used by default on the JVM.

We should use Tika's charset detector (or ICU4J directly) to detect the charset 
used prior to creating the Readers.

Will attach a test class shortly.
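
To make the problem and the proposed direction concrete, here is a minimal sketch that picks a charset before any Reader is created. It is not Tika's or ICU4J's detector: it only checks for a byte order mark and the XML declaration, and it falls back to UTF-8 as an assumption. The class and method names (SitemapCharsetSketch, detectCharset, bomLength, openReader, decode) are hypothetical and are not part of SiteMapParser.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: chooses a charset from the bytes themselves
// instead of relying on Charset.defaultCharset().
final class SitemapCharsetSketch {

    // Captures the encoding name from an XML declaration such as
    // <?xml version="1.0" encoding="ISO-8859-1"?>
    private static final Pattern XML_ENCODING = Pattern.compile(
            "<\\?xml[^>]*encoding\\s*=\\s*[\"']([A-Za-z0-9][A-Za-z0-9._-]*)[\"']");

    // Returns the number of leading bytes taken up by a UTF-8 or UTF-16 BOM.
    static int bomLength(byte[] content) {
        if (startsWith(content, 0xEF, 0xBB, 0xBF)) return 3;
        if (startsWith(content, 0xFE, 0xFF) || startsWith(content, 0xFF, 0xFE)) return 2;
        return 0;
    }

    static Charset detectCharset(byte[] content) {
        // 1. A byte order mark is the strongest hint.
        if (startsWith(content, 0xEF, 0xBB, 0xBF)) return StandardCharsets.UTF_8;
        if (startsWith(content, 0xFE, 0xFF)) return StandardCharsets.UTF_16BE;
        if (startsWith(content, 0xFF, 0xFE)) return StandardCharsets.UTF_16LE;

        // 2. Otherwise look at the XML declaration. Decoding the first bytes as
        //    ISO-8859-1 is lossless and works for any ASCII-compatible encoding.
        int len = Math.min(content.length, 200);
        String head = new String(content, 0, len, StandardCharsets.ISO_8859_1);
        Matcher m = XML_ENCODING.matcher(head);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }

        // 3. Assumed fallback: UTF-8, never the JVM default.
        return StandardCharsets.UTF_8;
    }

    // Creates a Reader with an explicit charset, skipping any BOM bytes.
    static Reader openReader(byte[] content) {
        int skip = bomLength(content);
        return new InputStreamReader(
                new ByteArrayInputStream(content, skip, content.length - skip),
                detectCharset(content));
    }

    // Convenience for inspecting the decoded text without a Reader.
    static String decode(byte[] content) {
        int skip = bomLength(content);
        return new String(content, skip, content.length - skip, detectCharset(content));
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<loc>http://example.com/caf\u00e9</loc>";
        byte[] content = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("Charset chosen from content: " + detectCharset(content));
        System.out.println("Round trip intact: " + xml.equals(decode(content)));
    }
}
```

This sketch does not handle UTF-16 content without a BOM, UTF-32, or files whose declaration is wrong or missing. Those are the cases where a statistical detector such as the ones named above would be expected to do better.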

Original issue reported by digitalpebble on 19 Jan 2015 at 4:51

Issue Analytics

  • State: closed
  • Created 8 years ago
  • Comments: 13 (5 by maintainers)

Top GitHub Comments

jnioche commented, Mar 20, 2017

@kkrugler this has been covered elsewhere indeed.

kkrugler commented, Feb 2, 2017

@jnioche - since you originally opened this issue, can you confirm what Sebastian is seeing and then close it if this has been subsumed by #67 and #137? Thanks!


Top Results From Across the Web

What is the default encoding of the JVM? - java - Stack Overflow
UTF-16 is how text is represented internally in the JVM. The default encoding determines how the JVM interprets bytes read from files (using...
How to set Encoding in Java | Edureka Community
When you are encoding or decoding, you can query the file.encoding property or Charset.defaultCharset() to find the current default encoding, ...
Resolved Problems - Oracle Help Center
1 specification, which states that if a request/response contains both a Content-Length header as well as a Transfer-Encoding: Chunked header, the Content- ...
STR04-J. Use compatible character encodings when ...
Compatible encodings must be used when characters are output as an array of bytes then input by another JVM and subsequently converted back...
NBT format - Minecraft Wiki - Fandom
The Named Binary Tag (NBT) is a tree data structure used by Minecraft in many save files to store arbitrary data. The format...
