Add a ParseDocument(Stream stream, Encoding encoding) overload to specify an encoding

Is there a way to load an HtmlDocument from a byte array or a stream while specifying which character encoding to use? We are currently loading the HTML document this way:

Encoding encoding = ...
using (MemoryStream ms = new MemoryStream(bytes))
{
	this.document = parser.ParseDocument(ms); // I cannot specify the given encoding
}

If I understood correctly, AngleSharp tries to detect the proper encoding. However, the encoding can be specified in various ways (HTTP headers, Byte Order Mark, meta content-type and meta charset), and the parser has no way of knowing at least the HTTP header case when it is given only a stream.

We are trying to move away from HtmlAgilityPack, whose equivalent class has an overload to specify the encoding to use. In our program, the byte array can be read from the web or loaded from a local DB. In most cases we already have the encoding, and we also want to let users view HTML content with a different encoding.

Does AngleSharp already provide a way to do it?

If not, can I request an implementation? Could you add an overload, HtmlParser.ParseDocument(Stream stream, Encoding encoding), to specify the encoding?

If the current method tries to detect the encoding, such an overload would also mean a performance improvement when the encoding is already known.

Thank you
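
Until such an overload exists, one possible workaround is to decode the bytes yourself and pass the resulting text to the string overload of ParseDocument. The following is a minimal sketch, not a confirmed recommendation from the AngleSharp maintainers. It assumes AngleSharp 0.10 or later (where HtmlParser lives in AngleSharp.Html.Parser), the helper name ParseWithEncoding is hypothetical, and the sample markup and the ISO-8859-1 encoding are chosen only for illustration.

```csharp
using System;
using System.Text;
using AngleSharp.Html.Dom;
using AngleSharp.Html.Parser;

public static class KnownEncodingParsing
{
    // Hypothetical helper (not part of AngleSharp): decodes the bytes with the
    // caller-supplied encoding, then hands the decoded text to the parser.
    public static IHtmlDocument ParseWithEncoding(HtmlParser parser, byte[] bytes, Encoding encoding)
    {
        string html = encoding.GetString(bytes);
        return parser.ParseDocument(html);
    }

    public static void Main()
    {
        var parser = new HtmlParser();

        // Illustrative input: in the scenario above, the bytes would come from
        // the web or a local DB and the encoding from an HTTP header or the user.
        Encoding encoding = Encoding.GetEncoding("iso-8859-1");
        byte[] bytes = encoding.GetBytes("<html><body><p>café</p></body></html>");

        IHtmlDocument document = ParseWithEncoding(parser, bytes, encoding);
        Console.WriteLine(document.QuerySelector("p").TextContent);
    }
}
```

The trade-off is that the whole payload is decoded into a string before parsing, which costs extra memory for large documents. Note also that Encoding.GetString does not strip a Byte Order Mark, so a leading BOM in the bytes would reach the parser as a U+FEFF character.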