Default Accepts header
New Feature Proposal
Description
Add a default Accept: text/html header to the HttpRequester created by Configuration.Default.WithDefaultLoader.
Background
I’m very new to AngleSharp, so please be kind if this is common sense to everyone else in the community. I came across it this morning after concluding that the regex I had been using until now was too brittle. I tried examples found online, such as:
var context = BrowsingContext.New(Configuration.Default.WithDefaultLoader());
var document = context.OpenAsync(url).Result;
var divs = document.QuerySelectorAll("div");
foreach (var div in divs)
{
// do something
}
To my surprise, this code would find no divs at all. Turned out that all HTML content was wrapped in another HTML document and a pre tag, which I traced to a commit related to #331. (And sure enough execution was breakpointing in HtmlDocument.LoadTextAsync.)
Fiddler showed that the response was indeed being sent with Content-Type: text/plain;charset=UTF-8 instead of the expected Content-Type: text/html;charset=UTF-8. Adding code adapted from #367:
var requester = new HttpRequester();
requester.Headers["Accept"] = "text/html";
var context = BrowsingContext.New(Configuration.Default.WithDefaultLoader(requesters: new[] { requester }));
…solved the problem. Not sure what web server is being used on the other side (trying to load this url), but I suppose loading a web page and doing some manipulation might be a common scenario.
Because at least some web servers will return a plain-text (rather than HTML) response if given no Accept header, I would like to propose either adding a default Accept header or mentioning this in the docs / quick guides, to save newcomers an hour of figuring out what is happening and why.
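To check whether a given server behaves this way without Fiddler, a minimal sketch along these lines may help. It uses only the .NET base class library (HttpClient), not AngleSharp, and compares the Content-Type returned with and without an Accept header. The class name AcceptHeaderCheck, the helper names and the example.com URL are hypothetical placeholders, not part of the original report; pass the real page URL as the first argument.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class AcceptHeaderCheck
{
    // Builds a GET request; adds an Accept header only when one is supplied.
    public static HttpRequestMessage BuildRequest(string url, string accept = null)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        if (accept != null)
        {
            request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue(accept));
        }
        return request;
    }

    // Sends the request and returns the media type the server reports,
    // or null when the response carries no Content-Type header.
    public static async Task<string> GetContentTypeAsync(HttpClient client, string url, string accept = null)
    {
        using (var request = BuildRequest(url, accept))
        using (var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead))
        {
            return response.Content.Headers.ContentType?.MediaType;
        }
    }

    public static async Task Main(string[] args)
    {
        // Hypothetical placeholder URL; replace with the page being scraped.
        var url = args.Length > 0 ? args[0] : "https://example.com/";
        using (var client = new HttpClient())
        {
            Console.WriteLine("No Accept header:  " + await GetContentTypeAsync(client, url));
            Console.WriteLine("Accept: text/html: " + await GetContentTypeAsync(client, url, "text/html"));
        }
    }
}
```

If the first line prints text/plain and the second prints text/html, the server is likely varying its response on the Accept header, which would match the symptom described above.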
Issue Analytics
- Created 3 years ago
- Comments:5 (2 by maintainers)

@FlorianRappl thank you for looking into this and incredibly quick turnaround! Should have a few minutes this evening to test the change, will report back.
Have just checked out the devel branch and can confirm it works as expected. Was a joy to see the new user-agent thrown in the mix too. Thanks for making this happen and fingers crossed for v0.14 going into prod soon!
EDIT: Wait a sec, v0.14 is already in prod. That makes my day even more betterer 😉