IHtmlDocument.Title results in StackOverflow
Bug Report
Prerequisites
- Can you reproduce the problem in a MWE?
- Are you running the latest version of AngleSharp? Yes, 0.14.0, also tried 1.0.0-alpha-844
- Did you check the FAQs to see if that helps you?
- Are you reporting to the correct repository? (there are multiple AngleSharp libraries, e.g., AngleSharp.Css for CSS support)
- Did you perform a search in the issues?
For more information, see the CONTRIBUTING guide.
Description
- Cannot get Title
Yes, I know the input is not HTML, but it should not result in a stack overflow.
Steps to Reproduce
using System;
using System.Net.Http;
using AngleSharp.Html.Dom;
using AngleSharp.Html.Parser;
HttpClient httpClient = new HttpClient();
string html = await httpClient.GetStringAsync("http://162.212.178.138:8080/18/glibc-2.14/localedata/charmaps/GBK");
HtmlParser htmlParser = new HtmlParser();
IHtmlDocument htmlDocument = await htmlParser.ParseDocumentAsync(html);
// Reading Title is what triggers the stack overflow.
Console.WriteLine($"htmlDocument.Title: {htmlDocument.Title}");
Expected behavior:
That it results in an empty string for the Title.
Actual behavior:
Stack overflow.
Repeat 16127 times:
--------------------------------
at AngleSharp.Dom.NodeExtensions.FindDescendant[[System.__Canon, System.Private.CoreLib, Version=5.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](AngleSharp.Dom.INode)
--------------------------------
at AngleSharp.Html.Dom.HtmlDocument.GetTitle()
at OpenDirectoryDownloader.DirectoryParser+<ParseHtml>d__2.MoveNext()
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[OpenDirectoryDownloader.DirectoryParser+<ParseHtml>d__2, OpenDirectoryDownloader, Version=1.9.3.8, Culture=neutral, PublicKeyToken=null]](<ParseHtml>d__2 ByRef)
at OpenDirectoryDownloader.DirectoryParser.ParseHtml(OpenDirectoryDownloader.Shared.Models.WebDirectory, System.String, System.Net.Http.HttpClient)
at OpenDirectoryDownloader.OpenDirectoryIndexer+<ProcessWebDirectoryAsync>d__56.MoveNext()
at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib, Version=5.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[OpenDirectoryDownloader.OpenDirectoryIndexer+<ProcessWebDirectoryAsync>d__56, OpenDirectoryDownloader, Version=1.9.3.8, Culture=neutral, PublicKeyToken=null]].MoveNext(System.Threading.Thread)
at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Runtime.CompilerServices.IAsyncStateMachineBox, Boolean)
at System.Threading.Tasks.Task.RunContinuations(System.Object)
Environment details:
- Windows 10 Pro 20H2
- .NET 5
Possible Solution
Maybe it’s Heisenberg again, just kidding 😂 (see #893). Sorry, I have no idea why it ends up in a stack overflow.
Top GitHub Comments
Just want to confirm this is fixed. 👍
Sorry for the late reply. I have now started looking into this.
This is indeed a bug in the DOM builder. For some reason it creates a cyclic structure. The issue is not confined to getting the title.
I’ll try to issue a fix this week. Thanks! 🍻
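The maintainer's comment attributes the overflow to a cyclic structure produced by the DOM builder, which would explain why a recursive descendant search never terminates. As a rough illustration only, the sketch below uses a hypothetical Node class (not an AngleSharp type) and a hypothetical CycleSafeSearch helper to show how an iterative search with a visited set terminates even when the graph contains a cycle. It is not the fix that was applied in AngleSharp, which addressed the builder itself.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a DOM node; this is NOT an AngleSharp type.
public sealed class Node
{
    public string Name { get; }
    public List<Node> Children { get; } = new List<Node>();
    public Node(string name) => Name = name;
}

// Hypothetical helper, for illustration only.
public static class CycleSafeSearch
{
    // Depth-first search using an explicit stack and a visited set, so a
    // cyclic structure ends the search instead of overflowing the call stack.
    public static Node FindDescendant(Node root, string name)
    {
        var visited = new HashSet<Node>();   // reference equality by default
        var pending = new Stack<Node>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            var current = pending.Pop();
            if (!visited.Add(current))
            {
                continue;                    // already seen: part of a cycle
            }

            if (!ReferenceEquals(current, root) && current.Name == name)
            {
                return current;
            }

            // Push children in reverse so they are visited in document order.
            for (var i = current.Children.Count - 1; i >= 0; i--)
            {
                pending.Push(current.Children[i]);
            }
        }

        return null;                         // no matching descendant
    }
}
```

With a deliberately cyclic graph, for example an "html" node whose "body" child lists "html" as its own child, FindDescendant(html, "title") returns null rather than recursing without end.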