Ability to get index of an element inside the source code

In HtmlAgilityPack it is possible to get the start index of an element in the source code. Although the value is stored in private fields, it is still usable.

I could not find anything that corresponds to this in AngleSharp. There is an Index method in NodeExtensions, but it’s not what I’m looking for: I need the start index of the element in the source code, not its index in the parent collection. Is there any chance this support can be added?

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 1
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
FlorianRappl commented, Apr 28, 2019

This landed in devel. Excerpt from the documentation:


By default AngleSharp will throw away the “tokens” that associate an element with a position in the source code. This is mostly done to keep memory consumption down: the tag tokens carry not only the position, but also additional fields such as the name, flags and other meta information, as well as attributes. These tokens, however, can be preserved.

Currently, there are two ways to do this (both accessible via the HtmlParserOptions).

  1. For one-time scenarios during parsing the OnCreated callback can be used. The first argument is the IElement instance. The second argument received by the callback is a TextPosition value.
  2. For retrieval at a later point in time the IsKeepingSourceReferences option can be set to true. This way the SourceReference property of all parser-created IElement instances will be non-null. Currently, the referenced ISourceReference only contains a Position property.

In code for option 1 this looks as follows:

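using AngleSharp.Dom;
using AngleSharp.Html.Parser;
using AngleSharp.Text;

// Filled in by the callback while the document is being parsed.
var bodyPos = TextPosition.Empty;
var parser = new HtmlParser(new HtmlParserOptions
{
    // Invoked once for every element the parser creates.
    OnCreated = (IElement element, TextPosition position) =>
    {
        if (element.TagName == "BODY")
        {
            bodyPos = position;
        }
    },
});
var document = parser.ParseDocument("<!doctype html><body>");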
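
The same callback can also record the position of every element instead of a single one. The following is only a rough sketch and not part of the documentation excerpt: the `positions` dictionary is a hypothetical helper, and the sample markup is the paragraph example from this issue. It assumes that `TextPosition` exposes `Line`, `Column` and `Position`; what exactly each position points at should be checked against the AngleSharp version in use.

```csharp
using System;
using System.Collections.Generic;
using AngleSharp.Dom;
using AngleSharp.Html.Parser;
using AngleSharp.Text;

public static class Program
{
    public static void Main()
    {
        // Hypothetical lookup table: element -> position reported by the parser.
        var positions = new Dictionary<IElement, TextPosition>();

        var parser = new HtmlParser(new HtmlParserOptions
        {
            // Store the position of each element as it is created.
            OnCreated = (element, position) => positions[element] = position,
        });

        parser.ParseDocument("<html><p>some paragraph</p></html>");

        // Print whatever the parser reported; no particular values are assumed here.
        foreach (var entry in positions)
        {
            Console.WriteLine(
                $"{entry.Key.TagName}: line {entry.Value.Line}, " +
                $"column {entry.Value.Column}, position {entry.Value.Position}");
        }
    }
}
```

Keeping the positions in a separate dictionary avoids retaining the source references on every element, at the cost of managing the lookup table yourself.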

The code for option 2 looks as follows:

using AngleSharp.Html.Parser;

var parser = new HtmlParser(new HtmlParserOptions
{
    // Keep the source reference on every parser-created element.
    IsKeepingSourceReferences = true,
});
var document = parser.ParseDocument("<!doctype html><body>");
var bodyPos = document.Body.SourceReference.Position;

In both cases the position we care about will be stored in bodyPos.

Remark: As SourceReference may be null (e.g., when we omit the option above or when we select an element that was added after parsing), we advise using SourceReference?.Position, which yields a Nullable<TextPosition>. Ideally, we then just use TextPosition.Empty as the fallback, e.g., in the code above:

var bodyPos = document.Body.SourceReference?.Position ?? TextPosition.Empty;
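
Applied to the paragraph example from the original question, option 2 with the null-safe fallback might look roughly like the sketch below. It is an illustration rather than part of the documentation excerpt. It assumes `TextPosition` exposes `Line`, `Column` and `Position`; whether `Position` lines up with the 0-based start index expected in the question should be verified against the AngleSharp version in use.

```csharp
using System;
using AngleSharp.Html.Parser;
using AngleSharp.Text;

public static class Program
{
    public static void Main()
    {
        var parser = new HtmlParser(new HtmlParserOptions
        {
            // Keep source references so they can be read after parsing.
            IsKeepingSourceReferences = true,
        });

        var document = parser.ParseDocument("<html><p>some paragraph</p></html>");

        // QuerySelector returns null if nothing matches, and SourceReference
        // may be null as well, so fall back to TextPosition.Empty.
        var paragraph = document.QuerySelector("p");
        var position = paragraph?.SourceReference?.Position ?? TextPosition.Empty;

        Console.WriteLine(
            $"line {position.Line}, column {position.Column}, position {position.Position}");
    }
}
```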

Hope this helps!

1 reaction
selman92 commented, Jan 31, 2019

Your assumption is correct: I meant the starting position of the element in the HTML source code.

Example: <html><p>some paragraph</p></html>

p.StartIndex should be 6.

This is a very useful feature that we use in HtmlAgilityPack. We are thinking of porting our existing code to AngleSharp, but some features are missing, and this is one of them.
