New audit proposal (SEO): Anchor href crawlability
Hey LH team! I'd like to make a proposal for a core Lighthouse SEO audit.
Provide a basic description of the audit
The Anchor href audit asserts that hyperlinks are crawlable. This audit falls within the SEO category. It would not ping the target link to check that it's up.
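To make the idea concrete, here is a minimal sketch of what "crawlable" could mean at the `href` level. This is purely illustrative, not the actual Lighthouse implementation; the function name and rules are assumptions based on the patterns discussed in this issue.

```javascript
// Hypothetical check: an href is considered crawlable when it points at a
// real URL rather than a placeholder or a javascript: pseudo-URL.
function isCrawlableHref(href) {
  if (!href) return false;               // missing or empty href attribute
  const trimmed = href.trim();
  if (trimmed === '' || trimmed === '#') return false; // placeholder link
  if (/^javascript:/i.test(trimmed)) return false;     // javascript: pseudo-URL
  return true; // relative/absolute URLs, fragments to real ids, etc.
}
```

The exact rules (e.g. whether `#section` fragments count) would need to follow what the crawler actually does, which is part of the research proposed below.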
How would the audit appear in the report?
This audit would be part of the SEO category, with a message indicating whether or not links are crawlable. When run in DevTools, we can link directly to the failing anchor element.
How is this audit different from existing ones?
There is an existing link text audit, which checks the descriptiveness of link text. This one is instead about checking that the anchor itself is crawlable from an SEO perspective.
What % of developers/pages will this impact?
From some initial checks, it seems most popular websites do have anchor tags with `href="#"` or some sort of `javascript:` href, so this audit may impact them. We would like to go through some of these cases and understand the reasoning behind them (e.g. developer convenience, technical limitations) and what some potential remedies could be, e.g. better documentation/evangelism, outreach, or a more relaxed audit.
How is the new audit making a better web for end users?
Search engine crawlers help users find what they’re looking for. Flagging to website owners that their links cannot be crawled may lead to fixes and thus improved search engine results for end-users.
What is the resourcing situation?
Me (@umaar) to develop the audit, @AVGP on docs.
Any other links or documentation that we should check out?
https://support.google.com/webmasters/answer/9112205?hl=en
https://moz.com/learn/seo/anchor-text
Looks like there's already an `AnchorElements` gatherer, so that should be perfect for this!
What do you think?
Issue Analytics
- Created 3 years ago
- Comments: 13 (4 by maintainers)
Top GitHub Comments
Interesting case.
On one hand, the warning is accurate because that invariably is an uncrawlable link. Yet the `nofollow` tells us that you don't care about this link with respect to crawling. I think filtering `nofollow` links from the audit makes sense, and I'm sorry I missed that in the original spec for the audit. 🙏
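The suggested `nofollow` filtering could look something like the sketch below. The anchor record shape (`href`/`rel` fields) is an assumption loosely modelled on what an `AnchorElements` gatherer might return; this is not the actual audit code.

```javascript
// Drop anchors whose rel attribute contains "nofollow": the author has
// explicitly opted this link out of crawling, so the audit should skip it.
function filterAuditableAnchors(anchors) {
  return anchors.filter(anchor => {
    const relTokens = (anchor.rel || '').toLowerCase().split(/\s+/);
    return !relTokens.includes('nofollow');
  });
}
```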
On Mon, Jul 20, 2020, 23:22 Miguel Valenzuela notifications@github.com wrote:
Looks like axe had the rule but then removed it; also see the corresponding docs for the `href-no-hash` rule. I'm not sure exactly what can be crawled and what cannot, but in the meantime here's a gist of potentially non-crawlable anchors from popular websites. It looks like there are some of the following:
- `href="#"`
- `role="button"`, which gets intercepted by JS
- `<a id="top"></a>`
- `<a name="top"></a>`
- `href="javascript:void(0)"`
- `onclick="remove()"`
- `ng-click="remove()"`
- `href="javascript:;"`
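The patterns above could be grouped roughly as follows. This is an illustrative classifier only; the field names (`href`, `role`, `name`, `id`) mirror the attributes listed, not any real Lighthouse data structure.

```javascript
// Rough grouping of the non-crawlable anchor patterns collected in the gist.
function classifyAnchor(a) {
  const href = (a.href || '').trim();
  if (a.role === 'button') return 'role=button, likely JS-intercepted';
  if (!href) {
    // <a id="top"></a> / <a name="top"></a> are link targets, not links.
    return (a.name || a.id) ? 'named anchor target, no href' : 'missing href';
  }
  if (href === '#') return 'placeholder href="#"';
  if (/^javascript:/i.test(href)) return 'javascript: pseudo-URL';
  return 'crawlable';
}
```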
That’s the “what”. Skimming through the gist should give us a better clue as to “why”.
With JS frameworks, the vibe I get is that they’ll support outputting regular hyperlinks, but sometimes conventions emerge which do something different.
Thoughts on starting out with an audit which only checks for a missing/empty `href`? Then we could tweak it once we learn what the crawler actually does. We can also do any more research we think would be useful!
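That minimal first cut could be as simple as the sketch below: flag only anchors with a missing or empty `href`, deferring the trickier cases until crawler behaviour is better understood. Again, names here are hypothetical, not the shipped audit.

```javascript
// First-cut audit: report only anchors whose href is absent or empty.
function findMissingHrefAnchors(anchors) {
  return anchors.filter(a => !a.href || a.href.trim() === '');
}
```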