New audit proposal (SEO): Anchor href crawlability
Hey LH team! I'd like to make a proposal for a core Lighthouse SEO audit.
Provide a basic description of the audit
The Anchor href audit asserts that hyperlinks are crawlable. This audit falls within the SEO category. It would not ping the target link to check that it's up.
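To make the idea concrete, here is a minimal sketch of what "crawlable" could mean at the `href` level. This is purely illustrative, not the actual Lighthouse implementation; the function name and rules are assumptions based on the patterns discussed in this issue.

```javascript
// Hypothetical check: an href is considered crawlable when it points at a
// real URL rather than a placeholder or a javascript: pseudo-URL.
function isCrawlableHref(href) {
  if (!href) return false;               // missing or empty href attribute
  const trimmed = href.trim();
  if (trimmed === '' || trimmed === '#') return false; // placeholder link
  if (/^javascript:/i.test(trimmed)) return false;     // javascript: pseudo-URL
  return true; // relative/absolute URLs, fragments to real ids, etc.
}
```

The exact rules (e.g. whether `#section` fragments count) would need to follow what the crawler actually does, which is part of the research proposed below.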
How would the audit appear in the report?
This audit would be part of the SEO category, with a message indicating whether or not links are crawlable. When run in DevTools, we can link directly to the failing anchor element.
How is this audit different from existing ones?
There is an existing link text audit, which checks the descriptiveness of link text. This one is instead about checking that the anchor itself is crawlable from an SEO perspective.
What % of developers/pages will this impact?
From some initial checks, it seems most popular websites do have anchor tags with `href="#"` or some sort of `javascript:` href, so this audit may impact them. We would like to go through some of these cases and understand the reasoning behind them (e.g. developer convenience, technical limitations) and what some potential remedies could be, e.g. better documentation/evangelism, outreach, or a more relaxed audit.
How is the new audit making a better web for end users?
Search engine crawlers help users find what they’re looking for. Flagging to website owners that their links cannot be crawled may lead to fixes and thus improved search engine results for end-users.
What is the resourcing situation?
Me (@umaar) to develop the audit, @AVGP on docs.
Any other links or documentation that we should check out?
https://support.google.com/webmasters/answer/9112205?hl=en
https://moz.com/learn/seo/anchor-text
Looks like there's already an `AnchorElements` gatherer, so that should be perfect for this!
What do you think?
Issue Analytics
- Created 3 years ago
- Comments: 13 (4 by maintainers)
Top GitHub Comments
Interesting case.
On one hand, the warning is accurate because that invariably is an uncrawlable link. Yet the `nofollow` tells us that you don't care about this link with respect to crawling. I think filtering `nofollow` links from the audit makes sense, and I'm sorry I missed that in the original spec for the audit. 🙏
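The suggested `nofollow` filtering could look something like the sketch below. The anchor record shape (`href`/`rel` fields) is an assumption loosely modelled on what an `AnchorElements` gatherer might return; this is not the actual audit code.

```javascript
// Drop anchors whose rel attribute contains "nofollow": the author has
// explicitly opted this link out of crawling, so the audit should skip it.
function filterAuditableAnchors(anchors) {
  return anchors.filter(anchor => {
    const relTokens = (anchor.rel || '').toLowerCase().split(/\s+/);
    return !relTokens.includes('nofollow');
  });
}
```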
On Mon, Jul 20, 2020, 23:22 Miguel Valenzuela notifications@github.com wrote:
Looks like axe had the rule but then removed it; also see the corresponding docs for the `href-no-hash` rule. I'm not sure exactly what can be crawled and what cannot, but in the meantime here's a gist of potentially non-crawlable anchors from popular websites. It looks like there are some of the following:
- `href="#"`
- `role="button"`, which gets intercepted by JS
- `<a id="top"></a>`
- `<a name="top"></a>`
- `href="javascript:void(0)"`
- `onclick="remove()"`
- `ng-click="remove()"`
- `href="javascript:;"`
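The patterns above could be grouped roughly as follows. This is an illustrative classifier only; the field names (`href`, `role`, `name`, `id`) mirror the attributes listed, not any real Lighthouse data structure.

```javascript
// Rough grouping of the non-crawlable anchor patterns collected in the gist.
function classifyAnchor(a) {
  const href = (a.href || '').trim();
  if (a.role === 'button') return 'role=button, likely JS-intercepted';
  if (!href) {
    // <a id="top"></a> / <a name="top"></a> are link targets, not links.
    return (a.name || a.id) ? 'named anchor target, no href' : 'missing href';
  }
  if (href === '#') return 'placeholder href="#"';
  if (/^javascript:/i.test(href)) return 'javascript: pseudo-URL';
  return 'crawlable';
}
```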
That’s the “what”. Skimming through the gist should give us a better clue as to “why”.
With JS frameworks, the vibe I get is that they’ll support outputting regular hyperlinks, but sometimes conventions emerge which do something different.
Thoughts on starting out with an audit which only checks for a missing/empty `href`? Then we could tweak it once we learn what the crawler actually does. We can also do any more research we think would be useful!
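That minimal first cut could be as simple as the sketch below: flag only anchors with a missing or empty `href`, deferring the trickier cases until crawler behaviour is better understood. Again, names here are hypothetical, not the shipped audit.

```javascript
// First-cut audit: report only anchors whose href is absent or empty.
function findMissingHrefAnchors(anchors) {
  return anchors.filter(a => !a.href || a.href.trim() === '');
}
```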