New audit proposal (SEO): Anchor href crawlability

Hey LH team! I would like to make a proposal for a core Lighthouse SEO audit.

Provide a basic description of the audit

The Anchor href audit asserts that hyperlinks are crawlable. This audit falls within the SEO category. This audit would not ping the target link to check it’s up.

How would the audit appear in the report?

This audit would be part of the SEO category with some sort of message to indicate links are crawlable, or they are not. When in DevTools, we can link to the failing anchor element.
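As a rough sketch of the report behaviour described above, the snippet below turns a list of already-checked anchors into a pass/fail message plus the failing elements that DevTools could link to. The `AuditedAnchor` and `AuditSummary` types, the `summarize` function, and the message wording are all hypothetical placeholders, not Lighthouse's actual audit API.

```typescript
// Hypothetical types for illustration only; not Lighthouse's real audit result shape.
interface AuditedAnchor {
  /** Some way to locate the element, e.g. a CSS selector (assumed). */
  selector: string;
  /** Result of whatever crawlability check the audit ends up using. */
  crawlable: boolean;
}

interface AuditSummary {
  passed: boolean;
  message: string;
  /** Anchors the report could link to in DevTools. */
  failing: AuditedAnchor[];
}

function summarize(anchors: AuditedAnchor[]): AuditSummary {
  const failing = anchors.filter((a) => !a.crawlable);
  const passed = failing.length === 0;
  return {
    passed,
    // Placeholder wording; the final strings would be decided in review.
    message: passed ? "Links are crawlable" : "Links are not crawlable",
    failing,
  };
}
```

The audit passes only when no anchor fails the check, and the `failing` list carries enough information to point at each offending element.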

How is this audit different from existing ones?

There is a link text audit, which checks the descriptiveness of link text. This audit is instead about checking that the anchor is crawlable from an SEO perspective.

What % of developers/pages will this impact?

From some initial checks, it seems most popular websites have anchor tags with href="#" or some sort of javascript: href, so this audit may impact them. We would like to go through some of these cases and understand the reasoning behind them (e.g. developer convenience, technical limitations) and what some potential remedies could be (e.g. better documentation/evangelism, outreach, a more relaxed audit).

How is the new audit making a better web for end users?

Search engine crawlers help users find what they’re looking for. Flagging to website owners that their links cannot be crawled may lead to fixes and thus improved search engine results for end-users.

What is the resourcing situation?

Me (@umaar) to develop the audit, @AVGP on docs.

Any other links or documentation that we should check out?

https://support.google.com/webmasters/answer/9112205?hl=en
https://moz.com/learn/seo/anchor-text

Looks like there’s already an AnchorElements gatherer, so that should be perfect for this!

What do you think?

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 13 (4 by maintainers)

Top GitHub Comments

3 reactions
AVGP commented, Jul 20, 2020

Interesting case.

On one hand, the warning is accurate b/c that invariably is an uncrawlable link. Yet, the nofollow tells us that you don’t care about this link wrt crawling. I think filtering nofollow links from the audit makes sense and I am sorry I missed that in the original spec for the audit. 🙏
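A minimal sketch of the nofollow filtering suggested here, assuming each anchor is available as a plain object with its `rel` attribute value. The `AnchorWithRel` type and both function names are hypothetical and do not reflect the shape of Lighthouse's AnchorElements artifact.

```typescript
// Hypothetical anchor shape for illustration; not the Lighthouse artifact type.
interface AnchorWithRel {
  /** Raw href attribute value, or null when the attribute is absent. */
  rawHref: string | null;
  /** Raw rel attribute value; empty string when the attribute is absent. */
  rel: string;
}

// rel is a space-separated token list, so match whole tokens, case-insensitively.
function isNofollow(anchor: AnchorWithRel): boolean {
  const tokens = anchor.rel.toLowerCase().split(/\s+/);
  return tokens.indexOf("nofollow") !== -1;
}

// Drop nofollow anchors before running the crawlability check on the rest.
function anchorsToAudit(anchors: AnchorWithRel[]): AnchorWithRel[] {
  return anchors.filter((a) => !isNofollow(a));
}
```

Matching on whole tokens rather than a substring avoids excluding anchors whose `rel` merely contains the text "nofollow" inside another token.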

On Mon, Jul 20, 2020, 23:22 Miguel Valenzuela notifications@github.com wrote:

Sure, but wouldn’t we be better off avoiding the warning altogether, by adding it to the exception list?

On Mon, Jul 20, 2020 at 3:22 AM Umar Hansa notifications@github.com wrote:

@decimoseptimo https://github.com/decimoseptimo oh could they add a href? Maybe like page 2

3 reactions
umaar commented, Apr 16, 2020

Looks like axe had the rule but then removed it; also see the corresponding docs for the href-no-hash rule.

I’m not sure on exactly what can be crawled and what cannot, but in the meantime here’s a gist of potentially non-crawlable anchors from popular websites. Looks like there are some of the following:

  • href="#"
  • no href but then a role="button" which gets intercepted by JS
  • <a id="top"></a>
  • <a name="top"></a>
  • href="javascript:void(0)"
  • no href but onclick="remove()"
  • no href but ng-click="remove()"
  • href="javascript:;"

That’s the “what”. Skimming through the gist should give us a better clue as to “why”.

With JS frameworks, the vibe I get is that they’ll support outputting regular hyperlinks, but sometimes conventions emerge which do something different.

Thoughts on starting out with an audit which only checks for a missing/empty href? And then we could tweak it when we learn what the crawler actually does. Can also do any more research we think would be useful!
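To make the proposed starting point concrete, here is a minimal sketch that classifies an anchor's raw `href` into the patterns listed above and flags only the missing/empty cases for a first iteration. The `AnchorInfo` type, the `HrefIssue` labels, and both function names are hypothetical; what a crawler actually follows is still an open question in this thread.

```typescript
// Hypothetical anchor shape for illustration; not the Lighthouse artifact type.
interface AnchorInfo {
  /** Raw href attribute value, or null when the attribute is absent. */
  rawHref: string | null;
}

type HrefIssue = "missing" | "empty" | "hash-only" | "javascript-scheme";

// Classify the href patterns listed above; returns null for anything else.
function classifyHref(anchor: AnchorInfo): HrefIssue | null {
  if (anchor.rawHref === null) return "missing";
  const href = anchor.rawHref.trim();
  if (href === "") return "empty";
  if (href === "#") return "hash-only";
  if (href.toLowerCase().indexOf("javascript:") === 0) return "javascript-scheme";
  return null;
}

// First iteration as proposed: fail only on a missing or empty href.
function failsMinimalAudit(anchor: AnchorInfo): boolean {
  const issue = classifyHref(anchor);
  return issue === "missing" || issue === "empty";
}
```

As written, this would also flag named anchors such as `<a id="top"></a>`, since they have no href; whether those should be exempt is left open. Keeping the classification separate from the pass/fail decision makes it easy to tighten the audit later to cover `href="#"` or `javascript:` links.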
