Robots parser to always handle absolute sitemap URL even without valid base URL
See original GitHub issue

I get an "Invalid URL" error for a sitemap directive whose URL is not relative. I noticed that the implementation here only handles relative URLs, but that is not always the case: it is now preferred to list the absolute URL of the sitemap in robots.txt. As a result, when an absolute URL is provided with the sitemap directive, we cannot access it.
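The fix the issue asks for amounts to: treat an already-absolute sitemap URL as valid on its own, and only fall back to resolving against a base URL when the value is relative. A minimal sketch of that logic in Python (this is a hypothetical helper for illustration, not the crawler-commons implementation; the function name and signature are assumptions):

```python
from typing import Optional
from urllib.parse import urljoin, urlparse


def resolve_sitemap_url(value: str, base_url: Optional[str] = None) -> str:
    """Resolve the value of a 'Sitemap:' directive from robots.txt.

    An absolute URL is returned as-is, even when no base URL is known;
    a relative value is resolved against base_url when one is available.
    """
    parsed = urlparse(value)
    if parsed.scheme and parsed.netloc:
        # Fully qualified URL -- usable even without a valid base URL.
        return value
    if base_url is None:
        raise ValueError(f"Relative sitemap URL {value!r} requires a base URL")
    return urljoin(base_url, value)
```

For example, `resolve_sitemap_url("https://example.com/sitemap.xml")` succeeds with no base URL at all, while `resolve_sitemap_url("/sitemap.xml", "https://example.com/robots.txt")` resolves the relative path against the robots.txt location.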
Issue Analytics
- State:
- Created 4 years ago
- Comments:12 (7 by maintainers)
Top Results From Across the Web
How Google Interprets the robots.txt Specification
The [absoluteURL] line points to the location of a sitemap or sitemap index file. It must be a fully qualified URL, including the...

1.1 released! - Google Groups
[Robots] Robots parser to always handle absolute sitemap URL even without valid base URL (pr3mar, kkrugler, sebastian-nagel) #240

Robots.txt for SEO: The Ultimate Guide
Learn how to help search engines crawl your website more efficiently using the robots.txt file to achieve a better SEO performance.

Can a relative sitemap url be used in a robots.txt?
@Shams: The URLs listed in your sitemap have to use the same protocol and the same host as the sitemap file. If your...

Manage your sitemaps using the Sitemaps report - Google Help
If it is not, the test should show why Google can't reach or index the page (common reasons: a robots.txt rule; an incorrect...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Happy to be of service — you helped me a lot to build my own parser! I'd be honoured if you took a look at it; it's here.
Nice project! And another one in the long list of crawler/web-crawler/spider projects.