Robots parser to always handle absolute sitemap URL even without valid base URL
See original GitHub issue

I get an "Invalid URL" error for a sitemap directive whose URL is not relative. I noticed that the implementation here only handles relative URLs, but that is not always the case: it is now preferred to list the absolute URL of the sitemap in robots.txt. As a result, when an absolute URL is provided with the sitemap directive, we cannot access it.
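The fix the issue asks for amounts to: treat an already-absolute sitemap URL as valid on its own, and only fall back to resolving against a base URL when the value is relative. A minimal sketch of that logic in Python (this is a hypothetical helper for illustration, not the crawler-commons implementation; the function name and signature are assumptions):

```python
from typing import Optional
from urllib.parse import urljoin, urlparse


def resolve_sitemap_url(value: str, base_url: Optional[str] = None) -> str:
    """Resolve the value of a 'Sitemap:' directive from robots.txt.

    An absolute URL is returned as-is, even when no base URL is known;
    a relative value is resolved against base_url when one is available.
    """
    parsed = urlparse(value)
    if parsed.scheme and parsed.netloc:
        # Fully qualified URL -- usable even without a valid base URL.
        return value
    if base_url is None:
        raise ValueError(f"Relative sitemap URL {value!r} requires a base URL")
    return urljoin(base_url, value)
```

For example, `resolve_sitemap_url("https://example.com/sitemap.xml")` succeeds with no base URL at all, while `resolve_sitemap_url("/sitemap.xml", "https://example.com/robots.txt")` resolves the relative path against the robots.txt location.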
Issue Analytics
- State:
- Created 4 years ago
- Comments:12 (7 by maintainers)
Top Results From Across the Web
How Google Interprets the robots.txt Specification
The [absoluteURL] line points to the location of a sitemap or sitemap index file. It must be a fully qualified URL, including the...

1.1 released! - Google Groups
[Robots] Robots parser to always handle absolute sitemap URL even without valid base URL (pr3mar, kkrugler, sebastian-nagel) #240

Robots.txt for SEO: The Ultimate Guide
Learn how to help search engines crawl your website more efficiently using the robots.txt file to achieve a better SEO performance.

Can a relative sitemap url be used in a robots.txt?
@Shams: The URLs listed in your sitemap have to use the same protocol and the same host as the sitemap file. If your...

Manage your sitemaps using the Sitemaps report - Google Help
If it is not, the test should show why Google can't reach or index the page (common reasons: a robots.txt rule; an incorrect...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Happy to be of service — you helped me a lot to build my own parser! I'd be honoured if you took a look at it; it's here.
Nice project! And another one in the long list of crawler/web-crawler/spider projects.