Do not add pages marked with `noindex` to sitemap

Is your feature request related to a problem? Please describe.

Pages with `<meta content="noindex, follow" name="robots" />` should not be added to the sitemap. I am not sure how this library works, but I don't think it actually reads the content of the files, so it may be hard to detect that meta tag.
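To make the request concrete, here is a minimal sketch of what detecting that tag could look like once a page's HTML is available as a string. `hasNoindexMeta` is a hypothetical helper, not part of next-sitemap, and it assumes the directive appears in a `<meta name="robots">` tag; crawler-specific tags such as `name="googlebot"` are not covered.

```javascript
// Hypothetical helper (not part of next-sitemap): returns true when the HTML
// contains a robots meta tag whose attributes include "noindex".
function hasNoindexMeta(html) {
  // Collect every <meta ...> tag; [^>]* keeps each match inside a single tag.
  const metaTags = html.match(/<meta\b[^>]*>/gi) || []
  return metaTags.some(
    (tag) => /\bname\s*=\s*["']?robots\b/i.test(tag) && /\bnoindex\b/i.test(tag)
  )
}

console.log(hasNoindexMeta('<meta content="noindex, follow" name="robots" />')) // true
console.log(hasNoindexMeta('<meta name="description" content="how to noindex a page" />')) // false
```

A regular expression is a rough stand-in for an HTML parser here; it is enough for prerendered pages with a conventional `<head>`, but it will also match tags inside comments or scripts.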

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 2
  • Comments: 6

Top GitHub Comments

gabrielreisn commented, Oct 16, 2021 (5 reactions)

Hey, in case someone is still missing this: the solution I'm currently using with my team is a custom transform function. If that's OK, I can open a PR with the fix.

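// The transform reads the built HTML files, so the config file needs Node's fs module
const fs = require('fs')

// Inside module.exports of the next-sitemap config:
transform: async (config, path) => {
  // [^>]* keeps the match inside a single <meta> tag, even in minified HTML
  const noIndexRegex = /<meta[^>]*noindex/i
  const basePath = '.next/serverless/pages'
  const filePath = `${basePath}${path}.html`

  if (fs.existsSync(filePath)) {
    try {
      const data = await fs.promises.readFile(filePath, 'utf8')

      if (noIndexRegex.test(data)) {
        console.log('ignored file:', filePath)

        // Returning null drops this path from the sitemap
        return null
      }
    } catch (error) {
      console.error('err', error)
    }
  }

  // Default entry for every page that is not marked noindex
  return {
    loc: path,
    changefreq: config.changefreq,
    priority: config.priority,
    lastmod: config.autoLastmod ? new Date().toISOString() : undefined,
    alternateRefs: config.alternateRefs || [],
  }
},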
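One thing to watch in the transform above is how a route is turned into a file path: a root route of `/` would become `.next/serverless/pages/.html`, which is unlikely to exist, so the home page would never be checked. The sketch below shows one way to handle that. `htmlFileForRoute` is a hypothetical helper, and both the default build directory and the assumption that the root page is emitted as `index.html` should be checked against your own build output.

```javascript
const path = require('path')

// Hypothetical helper: map a sitemap route to the prerendered HTML file.
// buildDir is an assumption taken from the comment above; adjust it to your build.
function htmlFileForRoute(route, buildDir = '.next/serverless/pages') {
  // Drop any trailing slash, and treat the root route as "index".
  const trimmed = route.replace(/\/+$/, '')
  const name = trimmed === '' ? '/index' : trimmed
  return path.posix.join(buildDir, `${name}.html`)
}

console.log(htmlFileForRoute('/'))        // .next/serverless/pages/index.html
console.log(htmlFileForRoute('/blog/a/')) // .next/serverless/pages/blog/a.html
```

Inside the transform, `filePath` could then be computed with `htmlFileForRoute(path)` instead of string concatenation.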
rserafim commented, Nov 5, 2021 (3 reactions)

I exclude them like this:

module.exports = {
  siteUrl: 'https://www.xxxx',
  exclude: ['/aaa/', '/xxx', '/yyyy'], // <= exclude here
}
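A hand-maintained `exclude` list can drift from the pages that actually carry `noindex`. As a rough sketch, the list could instead be built by scanning the prerendered HTML before the sitemap is generated. `findNoindexRoutes` is a hypothetical helper, and the directory you pass in is an assumption about where your build writes its HTML.

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper: collect routes whose prerendered HTML carries noindex.
function findNoindexRoutes(buildDir) {
  const routes = []
  const walk = (dir) => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const full = path.join(dir, entry.name)
      if (entry.isDirectory()) {
        walk(full)
      } else if (entry.name.endsWith('.html')) {
        const html = fs.readFileSync(full, 'utf8')
        if (/<meta\b[^>]*\bnoindex\b/i.test(html)) {
          // Turn "<buildDir>/blog/a.html" into "/blog/a" and "index.html" into "/".
          const rel = path.relative(buildDir, full).split(path.sep).join('/')
          routes.push('/' + rel.replace(/\.html$/, '').replace(/(^|\/)index$/, ''))
        }
      }
    }
  }
  // A missing build directory yields an empty list rather than an error.
  if (fs.existsSync(buildDir)) walk(buildDir)
  return routes
}
```

The result can be spread into the config, for example `exclude: ['/aaa/', ...findNoindexRoutes('.next/serverless/pages')]`. This only sees pages that exist as HTML files at build time, so server-rendered routes still need manual entries.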
