createPages uses stale cached data for "previous" and "next" pages

Description

I noticed this on my own blog’s “previous post” and “next post” links, and it’s reproducible on gatsby-starter-blog. It may have broader implications beyond my use case as well.

If you have links to “previous” and “next” posts on each blog post’s page, then those links are not updated on each build. As you add posts, existing pages keep their cached “previous” link but are never updated with the new “next” link. If a post is removed, existing “previous/next” links to it are preserved. (And its page will still exist in the build.)

Steps to reproduce

I made a sample repo (almost identical to gatsby-starter-blog): https://github.com/johnridesabike/gatsby-prev-next

git clone https://github.com/johnridesabike/gatsby-prev-next.git
cd gatsby-prev-next
gatsby build

If you serve the built site, everything should be normal. The next part is where the problem happens:

mv test-post content/blog/test-post # add a new post
gatsby build
gatsby serve

Expected result

The build should use up-to-date data for the “previous” and “next” links.

Actual result

If you open the newly built site, the new post (“Test Post”) is present, but the previous post (“New Beginnings”) does not link to it. Clearing the cache fixes the problem:

gatsby clean
gatsby build
gatsby serve

The site should build as expected.

But now, if you remove the content/blog/test-post directory, and run gatsby build, “Test Post” won’t appear on the index page but the “New Beginnings” post will still link to it (and the “Test Post” page will still exist in the build.)

Additional information

I played around a bit trying to figure out how to bust the stale data, but with no luck. One thing I noticed is that the createPages function uses this logic to generate the previous and next data:

const previous = index === posts.length - 1 ? null : posts[index + 1].node
const next = index === 0 ? null : posts[index - 1].node
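
To make the expected behavior concrete, here is a minimal sketch of that index arithmetic run outside Gatsby. The `withNeighbors`, `makePost`, and `nextOf` helpers and the slugs are illustrative assumptions, not part of Gatsby or the sample repo; the posts are assumed to be sorted newest-first, as in the starter’s query.

```javascript
// Hypothetical helper (not part of Gatsby): pairs each post with its
// neighbours using the same index arithmetic as the snippet above.
// `posts` is assumed to be sorted newest-first.
function withNeighbors(posts) {
  return posts.map((post, index) => ({
    slug: post.node.fields.slug,
    // One step further down the list is the older post.
    previous: index === posts.length - 1 ? null : posts[index + 1].node,
    // One step up the list is the newer post.
    next: index === 0 ? null : posts[index - 1].node,
  }))
}

// Illustrative post objects; the slugs are assumptions.
const makePost = slug => ({ node: { fields: { slug } } })

// First build: "New Beginnings" is the newest post, so it has no "next".
const before = withNeighbors([
  makePost("/new-beginnings/"),
  makePost("/hello-world/"),
])

// Second build: "Test Post" is added on top, so the expected context for
// "New Beginnings" changes even though its own markdown file did not.
const after = withNeighbors([
  makePost("/test-post/"),
  makePost("/new-beginnings/"),
  makePost("/hello-world/"),
])

// Looks up the slug of a page's "next" neighbour, or null if it has none.
const nextOf = (pages, slug) => {
  const page = pages.find(p => p.slug === slug)
  return page.next ? page.next.fields.slug : null
}

console.log(nextOf(before, "/new-beginnings/")) // prints null
console.log(nextOf(after, "/new-beginnings/")) // prints /test-post/
```

The point of the sketch is that a page’s context can change when a neighboring post is added, even if the page’s own source file is untouched, which is the case the cached build appears to miss.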

I tried to update it to use a GraphQL query instead:

{
  allMarkdownRemark(sort: {fields: [frontmatter___date], order: DESC}, limit: 1000) {
    edges {
      node {
        fields {
          slug
        }
        frontmatter {
          title
        }
      }
+      previous {
+        fields {
+          slug
+        }
+        frontmatter {
+          title
+        }
+      }
+      next {
+        fields {
+          slug
+        }
+        frontmatter {
+          title
+        }
+      }
    }
  }
}

This had no effect. The stale cache was still used.

If it’s not feasible to make the caching “smart” enough for this scenario, it would also be good if createPages had the ability to manually use or override the cache, as well as delete cached pages after their source data has been deleted.
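
As a rough illustration of the “delete cached pages” idea, the sketch below only computes which page paths no longer have source data. `findStalePaths` is a hypothetical helper and the paths are assumptions; whether passing such paths to one of Gatsby’s page actions (for example `deletePage`) from createPages would actually fix the cached build is untested here.

```javascript
// Hypothetical helper: given the page paths remembered from an earlier build
// and the slugs returned by the current query, list the paths whose source
// data no longer exists.
function findStalePaths(cachedPaths, currentSlugs) {
  const current = new Set(currentSlugs)
  return cachedPaths.filter(path => !current.has(path))
}

// Illustrative data: "Test Post" was built earlier, then its directory was removed.
const cachedPaths = ["/test-post/", "/new-beginnings/", "/hello-world/"]
const currentSlugs = ["/new-beginnings/", "/hello-world/"]

const stale = findStalePaths(cachedPaths, currentSlugs)
console.log(stale) // prints [ '/test-post/' ]
```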

Environment

  System:
    OS: macOS 10.15.6
    CPU: (6) x64 Intel(R) Core(TM) i5-8500B CPU @ 3.00GHz
    Shell: 5.7.1 - /bin/zsh
  Binaries:
    Node: 14.8.0 - /usr/local/bin/node
    Yarn: 1.22.4 - /usr/local/bin/yarn
    npm: 6.14.7 - /usr/local/bin/npm
  Languages:
    Python: 2.7.17 - /usr/local/bin/python
  Browsers:
    Chrome: 84.0.4147.125
    Firefox: 79.0
    Safari: 13.1.2
  npmPackages:
    gatsby: ^2.24.41 => 2.24.41 
    gatsby-image: ^2.4.15 => 2.4.15 
    gatsby-plugin-feed: ^2.5.11 => 2.5.11 
    gatsby-plugin-google-analytics: ^2.3.13 => 2.3.13 
    gatsby-plugin-manifest: ^2.4.22 => 2.4.22 
    gatsby-plugin-offline: ^3.2.23 => 3.2.23 
    gatsby-plugin-react-helmet: ^3.3.10 => 3.3.10 
    gatsby-plugin-sharp: ^2.6.26 => 2.6.26 
    gatsby-plugin-typography: ^2.5.10 => 2.5.10 
    gatsby-remark-copy-linked-files: ^2.3.12 => 2.3.12 
    gatsby-remark-images: ^3.3.25 => 3.3.25 
    gatsby-remark-prismjs: ^3.5.10 => 3.5.10 
    gatsby-remark-responsive-iframe: ^2.4.12 => 2.4.12 
    gatsby-remark-smartypants: ^2.3.10 => 2.3.10 
    gatsby-source-filesystem: ^2.3.24 => 2.3.24 
    gatsby-transformer-remark: ^2.8.28 => 2.8.28 
    gatsby-transformer-sharp: ^2.5.12 => 2.5.12 
  npmGlobalPackages:
    gatsby-cli: 2.12.66

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 13 (3 by maintainers)

Top GitHub Comments

4 reactions
johnridesabike commented, Aug 18, 2020

I was focused on the previous/next links when I submitted this issue, but looking at it now I realize that the much more critical bug is probably the fact that deleted markdown pages are not deleted from builds (when the cache is used).

2 reactions
f-kottek commented, Apr 14, 2021

Hey @pieh, thanks for your workaround approach, but I think this won’t work for some cases. For our (company) page, we pass all of our data through the page context. (Yes, I’m aware of the limitations and risks involved with that.) All pages build fine, because new/changed data is queried in gatsby-node.js, but all data passed through page context will be stale in the subsequent build if you don’t delete the cache first. What also works is going into the .cache folder, deleting the redux folder manually, and keeping the rest in place. This will also yield fresh page context data. That’s why I fear that your workaround might not be sufficient for all users, since GraphQL might not even be the root cause of this, but rather stale, redux-stored data…

We’re building on Netlify and have found our own workaround:

  • use the netlify-plugin-gatsby-cache
  • create a “before_script.sh” and put it into your build command line,
  • put this into the script: echo "// ${BUILD_ID}" >> ./gatsby-config.js

This way Gatsby will recognize a change in the config file and delete its cache. But since we’re using the Netlify cache, we can at least persist all image data and therefore still build relatively fast. Any other optimization regarding caching will be lost, though, meaning Netlify will upload a lot of new files with every new build, making the gatsby-remove-fingerprints-plugin more or less obsolete, for instance.
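
For builds where a Node script is more convenient than a shell script, the same marker trick might be sketched as follows. `appendBuildMarker` is a hypothetical helper, the `"local-build"` fallback is an assumption, and the demonstration writes to a throwaway file rather than a real gatsby-config.js.

```javascript
const fs = require("fs")
const os = require("os")
const path = require("path")

// Hypothetical Node equivalent of:
//   echo "// ${BUILD_ID}" >> ./gatsby-config.js
// Appending a comment changes the file's contents without changing its behavior.
function appendBuildMarker(configPath, buildId) {
  fs.appendFileSync(configPath, `// ${buildId}\n`)
}

// Demonstration against a temporary stand-in for the config file.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "gatsby-marker-"))
const demoConfig = path.join(demoDir, "gatsby-config.js")
fs.writeFileSync(demoConfig, "module.exports = {}\n")

// BUILD_ID is the environment variable used in the shell version above;
// the fallback string is an assumption for running outside the build service.
appendBuildMarker(demoConfig, process.env.BUILD_ID || "local-build")

const markedConfig = fs.readFileSync(demoConfig, "utf8")
console.log(markedConfig)
```

In a real build the function would be pointed at ./gatsby-config.js before `gatsby build` runs, with the same trade-off described above: the cache is discarded on every build.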

