createPages uses stale cached data for "previous" and "next" pages
Description
I noticed this on my own blog’s “previous post” and “next post” links, and it’s reproducible on gatsby-starter-blog. It may have broader implications beyond my use case as well.
If you have links to “previous” and “next” posts on each blog post’s page, then those are not updated on each build. As you add posts, they remember their cached “previous” link, but are never updated with the “next.” If a post is removed, then existing “previous/next” links are preserved. (And its page will still exist in the build.)
Steps to reproduce
I made a sample repo (almost identical to gatsby-starter-blog): https://github.com/johnridesabike/gatsby-prev-next
git clone https://github.com/johnridesabike/gatsby-prev-next.git
cd gatsby-prev-next
gatsby build
If you serve the built site, everything should be normal. The next part is where the problem happens:
mv test-post content/blog/test-post # add a new post
gatsby build
gatsby serve
Expected result
The build should use up-to-date data for the “previous” and “next” links.
Actual result
If you open the newly built site, the new post (“Test Post”) is present, but the previous post (“New Beginnings”) does not link to it. Clearing the cache fixes the problem:
gatsby clean
gatsby build
gatsby serve
The site should build as expected.
But now, if you remove the content/blog/test-post directory and run gatsby build, “Test Post” won’t appear on the index page, but the “New Beginnings” post will still link to it (and the “Test Post” page will still exist in the build).
Additional information
I played around a bit trying to figure out how to bust the stale data, but with no luck. One thing I noticed is that the createPages function uses this logic to generate the previous and next data:
const previous = index === posts.length - 1 ? null : posts[index + 1].node
const next = index === 0 ? null : posts[index - 1].node
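As a minimal sketch of what that index arithmetic produces, the snippet below applies the same two expressions to a small list sorted newest-first. The helper name `linkNeighbors` and the sample slugs are hypothetical and only serve the illustration; the real code runs inside createPages against the query result.

```javascript
// Hypothetical helper: applies the same index arithmetic as the two lines above
// to a list of edges sorted newest-first (order: DESC).
function linkNeighbors(posts) {
  return posts.map((post, index) => ({
    slug: post.node.fields.slug,
    // The older post sits one position later in the list.
    previous: index === posts.length - 1 ? null : posts[index + 1].node,
    // The newer post sits one position earlier in the list.
    next: index === 0 ? null : posts[index - 1].node,
  }))
}

// Hypothetical sample data, newest post first.
const posts = [
  { node: { fields: { slug: "/test-post/" } } },
  { node: { fields: { slug: "/new-beginnings/" } } },
  { node: { fields: { slug: "/hello-world/" } } },
]

const linked = linkNeighbors(posts)
```

With fresh data, the entry for "/new-beginnings/" gets "/test-post/" as its `next` value, which is the link that goes missing when the cached build is reused.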
I tried to update it to use a GraphQL query instead:
  {
    allMarkdownRemark(sort: {fields: [frontmatter___date], order: DESC}, limit: 1000) {
      edges {
        node {
          fields {
            slug
          }
          frontmatter {
            title
          }
        }
+       previous {
+         fields {
+           slug
+         }
+         frontmatter {
+           title
+         }
+       }
+       next {
+         fields {
+           slug
+         }
+         frontmatter {
+           title
+         }
+       }
      }
    }
  }
This had no effect. The stale cache was still used.
If it’s not feasible to make the caching “smart” enough for this scenario, it would also be good if createPages had the ability to manually use or override the cache, as well as delete cached pages after their source data has been deleted.
Environment
  System:
    OS: macOS 10.15.6
    CPU: (6) x64 Intel(R) Core(TM) i5-8500B CPU @ 3.00GHz
    Shell: 5.7.1 - /bin/zsh
  Binaries:
    Node: 14.8.0 - /usr/local/bin/node
    Yarn: 1.22.4 - /usr/local/bin/yarn
    npm: 6.14.7 - /usr/local/bin/npm
  Languages:
    Python: 2.7.17 - /usr/local/bin/python
  Browsers:
    Chrome: 84.0.4147.125
    Firefox: 79.0
    Safari: 13.1.2
  npmPackages:
    gatsby: ^2.24.41 => 2.24.41
    gatsby-image: ^2.4.15 => 2.4.15
    gatsby-plugin-feed: ^2.5.11 => 2.5.11
    gatsby-plugin-google-analytics: ^2.3.13 => 2.3.13
    gatsby-plugin-manifest: ^2.4.22 => 2.4.22
    gatsby-plugin-offline: ^3.2.23 => 3.2.23
    gatsby-plugin-react-helmet: ^3.3.10 => 3.3.10
    gatsby-plugin-sharp: ^2.6.26 => 2.6.26
    gatsby-plugin-typography: ^2.5.10 => 2.5.10
    gatsby-remark-copy-linked-files: ^2.3.12 => 2.3.12
    gatsby-remark-images: ^3.3.25 => 3.3.25
    gatsby-remark-prismjs: ^3.5.10 => 3.5.10
    gatsby-remark-responsive-iframe: ^2.4.12 => 2.4.12
    gatsby-remark-smartypants: ^2.3.10 => 2.3.10
    gatsby-source-filesystem: ^2.3.24 => 2.3.24
    gatsby-transformer-remark: ^2.8.28 => 2.8.28
    gatsby-transformer-sharp: ^2.5.12 => 2.5.12
  npmGlobalPackages:
    gatsby-cli: 2.12.66
Issue Analytics
- Created 3 years ago
- Reactions: 1
- Comments: 13 (3 by maintainers)
Top GitHub Comments
I was focused on the previous/next links when I submitted this issue, but looking at it now I realize that the much more critical bug is probably the fact that deleted markdown pages are not deleted from builds (when the cache is used).
Hey @pieh, thanks for your workaround approach, but I think this won’t work in some cases. For our (company) site, we pass all of our data through the page context. (Yes, I’m aware of the limitations and risks involved with that.) All pages build fine, because new or changed data is queried in gatsby-node.js, but all data passed through page context will be stale in the subsequent build if you don’t delete the cache first. What also works is going into the .cache folder, deleting the redux folder manually, and keeping the rest in place. This also yields fresh page context data. That’s why I fear that your workaround might not be sufficient for all users, since GraphQL might not even be the root cause of this, but rather stale, redux-stored data…
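The manual step described in this comment (delete only the redux folder inside .cache, keep everything else) could be scripted roughly as follows. This is a sketch under assumptions: `clearReduxCache` is a hypothetical helper name, the folder layout is taken from the comment rather than from Gatsby documentation, and `fs.rmSync` needs Node 14.14 or newer.

```javascript
const fs = require("fs")
const path = require("path")

// Hypothetical helper: removes only <siteRoot>/.cache/redux and leaves the
// rest of .cache in place. Returns true if the folder existed beforehand.
function clearReduxCache(siteRoot) {
  const reduxDir = path.join(siteRoot, ".cache", "redux")
  const existed = fs.existsSync(reduxDir)
  // force: true makes this a no-op when the folder is already gone.
  fs.rmSync(reduxDir, { recursive: true, force: true })
  return existed
}
```

Calling `clearReduxCache(process.cwd())` from the site root before `gatsby build` would reproduce the manual step. Whether this is safe across Gatsby versions is not established here.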
We’re building on Netlify and have found our own workaround:
echo "// ${BUILD_ID}" >> ./gatsby-config.js
This way, Gatsby will recognize a change in the config file and delete its cache. But since we’re using the Netlify cache, we can at least persist all image data and therefore still build relatively fast. Any other caching optimization will be lost, though, meaning Netlify will upload a lot of new files with every new build, which makes the gatsby-remove-fingerprints-plugin more or less obsolete, for instance.
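For build environments where a shell one-liner is inconvenient, the same idea can be sketched in Node. `stampConfig` is a hypothetical helper name. Like the echo command above, it only appends a comment line containing the build ID; whether that invalidates the cache depends on the Gatsby behavior this comment describes.

```javascript
const fs = require("fs")

// Hypothetical helper: appends "// <buildId>" to the config file so that its
// contents differ on every build, mirroring the echo one-liner above.
function stampConfig(configPath, buildId) {
  // The leading newline guards against a file that does not end with one.
  fs.appendFileSync(configPath, `\n// ${buildId}\n`)
}
```

A pre-build script could call `stampConfig("./gatsby-config.js", process.env.BUILD_ID)` before running `gatsby build`.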