Build stuck at running jobs (image transformation)
If you’re coming new to this issue, please see this first: https://github.com/gatsbyjs/gatsby/issues/34051#issuecomment-979882343
Preliminary Checks
- This issue is not a duplicate. Before opening a new issue, please search existing issues: https://github.com/gatsbyjs/gatsby/issues
- This issue is not a question, feature request, RFC, or anything other than a bug report directly related to Gatsby. Please post those things in GitHub Discussions: https://github.com/gatsbyjs/gatsby/discussions
Description
Gatsby’s build process is hanging and not completing. I suspect the issue is with Sharp, as my site has quite a few images, and I saw this brought up in a previous issue, #33557.
When I upgraded to v4 I initially had no issues. However, the next day all my builds started exceeding Netlify’s maximum build time of 30 minutes.
I mentioned this problem in the thread on the other issue, as others apparently had the same problem, where the run queries in workers step seems to take longer than expected.
This issue is difficult to reproduce, I think in part because it is to do with the scale of my site, which is moderately large and has ~1600 images. There must be something that isn’t quite right in the worker process, because my builds on Netlify went from taking around 13 or 14 minutes to exceeding the build limit every time.
To try and diagnose the issue I tried a local build, which, while it took a long-ish time, did actually complete.
Since @LekoArts suggested that Gatsby Cloud’s build process is better optimised for processing images, I thought I’d give that a go.
After trying out a build in Gatsby Cloud, I had no build problems at all and the whole site built with a clear cache in 7 minutes. OK, I thought, seems like the problem isn’t so much with Gatsby, but in how Netlify is interacting with v4’s worker process.
However, on the next push I ran into the problem once again, this time in Gatsby Cloud. The tail end of Gatsby Cloud’s logs is useful, because it gives me a little more information than Netlify:
17:38:38 PM: info Total nodes: 7987, SitePage nodes: 1695 (use --verbose for breakdown)
17:38:38 PM: success Checking for changed pages - 0.001s
17:38:38 PM: success onPreExtractQueries - 0.000s
17:38:38 PM: success Cleaning up stale page-data - 0.024s
17:38:38 PM: success createPages - 1.351s
17:38:40 PM: success extract queries from components - 1.596s
17:38:40 PM: success write out redirect data - 0.004s
17:38:40 PM: success onPostBootstrap - 0.046s
17:38:40 PM: success write out requires - 0.030s
17:38:40 PM: info bootstrap finished - 48.635s
17:39:15 PM: warning warn - You have enabled the JIT engine which is currently in preview.
17:39:15 PM: warning warn - Preview features are not covered by semver, may introduce breaking changes, and can change at any time.
17:39:15 PM: warning
17:39:22 PM: success Building production JavaScript and CSS bundles - 42.093s
17:39:24 PM: [webpack.cache.PackFileCacheStrategy] Serializing big strings (3319kiB) impacts deserialization performance (consider using Buffer instead and decode when needed)
17:39:24 PM: [webpack.cache.PackFileCacheStrategy] Serializing big strings (3319kiB) impacts deserialization performance (consider using Buffer instead and decode when needed)
17:39:24 PM: [webpack.cache.PackFileCacheStrategy] Serializing big strings (3319kiB) impacts deserialization performance (consider using Buffer instead and decode when needed)
17:39:59 PM: success Building Rendering Engines - 37.719s
17:40:13 PM: success Building HTML renderer - 13.051s
17:40:13 PM: success Execute page configs - 0.039s
17:40:15 PM: success Caching Webpack compilations - 0.001s
17:40:15 PM: success Validating Rendering Engines - 2.094s
17:40:39 PM: success run queries in workers - 23.276s - 1662/1662 71.40/s
17:45:38 PM: warning This is just diagnostic information (enabled by GATSBY_DIAGNOSTIC_STUCK_STATUS_TIMEOUT):
17:45:38 PM: - Activity "build" of type "hidden" is currently in state "IN_PROGRESS"
17:45:38 PM: Gatsby is in "IN_PROGRESS" state without any updates for 300.000 seconds. Activities preventing Gatsby from transitioning to idle state:
17:45:38 PM: Process will be terminated in 1500.000 seconds if nothing will change.
17:45:38 PM: - Activity "Running jobs v2" of type "hidden" is currently in state "IN_PROGRESS"
18:10:38 PM: ERROR Terminating the process (due to GATSBY_WATCHDOG_STUCK_STATUS_TIMEOUT):
18:10:38 PM: - Activity "build" of type "hidden" is currently in state "IN_PROGRESS"
18:10:38 PM: Gatsby is in "IN_PROGRESS" state without any updates for 1800.000 seconds. Activities preventing Gatsby from transitioning to idle state:
18:10:38 PM: - Activity "Running jobs v2" of type "hidden" is currently in state "IN_PROGRESS"
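For anyone comparing logs from several failed builds, the stuck activities can be pulled out with a small script. This is only a sketch: it assumes the message wording seen in the log above (other Gatsby versions may word it differently), and the function name findStuckActivities is made up for illustration.

```javascript
// Sketch: list the activities a Gatsby build log reports as stuck.
// Assumption: lines follow the wording shown in the log above.
const ACTIVITY_RE =
  /Activity "([^"]+)" of type "([^"]+)" is currently in state "([^"]+)"/;

// Hypothetical helper, not part of Gatsby.
function findStuckActivities(logText) {
  const seen = new Map();
  for (const line of logText.split(/\r?\n/)) {
    const match = ACTIVITY_RE.exec(line);
    if (!match) continue;
    const [, name, type, state] = match;
    // The diagnostic and watchdog messages repeat the same activities,
    // so keep a single entry per activity name.
    seen.set(name, { name, type, state });
  }
  return [...seen.values()];
}

// Usage with three lines copied from the log above.
const sample = [
  '- Activity "build" of type "hidden" is currently in state "IN_PROGRESS"',
  '- Activity "Running jobs v2" of type "hidden" is currently in state "IN_PROGRESS"',
  '- Activity "build" of type "hidden" is currently in state "IN_PROGRESS"',
].join("\n");

console.log(findStuckActivities(sample).map((a) => a.name));
```

For the log above this reports "build" and "Running jobs v2", which are the two activities the watchdog names before terminating the process.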
The fact that a full, uncached build on Gatsby Cloud can run in 7 minutes suggests to me that the issue isn’t actually one of scale, but that the worker process is hanging, and only sometimes.
Is it to do with incremental builds? Maybe. I am using the preserved download cache, because as I said my site has quite a few images which are coming from a custom source plugin (which is relatively simple, and contains all the image links from AWS that are passed over to createRemoteFileNode).
To test things out once I had the first timeout on Gatsby Cloud, I ran a manual deploy without clearing the cache. I was hoping the process would hang again so I’d know the issue was with the cache and incremental builds, but alas, it did not. The whole build completed in 6 minutes. Strangely, the issue appears to occur more often than not on Netlify, and only occasionally on Gatsby Cloud. It may be to do with build process resources, because I just signed up to Gatsby Cloud, and so am in the free preview of performance builds.
Are there other diagnostic tools I can use to more closely inspect the build process? How would I be able to see which process is failing or never finishing?
Reproduction Link
I can’t seem to reproduce this error reliably, as it is intermittent.
Steps to Reproduce
- Attempt to build site with gatsby build in either Netlify or Gatsby Cloud
- Sometimes, the build never finishes
Expected Result
gatsby build should eventually finish and build the site.
Actual Result
The state run queries in workers never finishes/moves on to merge worker state, and the build eventually times out and fails.
Environment
My local environment isn't really the issue; builds have failed in both Netlify and Gatsby Cloud with this problem.
However, this is my local env:
System:
OS: macOS Mojave 10.14.6
CPU: (4) x64 Intel(R) Core(TM) i7-4578U CPU @ 3.00GHz
Shell: 3.2.57 - /bin/bash
Binaries:
Node: 16.1.0 - /usr/local/bin/node
npm: 8.1.4 - /usr/local/bin/npm
Languages:
Python: 3.9.5 - /usr/local/opt/python/libexec/bin/python
Browsers:
Chrome: 95.0.4638.69
Firefox: 94.0.1
Safari: 14.1.2
npmPackages:
gatsby: ^4.1.6 => 4.2.0
gatsby-plugin-gdpr-cookies: ^2.0.8 => 2.0.8
gatsby-plugin-image: ^2.1.3 => 2.2.0
gatsby-plugin-loadable-components-ssr: ^4.1.0 => 4.1.0
gatsby-plugin-local-search: ^2.0.1 => 2.0.1
gatsby-plugin-netlify: ^4.0.0-next.0 => 4.0.0-next.0
gatsby-plugin-netlify-cms: ^6.1.0 => 6.2.0
gatsby-plugin-postcss: ^5.1.0 => 5.2.0
gatsby-plugin-react-helmet: ^5.1.0 => 5.2.0
gatsby-plugin-sharp: ^4.1.4 => 4.2.0
gatsby-remark-copy-linked-files: ^5.1.0 => 5.2.0
gatsby-remark-images: ^6.1.4 => 6.2.0
gatsby-remark-relative-images: ^2.0.2 => 2.0.2
gatsby-remark-responsive-iframe: ^5.1.0 => 5.2.0
gatsby-remark-smartypants: ^5.1.0 => 5.2.0
gatsby-source-filesystem: ^4.1.3 => 4.2.0
gatsby-transformer-remark: ^5.1.4 => 5.2.0
gatsby-transformer-sharp: ^4.1.0 => 4.2.0
npmGlobalPackages:
gatsby-cli: 4.2.0
gatsby: 3.5.0
Config Flags
PRESERVE_FILE_DOWNLOAD_CACHE: true
@LekoArts Together with other teammates working on the same project, I have come to the conclusion that we cannot even make a proper assessment of why this problem occurs. We are not able to tell whether it’s a resource problem, a GraphQL problem, or something to do with Gatsby internals.
I see that many people are struggling with this error, so there may be a common flaw in the new version. It should therefore be handled with high priority.
For me, this problem happened for a long time. I tried working directly with Gatsby Cloud on a solution, but they basically shrugged at the issue. I could get to the bottom of the problem: it happens in gatsby-plugin-sharp, when there are too many images to process. I tried all these variables, I tried changing the underlying code to throttle the image processing, etc., and couldn’t get to any point I was happy with. Then I decided to just stick with using Shopify’s CDN instead of processing the images, and never looked back. But I know that this is a bummer and not everyone can “disable” image processing.
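For readers unsure what throttling means here, the general idea is to cap how many jobs are in flight at once. The sketch below illustrates that idea only. It is not the gatsby-plugin-sharp code and not the change described in the comment above. The names runThrottled and fakeResize are hypothetical, and the "resize" job is a stand-in timer rather than real image processing.

```javascript
// Sketch: run async jobs with at most `limit` of them in flight at once.
// `jobs` is an array of functions that each return a promise.
async function runThrottled(jobs, limit) {
  const results = new Array(jobs.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed job index.
  // Node runs this on a single thread, so `next++` cannot race.
  async function worker() {
    while (next < jobs.length) {
      const index = next++;
      results[index] = await jobs[index]();
    }
  }

  const workerCount = Math.min(limit, jobs.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Usage: stand-in for image work (hypothetical, just waits 5 ms).
function fakeResize(name) {
  return () =>
    new Promise((resolve) => setTimeout(() => resolve(`${name} done`), 5));
}

runThrottled(["a.jpg", "b.jpg", "c.jpg"].map(fakeResize), 2).then((out) =>
  console.log(out)
);
```

A lower limit trades build time for lower peak memory and CPU use, which is the usual reason to throttle on a constrained build machine.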