Build stuck at running jobs (image transformation)

If you’re coming new to this issue, please see this first: https://github.com/gatsbyjs/gatsby/issues/34051#issuecomment-979882343


Preliminary Checks

Description

Gatsby’s build process is hanging and not completing. I suspect the issue is with Sharp, as my site has quite a few images, and I saw this brought up in a previous issue, #33557.

When I upgraded to v4 I initially had no issues. However, the next day my builds all started exceeding Netlify’s maximum build time of 30 minutes.

I mentioned this problem in the thread on the other issue, since others apparently had the same problem, where run queries in workers seems to take longer than expected.

This issue is difficult to reproduce, in part, I think, because it is related to the scale of my site, which is moderately large and has ~1600 images. Something isn’t quite right in the worker process, because my builds on Netlify went from taking roughly 13 or 14 minutes to exceeding the build limit every time.

To try to diagnose the issue I ran a local build, which took a longish time but did actually complete.

Since @LekoArts suggested that Gatsby Cloud’s build process is better optimised for processing images, I thought I’d give that a go.

After trying out a build in Gatsby Cloud, I had no build problems at all and the whole site built with a clear cache in 7 minutes. OK, I thought, it seems the problem isn’t so much with Gatsby as with how Netlify interacts with v4’s worker process.

However, on the next push I ran into the problem once again, this time in Gatsby Cloud. The tail end of Gatsby Cloud’s logs is useful, because it gives me a little more information than Netlify:

17:38:38 PM:
info Total nodes: 7987, SitePage nodes: 1695 (use --verbose for breakdown)

17:38:38 PM:
success Checking for changed pages - 0.001s

17:38:38 PM:
success onPreExtractQueries - 0.000s

17:38:38 PM:
success Cleaning up stale page-data - 0.024s

17:38:38 PM:
success createPages - 1.351s

17:38:40 PM:
success extract queries from components - 1.596s

17:38:40 PM:
success write out redirect data - 0.004s

17:38:40 PM:
success onPostBootstrap - 0.046s

17:38:40 PM:
success write out requires - 0.030s

17:38:40 PM:
info bootstrap finished - 48.635s

17:39:15 PM:
warning warn - You have enabled the JIT engine which is currently in preview.

17:39:15 PM:
warning warn - Preview features are not covered by semver, may introduce breaking changes, and can change at any time.

17:39:15 PM:
warning ⠀

17:39:22 PM:
success Building production JavaScript and CSS bundles - 42.093s

17:39:24 PM:
 [webpack.cache.PackFileCacheStrategy] Serializing big strings (3319kiB) impacts deserialization performance (consider using Buffer instead and decode when needed)

17:39:24 PM:
 [webpack.cache.PackFileCacheStrategy] Serializing big strings (3319kiB) impacts deserialization performance (consider using Buffer instead and decode when needed)

17:39:24 PM:
 [webpack.cache.PackFileCacheStrategy] Serializing big strings (3319kiB) impacts deserialization performance (consider using Buffer instead and decode when needed)

17:39:59 PM:
success Building Rendering Engines - 37.719s

17:40:13 PM:
success Building HTML renderer - 13.051s

17:40:13 PM:
success Execute page configs - 0.039s

17:40:15 PM:
success Caching Webpack compilations - 0.001s

17:40:15 PM:
success Validating Rendering Engines - 2.094s

17:40:39 PM:
success run queries in workers - 23.276s - 1662/1662 71.40/s

17:45:38 PM:
warning This is just diagnostic information (enabled by GATSBY_DIAGNOSTIC_STUCK_STATUS_TIMEOUT):

17:45:38 PM:
- Activity "build" of type "hidden" is currently in state "IN_PROGRESS"

17:45:38 PM:
Gatsby is in "IN_PROGRESS" state without any updates for 300.000 seconds. Activities preventing Gatsby from transitioning to idle state:

17:45:38 PM:
Process will be terminated in 1500.000 seconds if nothing will change.

17:45:38 PM:
- Activity "Running jobs v2" of type "hidden" is currently in state "IN_PROGRESS"

18:10:38 PM:
ERROR Terminating the process (due to GATSBY_WATCHDOG_STUCK_STATUS_TIMEOUT):

18:10:38 PM:
- Activity "build" of type "hidden" is currently in state "IN_PROGRESS"

18:10:38 PM:
Gatsby is in "IN_PROGRESS" state without any updates for 1800.000 seconds. Activities preventing Gatsby from transitioning to idle state:

18:10:38 PM:
- Activity "Running jobs v2" of type "hidden" is currently in state "IN_PROGRESS"
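
One way to see where a log like the one above goes quiet is to measure the gap between consecutive timestamps. The following is a minimal sketch, not a Gatsby tool: the function names `parseEntries` and `largestGap` are hypothetical, and it assumes the layout shown above (a timestamp line followed by its message), 24-hour timestamps despite the "PM" suffix, and no rollover past midnight.

```javascript
// Collect { seconds, stamp, message } entries from a log in which each
// timestamp line (e.g. "17:40:39 PM:") is followed by its message line.
function parseEntries(logText) {
  const stampPattern = /^(\d{2}):(\d{2}):(\d{2}) (?:AM|PM):$/;
  const entries = [];
  let current = null;
  for (const rawLine of logText.split("\n")) {
    const line = rawLine.trim();
    const match = stampPattern.exec(line);
    if (match) {
      // Assumption: hours are already 24-hour values, as in the log above.
      const seconds =
        Number(match[1]) * 3600 + Number(match[2]) * 60 + Number(match[3]);
      current = { seconds, stamp: line.slice(0, -1), message: "" };
      entries.push(current);
    } else if (current && line !== "" && current.message === "") {
      // Keep only the first non-empty line after each timestamp.
      current.message = line;
    }
  }
  return entries;
}

// Return the pair of consecutive entries separated by the longest silence,
// or null when there are fewer than two entries.
function largestGap(entries) {
  let worst = null;
  for (let i = 1; i < entries.length; i++) {
    const gapSeconds = entries[i].seconds - entries[i - 1].seconds;
    if (worst === null || gapSeconds > worst.gapSeconds) {
      worst = { gapSeconds, lastBefore: entries[i - 1], firstAfter: entries[i] };
    }
  }
  return worst;
}

// A short excerpt of the log above, used as sample input.
const sample = [
  "17:40:15 PM:",
  "success Validating Rendering Engines - 2.094s",
  "",
  "17:40:39 PM:",
  "success run queries in workers - 23.276s - 1662/1662 71.40/s",
  "",
  "17:45:38 PM:",
  "warning This is just diagnostic information (enabled by GATSBY_DIAGNOSTIC_STUCK_STATUS_TIMEOUT):",
].join("\n");

const gap = largestGap(parseEntries(sample));
console.log(`${gap.gapSeconds}s of silence after: ${gap.lastBefore.message}`);
```

On this excerpt the longest silence starts right after run queries in workers reports success, which matches the diagnostic output naming "Running jobs v2" as the activity still in progress.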

The fact that a full, uncached build on Gatsby Cloud can run in 7 minutes suggests to me that the issue isn’t one of scale, but that the worker process is hanging, though only sometimes.

Is it to do with incremental builds? Maybe. I am using the preserved download cache because, as I said, my site has quite a few images, which come from a custom source plugin (it is relatively simple and contains all the image links from AWS that are passed over to createRemoteFileNode).

After the first timeout on Gatsby Cloud, I tested a manual deploy without clearing the cache. I was hoping the process would hang again so I’d know the issue was with the cache and incremental builds, but alas, it did not. The whole build completed in 6 minutes. Strangely, the issue occurs more often than not on Netlify, and only occasionally in Gatsby Cloud. It may be to do with build process resources, because I just signed up to Gatsby Cloud and so am in the free preview of performance builds.

Are there other diagnostic tools I can use to more closely inspect the build process? How would I be able to see which process is failing or never finishing?
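
As a starting point, the two environment variables named in the log output, GATSBY_DIAGNOSTIC_STUCK_STATUS_TIMEOUT and GATSBY_WATCHDOG_STUCK_STATUS_TIMEOUT, can be set for a build so that the stuck-activity report appears sooner. The sketch below is only an illustration under stated assumptions: the helper names `diagnosticEnv` and `runVerboseBuild` are hypothetical, the timeout values are arbitrary, and the unit is assumed to be milliseconds (the log prints 300.000 seconds, which is consistent with that, but it is not confirmed here).

```javascript
const { spawn } = require("child_process");

// Build an environment with both stuck-status timeouts set.
// Assumption: Gatsby reads these values as milliseconds.
function diagnosticEnv(baseEnv, { diagnosticMs, watchdogMs }) {
  return {
    ...baseEnv,
    GATSBY_DIAGNOSTIC_STUCK_STATUS_TIMEOUT: String(diagnosticMs),
    GATSBY_WATCHDOG_STUCK_STATUS_TIMEOUT: String(watchdogMs),
  };
}

// Start `gatsby build --verbose` with the given environment.
// `--verbose` is the flag the build log itself suggests for more detail.
function runVerboseBuild(env) {
  return spawn("npx", ["gatsby", "build", "--verbose"], {
    env,
    stdio: "inherit",
  });
}

// Example values (hypothetical): report after 1 minute of no updates,
// terminate after 10 minutes.
const env = diagnosticEnv(process.env, {
  diagnosticMs: 60000,
  watchdogMs: 600000,
});
console.log(env.GATSBY_DIAGNOSTIC_STUCK_STATUS_TIMEOUT);
```

Calling `runVerboseBuild(env)` from the site's root directory would then start the build. The report only names the activity that is still in progress, as in the log above, so it narrows the search rather than identifying the specific job that never finishes.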

Reproduction Link

I can’t seem to reproduce this error reliably, as it is intermittent.

Steps to Reproduce

  1. Attempt to build site with gatsby build in either Netlify or Gatsby Cloud
  2. Sometimes, the build never finishes

Expected Result

gatsby build should eventually finish and build the site

Actual Result

The run queries in workers state never finishes or moves on to merge worker state; the build eventually times out and fails.

Environment

My local environment isn't really the issue; builds have failed in both Netlify and Gatsby Cloud with this problem.

However, this is my local env:

  System:
    OS: macOS Mojave 10.14.6
    CPU: (4) x64 Intel(R) Core(TM) i7-4578U CPU @ 3.00GHz
    Shell: 3.2.57 - /bin/bash
  Binaries:
    Node: 16.1.0 - /usr/local/bin/node
    npm: 8.1.4 - /usr/local/bin/npm
  Languages:
    Python: 3.9.5 - /usr/local/opt/python/libexec/bin/python
  Browsers:
    Chrome: 95.0.4638.69
    Firefox: 94.0.1
    Safari: 14.1.2
  npmPackages:
    gatsby: ^4.1.6 => 4.2.0 
    gatsby-plugin-gdpr-cookies: ^2.0.8 => 2.0.8 
    gatsby-plugin-image: ^2.1.3 => 2.2.0 
    gatsby-plugin-loadable-components-ssr: ^4.1.0 => 4.1.0 
    gatsby-plugin-local-search: ^2.0.1 => 2.0.1 
    gatsby-plugin-netlify: ^4.0.0-next.0 => 4.0.0-next.0 
    gatsby-plugin-netlify-cms: ^6.1.0 => 6.2.0 
    gatsby-plugin-postcss: ^5.1.0 => 5.2.0 
    gatsby-plugin-react-helmet: ^5.1.0 => 5.2.0 
    gatsby-plugin-sharp: ^4.1.4 => 4.2.0 
    gatsby-remark-copy-linked-files: ^5.1.0 => 5.2.0 
    gatsby-remark-images: ^6.1.4 => 6.2.0 
    gatsby-remark-relative-images: ^2.0.2 => 2.0.2 
    gatsby-remark-responsive-iframe: ^5.1.0 => 5.2.0 
    gatsby-remark-smartypants: ^5.1.0 => 5.2.0 
    gatsby-source-filesystem: ^4.1.3 => 4.2.0 
    gatsby-transformer-remark: ^5.1.4 => 5.2.0 
    gatsby-transformer-sharp: ^4.1.0 => 4.2.0 
  npmGlobalPackages:
    gatsby-cli: 4.2.0
    gatsby: 3.5.0

Config Flags

PRESERVE_FILE_DOWNLOAD_CACHE: true

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Reactions: 14
  • Comments: 65 (24 by maintainers)

Top GitHub Comments

SebastianMera commented, Mar 4, 2022 (12 reactions)

@LekoArts I have come to the conclusion, together with other teammates working on the same project, that we cannot even properly assess why this problem occurs, and we cannot tell whether it’s a resource problem, a GraphQL problem, or something to do with Gatsby internals.

I see that many people are struggling with this error, so it may be a common drawback of the new version. It should therefore be handled with high priority.

buzinas commented, Mar 4, 2022 (4 reactions)

For me, this problem happened for a long time. I tried working directly with Gatsby Cloud on a solution, but they basically shrugged off the issue. I could get to the bottom of the problem: it happens in gatsby-plugin-sharp when there are too many images to process. I tried all these variables, I tried changing the underlying code to throttle the image processing, etc., and couldn’t get to any point I was happy with.

Then I decided to just stick with using Shopify’s CDN instead of processing the images and never looked back. But I know that this is a bummer and not everyone can “disable” image processing.
