Gatsby develop and build hang on source and transform nodes stage on large CSV file with extremely high ram usage

Preliminary Checks

Description

When running gatsby build or gatsby develop with the gatsby-transformer-csv plugin and a largish CSV file (~16,000 rows), the process hangs on the ‘source and transform nodes’ stage, seemingly indefinitely. The Node.js process also uses approximately 50 GB of my 64 GB of memory. I’ve tried this with a fresh Gatsby 4 starter and a newly created CSV file of the same length, to make sure the problem wasn’t specific to the one project or caused by a corrupted/invalid file.

I know a similar issue was mentioned here: https://github.com/gatsbyjs/gatsby/issues/11839#issue-411225434 but that one doesn’t mention high memory usage, so it may have a different cause.

Reproduction Link

https://github.com/DanielPBliss/GatsbySourceAndTransformIssue

Steps to Reproduce

  1. Clone linked repo
  2. Run ‘gatsby build’ or ‘gatsby develop’

Expected Result

The build completes and RAM usage stays within a normal range.

Actual Result

The build hangs on the ‘source and transform nodes’ stage, seemingly indefinitely, and RAM usage quickly climbs to about 50 GB of my 64 GB.
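
One way to put numbers on the memory growth is to sample `process.memoryUsage()` at a fixed interval from inside the Node.js process. The sketch below assumes it would be loaded from somewhere in the build (for example a project's gatsby-node.js); the helper names `memorySnapshot` and `startMemoryLog` are hypothetical and this is not how the figures in the report were obtained.

```javascript
// Hypothetical diagnostic helpers; they only use Node.js built-ins.

// Return the current resident set size and V8 heap figures, rounded to MiB.
function memorySnapshot() {
  const { rss, heapUsed, heapTotal } = process.memoryUsage();
  const toMiB = (bytes) => Math.round(bytes / 1024 / 1024);
  return {
    rssMiB: toMiB(rss),
    heapUsedMiB: toMiB(heapUsed),
    heapTotalMiB: toMiB(heapTotal),
  };
}

// Log a snapshot every `intervalMs` milliseconds; returns a function that stops logging.
function startMemoryLog(intervalMs = 5000, log = console.log) {
  const timer = setInterval(() => log(JSON.stringify(memorySnapshot())), intervalMs);
  timer.unref(); // do not keep the process alive just for logging
  return () => clearInterval(timer);
}
```

Calling `startMemoryLog()` early in the build and watching `rssMiB` over time would show whether memory keeps growing during the hanging stage.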

Environment

  System:
    OS: Windows 10 10.0.19042
    CPU: (8) x64 Intel(R) Core(TM) i7-7820HK CPU @ 2.90GHz
  Binaries:
    Node: 14.15.5 - C:\Program Files\nodejs\node.EXE
    Yarn: 1.22.5 - C:\Program Files (x86)\Yarn\bin\yarn.CMD
    npm: 6.14.11 - C:\Program Files\nodejs\npm.CMD
  Languages:
    Python: 3.8.3 - C:\Python38\python.EXE
  Browsers:
    Edge: Spartan (44.19041.1266.0), Chromium (95.0.1020.40)
  npmPackages:
    gatsby: ^4.1.0 => 4.1.0
    gatsby-plugin-gatsby-cloud: ^4.1.0 => 4.1.0
    gatsby-plugin-image: ^2.1.0 => 2.1.0
    gatsby-plugin-manifest: ^4.1.0 => 4.1.0
    gatsby-plugin-offline: ^5.1.0 => 5.1.0
    gatsby-plugin-react-helmet: ^5.1.0 => 5.1.0
    gatsby-plugin-sharp: ^4.1.0 => 4.1.0
    gatsby-source-filesystem: ^4.1.0 => 4.1.0
    gatsby-transformer-csv: ^4.1.0 => 4.1.0
    gatsby-transformer-sharp: ^4.1.0 => 4.1.0
  npmGlobalPackages:
    gatsby-cli: 3.14.2

Config Flags

No response

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 17 (6 by maintainers)

Top GitHub Comments

1 reaction
joernroeder commented, Feb 11, 2022

I might be able to create a pull request for this over the weekend. Just need the time to look into it.

1 reaction
witcradg commented, Jan 4, 2022

I’ve committed the change to my fork. I’m not sure I’m following the right procedure to hand it off, but you can find it here: https://github.com/witcradg/gatsby

@joernroeder If I haven’t done this as a proper PR, let me know and I’ll do it again.

Top Results From Across the Web

Resolving Out-of-Memory Issues
Gatsby build and develop are Node.js processes, which are allocated some ... Third, memory usage from the number of “nodes” stored as data...
Gatsby Changelog | 5.3.0
Fix high memory consumption when loading large CSV file, via PR #36610 ... To create a Slice, you must first call createSlice within...