Streaming CSV Parser
Is there a plan to support streaming CSV bulk queries, to lessen the impact on memory when the result set is more than a few hundred megabytes?
It looks like the current code downloads the entire CSV into memory, parses it, and then emits record events.
Issue Analytics
- State:
- Created 8 years ago
- Comments: 13 (7 by maintainers)
Top Results From Across the Web
CSV Parse - Stream API
The main module exported by this package implements the native Node.js transform stream. Transform streams implement both the Readable and Writable interfaces.
mafintosh/csv-parser - GitHub
Streaming CSV parser that aims for maximum speed as well as compatibility with the csv-spectrum CSV acid test suite. csv-parser can convert CSV...
Streaming and Parsing a CSV File - northCoder
Parsing a CSV file is a simple task, with great library support. This short example uses Commons CSV, together with Java streaming ...
how to use read and write stream of csv-parse - Stack Overflow
The following code uses streams to add a new column. The file I have used was about 500MB and the maximum utilized RAM...
csv stream - npm search
Fast and powerful CSV parser for the browser that supports web workers and streaming large files. Converts CSV to JSON and JSON to...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
There is definitely a plan, but it is not in a milestone yet.
Thanks, guys. I'm tackling this problem now, as I have a free week at work and it's an issue that has bugged me for a while (and something that more and more of our customers are starting to hit for a handful of object types). It doesn't have to be resolved right away, however, as we have been living with it for a while already.
But I can definitely see that calling stream() on the ResultStream falls into the CsvStreamConvertor.parse() function, which hits the problematic parsing code. For now I'll just stream it out to a CSV file on disk using the existing parser (oh well), then parse that file on my own. That will let me get something working and put most of the necessary code scaffolding in place. When 1.7 comes out, I can hopefully update to it, tweak a few things, and be all set to do it in a single pass (without ever loading the entire thing into memory).
I really appreciate all of the info.