Crash when parsing JSON from a large CrossRef query (116,808 results)
I was trying to collect a large set of metadata and getpapers
seemed to have trouble parsing it. I had sufficient RAM remaining on my machine, so it might be a limitation of JSON parsing. I copied the query and error below. This is not crucial for me right now (I'll loop through the years instead of using one large query), but cutting queries into smaller pieces might become problematic when even a restricted query yields a large number of results (e.g., a single-month query in 2015).
Query
getpapers --api crossref -o cr-res --filter "type:journal-article,prefix:10.1016,from-pub-date:1000,until-pub-date:1886"
Error
info: Searching using crossref API
info: Found 116808 results
info: Saving result metadata
/usr/local/lib/node_modules/getpapers/lib/crossref.js:119
var pretty = JSON.stringify(crossref.allresults, null, 2)
^
RangeError: Invalid string length
at join (native)
at Object.stringify (native)
at CrossRef.handleSearchResults (/usr/local/lib/node_modules/getpapers/lib/crossref.js:119:21)
at pageQuery (/usr/local/lib/node_modules/getpapers/lib/crossref.js:41:16)
at /usr/local/lib/node_modules/getpapers/node_modules/crossref/index.js:92:5
at Request._callback (/usr/local/lib/node_modules/getpapers/node_modules/crossref/index.js:31:5)
at Request.self.callback (/usr/local/lib/node_modules/getpapers/node_modules/request/request.js:198:22)
at emitTwo (events.js:106:13)
at Request.emit (events.js:191:7)
at Request.<anonymous> (/usr/local/lib/node_modules/getpapers/node_modules/request/request.js:1082:10)
Issue Analytics
- State:
- Created 7 years ago
- Comments:6 (6 by maintainers)
Top GitHub Comments
I’m just testing pushing up the memory limit with
node --max-old-space-size=70000
My conclusion is that it doesn’t fix the issue. This is interesting, because the way I read the SO links suggests this error is thrown because we hit the heap size limit. I bumped my heap size to 70 GB. The RSS and VSZ of the process rose to a large number, but not higher than 70 GB.
Just under 24 GB.
It errors at the same point you both found: on the stringify call. I’m going to look into why this happens.
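This behavior is consistent with V8 capping the length of any individual string (on the order of 2^29 characters on 64-bit builds; the exact cap varies by V8 version) independently of the heap size set via `--max-old-space-size`. A quick way to see the same `RangeError` without any large allocation:

```javascript
// V8 limits individual string length regardless of heap size, which
// would explain why raising --max-old-space-size to 70 GB did not help.
// 2**30 characters is past the cap on any current 64-bit build.
try {
  'x'.repeat(2 ** 30);
} catch (e) {
  console.log(e.name + ': ' + e.message); // same RangeError family as the crash
}
```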
Thanks, batching is what I am doing (looping per year), but I wanted to check whether this was something that needed fixing. Apparently it is a larger upstream problem in Node (which makes sense given the size of the JSON files returned from Crossref).
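The per-year batching workaround can be sketched as generating one getpapers command per year instead of one huge date-range query. This is a dry-run illustration only: the commands are printed, not executed, and the year range and output directory names are illustrative, not from the original report:

```javascript
// Sketch of the per-year batching workaround: build one getpapers
// command per year. Dry run; nothing is executed. The 1880-1886
// range and cr-res-<year> output dirs are illustrative.
function perYearCommands(fromYear, untilYear) {
  const cmds = [];
  for (let year = fromYear; year <= untilYear; year++) {
    cmds.push(
      `getpapers --api crossref -o cr-res-${year} ` +
      `--filter "type:journal-article,prefix:10.1016,` +
      `from-pub-date:${year},until-pub-date:${year}"`
    );
  }
  return cmds;
}

// Print the commands for a small illustrative range:
perYearCommands(1880, 1886).forEach(cmd => console.log(cmd));
```

Each yearly result set stays small enough that `JSON.stringify` never has to produce a string near V8's limit.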