Crash for parsing JSON in large CrossRef query (116,808)

I was trying to collect a large set of metadata, and getpapers seemed to have trouble parsing it. I had sufficient RAM remaining on my machine, so it might be a limitation of JSON parsing. I copied the query and error below. This is not crucial for me right now (I'll loop through the years instead of running one large query), but cutting into smaller queries might become problematic when a restricted query also yields a large number of results (e.g., a single-month query in 2015).

Query

getpapers --api crossref -o cr-res --filter "type:journal-article,prefix:10.1016,from-pub-date:1000,until-pub-date:1886"

Error

info: Searching using crossref API
info: Found 116808 results
info: Saving result metadata
/usr/local/lib/node_modules/getpapers/lib/crossref.js:119
  var pretty = JSON.stringify(crossref.allresults, null, 2)
                    ^

RangeError: Invalid string length
    at join (native)
    at Object.stringify (native)
    at CrossRef.handleSearchResults (/usr/local/lib/node_modules/getpapers/lib/crossref.js:119:21)
    at pageQuery (/usr/local/lib/node_modules/getpapers/lib/crossref.js:41:16)
    at /usr/local/lib/node_modules/getpapers/node_modules/crossref/index.js:92:5
    at Request._callback (/usr/local/lib/node_modules/getpapers/node_modules/crossref/index.js:31:5)
    at Request.self.callback (/usr/local/lib/node_modules/getpapers/node_modules/request/request.js:198:22)
    at emitTwo (events.js:106:13)
    at Request.emit (events.js:191:7)
    at Request.<anonymous> (/usr/local/lib/node_modules/getpapers/node_modules/request/request.js:1082:10)
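
The `RangeError: Invalid string length` above is what the JavaScript engine raises when a single string would exceed its maximum length, a limit that is separate from the amount of free RAM. A minimal sketch, assuming a Node.js version that exposes `buffer.constants.MAX_STRING_LENGTH`; the `tryBuildString` helper is hypothetical and not part of getpapers:

```javascript
const { constants } = require('buffer');

// Upper bound on the length of a single string in this Node.js build.
const limit = constants.MAX_STRING_LENGTH;

// Hypothetical helper: try to build a string of the requested length and
// return either its length or the error that was thrown.
function tryBuildString(length) {
  try {
    return 'a'.repeat(length).length;
  } catch (err) {
    return err;
  }
}

// One character past the limit is enough to trigger the error.
const result = tryBuildString(limit + 1);
console.log('max string length:', limit);
console.log('exceeding it throws RangeError:', result instanceof RangeError);
```

Pretty-printing 116,808 records with `JSON.stringify(..., null, 2)` has to produce the whole document as one string, so it can run into this limit regardless of how much memory the machine has.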

Issue Analytics

  • State: open
  • Created 7 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
tarrow commented, Sep 30, 2016

I’m just testing pushing up the memory limits with node --max-old-space-size=70000

My conclusion is that it doesn't fix the issue. This is interesting because the way I read the SO links suggests that this error is thrown when we hit the heap size limit. I bumped my heap size to 70 GB. The RSS and VSZ of the process rose to a large number, but not higher than 70 GB.

RSS    VSZ
1189760 2376404

Just under 2.4 GB.

It errors at the same point as you both found on the stringify call. I’m going to look into why this happens.
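
One possible way around the failing `stringify` call, sketched here as an illustration rather than as the maintainers' fix, is to serialize each result on its own and write the pieces to the file one after another, so that no single string has to hold the whole document. The function name `writeJsonArraySync`, the temporary file name, and the placeholder records are assumptions made for this example; the sketch also assumes every item is a plain JSON-serializable object.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write an array as pretty-printed JSON one element at a time.
// Only one element is ever held as a string, never the whole array.
function writeJsonArraySync(filePath, items) {
  const fd = fs.openSync(filePath, 'w');
  try {
    fs.writeSync(fd, '[\n');
    items.forEach((item, i) => {
      // Pretty-print the element, then indent each of its lines by two
      // spaces so the layout matches JSON.stringify(items, null, 2).
      const chunk = JSON.stringify(item, null, 2).replace(/^/gm, '  ');
      fs.writeSync(fd, (i > 0 ? ',\n' : '') + chunk);
    });
    fs.writeSync(fd, '\n]\n');
  } finally {
    fs.closeSync(fd);
  }
}

// Small demonstration with placeholder records (not real Crossref data).
const demoPath = path.join(os.tmpdir(), 'crossref-results-demo.json');
const demoItems = [{ DOI: 'placeholder-1' }, { DOI: 'placeholder-2' }];
writeJsonArraySync(demoPath, demoItems);
const written = fs.readFileSync(demoPath, 'utf8');
const roundTrip = JSON.parse(written);
console.log(roundTrip.length);
```

The read-back at the end is only there to check the small demo file; loading a very large result file into a single string would run into the same length limit.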

0 reactions
chartgerink commented, Sep 30, 2016

Thanks, batching is what I am doing (looping per year), but I wanted to check whether this was something that needed fixing. Apparently it is a larger-scale problem in Node (which makes sense given the size of the JSON files that are returned from Crossref).
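
The per-year batching mentioned above could be scripted roughly as follows. This is a sketch that only reuses the flags and filter names from the query in the issue; the `yearlyCommands` helper and the per-year output directory naming are hypothetical, and it assumes that a year-only `from-pub-date`/`until-pub-date` pair covers that whole year.

```javascript
// Hypothetical helper: build one getpapers command per publication year,
// reusing the filter from the original query.
function yearlyCommands(fromYear, untilYear, outputPrefix) {
  const commands = [];
  for (let year = fromYear; year <= untilYear; year++) {
    const filter = [
      'type:journal-article',
      'prefix:10.1016',
      `from-pub-date:${year}`,
      `until-pub-date:${year}`,
    ].join(',');
    // Each year gets its own output directory (an assumption of this sketch).
    commands.push(
      `getpapers --api crossref -o ${outputPrefix}-${year} --filter "${filter}"`
    );
  }
  return commands;
}

// Example: the last three years of the range used in the issue.
const commands = yearlyCommands(1884, 1886, 'cr-res');
console.log(commands.join('\n'));
```

The printed commands can then be run one at a time, which keeps each result set small, although a single busy year could still be large, as noted in the issue.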
