index.json in S3 grows indefinitely and causes errors

Description

Publishing a 3 MB package saves the tarball to S3 as its own object, but the same tarball data is also embedded in index.json:

(src/put/publish.js)

    json['dist-tags'][tag] = version;
    json._attachments[`${name}-${version}.tgz`] = pkg._attachments[`${name}-${version}.tgz`];
    json.versions[version] = versionData;

    ...

    await storage.put(
      `${name}/${version}.tgz`,
      json._attachments[`${name}-${version}.tgz`].data, // eslint-disable-line no-underscore-dangle
      'base64',
    );

    await storage.put(
      `${name}/index.json`,
      JSON.stringify(json),
    );

If you publish 100 times, index.json will be roughly 300 MB, and reading it back will either fail or be grossly inefficient:

    const pkgBuffer = await storage.get(`${name}/index.json`);
    json = JSON.parse(pkgBuffer.toString());
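
As a rough check on that figure, here is a minimal sketch of the arithmetic. It assumes each tarball stays embedded in `_attachments` as base64 (the encoding passed to `storage.put` above) and ignores the rest of the document; the function name `estimateIndexBytes` is hypothetical and not part of the project.

```javascript
// Rough estimate of how large index.json gets when every published
// tarball remains embedded in _attachments as a base64 string.
// estimateIndexBytes is a hypothetical helper, not part of the project.
function estimateIndexBytes(tarballBytes, publishCount) {
  // base64 encodes every 3 bytes as 4 characters (with padding)
  const base64Chars = Math.ceil(tarballBytes / 3) * 4;
  return base64Chars * publishCount;
}

const MB = 1024 * 1024;
const estimate = estimateIndexBytes(3 * MB, 100);
console.log(`${(estimate / MB).toFixed(0)} MB`); // prints "400 MB"
```

With base64 overhead included, 100 publishes of a 3 MB tarball come to about 400 MB of attachment data, so the 300 MB estimate is, if anything, on the low side.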

Would the solution be to clear out json._attachments prior to saving index.json? And since I have already run into the issue, what do you recommend: can I simply delete the entire S3 bucket if I don't care about past releases?
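
The idea of clearing the attachments before saving could be sketched as below. This is only an illustration of the approach asked about, not necessarily the fix that was later merged; `stripAttachments` is a hypothetical helper, and the sample `pkg` object is made up for the example.

```javascript
// Return a shallow copy of the package document with the embedded
// tarball data dropped. The input object is left untouched so the
// tarball can still be written to its own S3 object beforehand.
// stripAttachments is a hypothetical helper, not part of the project.
function stripAttachments(json) {
  return { ...json, _attachments: {} };
}

// Made-up package document for illustration only.
const pkg = {
  name: 'demo',
  'dist-tags': { latest: '1.0.0' },
  versions: { '1.0.0': { name: 'demo', version: '1.0.0' } },
  _attachments: {
    'demo-1.0.0.tgz': { data: Buffer.from('tarball bytes').toString('base64') },
  },
};

const slim = stripAttachments(pkg);
console.log(Object.keys(slim._attachments).length); // prints 0
console.log(Object.keys(pkg._attachments).length);  // prints 1
```

In the publish flow above, the string passed to the second `storage.put` would then be `JSON.stringify(stripAttachments(json))`, assuming nothing else reads `_attachments` back out of index.json.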

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

jonsharratt commented, Aug 21, 2017 (1 reaction)

This existed because I had to piece together how npm works. It turns out they used CouchDB, which is why I copied over that functionality.

They hit the same perf issues and appear to have taken the attachments out of the package.json themselves.

Merged your proposed fix @ganapativs and tagged a release, give it a go and let me know how you get on.

Sorry for the delay on this; it's been a crazy past couple of months.

k-k commented, Jun 3, 2017 (1 reaction)

@jonsharratt I'm guessing the GET request removes the attachments, but does the PUT request return the patched index.json as its response without removing them?
