Ever increasing json file when creating resumable upload urls

Environment details

  • OS: Linux
  • Node.js version: 10
  • npm version:
  • @google-cloud/storage version: 5.8.5

Steps to reproduce

  1. Create a resumable upload URL by calling createResumableUpload()
  2. Look at the file .config/configstore/gcs-resumable-upload.json

When creating a resumable upload URL, the package appears to write it to a JSON file that keeps growing and is never cleaned up. This makes calls to createResumableUpload slower over time, because that JSON file is parsed on each call. Another side effect is a high volume of disk writes.

I'm not sure whether this is an issue with this package or with gcs-resumable-upload, since this library uses a method from there. The file seems to be the one passed as configPath.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 3
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

4 reactions
ruidfigueiredo commented, Jun 17, 2021

Hi @shaffeeullah

I’d argue that "feature request" (described as a nice-to-have improvement) is not an appropriate classification for this issue at all.

After generating a few thousand resumable upload URLs, the generation process starts taking seconds to complete and the amount of disk I/O becomes very high. In one month we saw 28 TB of writes because of this issue, and generating a single URL was taking more than 3 seconds.

3 reactions
ruidfigueiredo commented, Jun 11, 2021

Hi @shaffeeullah

To reproduce, first create a storage object, bucket and file. On the file, call the createResumableUpload method a couple of times, then check the ~/.config/configstore/gcs-resumable-upload.json file; you’ll find all the generated resumable upload URLs there.

This file grows without bound, and once it reaches a certain size it makes resumable URL generation take an unreasonable amount of time and resources (it has to be read, parsed and written back to disk every time createResumableUpload is called).

Here’s a full sample:

const { Storage } = require('@google-cloud/storage')

const storage = new Storage({
    projectId: 'the gcloud project id',
    credentials: {
        client_email: "service account email or some other method of providing credentials",
        private_key: "the key",
    }
})

const bucket = storage.bucket('bucket name');

async function getResumableUploadUrl() {
    const res = await bucket
        .file(`the file name ${Math.random()}.wav`)
        .createResumableUpload({
            metadata: {
                contentType: "audio/wav",
            },
            origin: 'https://somedomain.com/'
        });

    return res[0];
}

async function main() {
    console.log(await getResumableUploadUrl())
    console.log(await getResumableUploadUrl())
    console.log(await getResumableUploadUrl())
}

main().then(() => console.log('done'), console.error);

After running this, check the file ~/.config/configstore/gcs-resumable-upload.json in the current user’s .config folder. You’ll find the generated URLs there (along with some extra information).
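
To put a number on the growth, the file can be inspected with a few lines of Node.js. This is only a rough sketch: inspectConfigStore is a hypothetical helper (not part of @google-cloud/storage or gcs-resumable-upload), the default path is the one reported above, and the count is of top-level JSON keys, which may not match the number of stored URLs one-to-one.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Location reported in this issue; pass another path to override it.
const DEFAULT_CONFIG_PATH = path.join(
    os.homedir(), '.config', 'configstore', 'gcs-resumable-upload.json');

// Hypothetical helper: returns the file size in bytes and the number of
// top-level keys in the JSON document, or null if the file does not exist.
function inspectConfigStore(configPath = DEFAULT_CONFIG_PATH) {
    if (!fs.existsSync(configPath)) {
        return null;
    }
    const raw = fs.readFileSync(configPath, 'utf8');
    const topLevelKeys = Object.keys(JSON.parse(raw)).length;
    return { bytes: Buffer.byteLength(raw, 'utf8'), topLevelKeys };
}

console.log(inspectConfigStore());
```

Running it before and after the sample above should show both numbers going up, if the behaviour described in this issue applies to your version.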

Extra details:

The implementation of createResumableUpload uses a dependency (another npm package from Google) called gcs-resumable-upload. Here you can confirm that resumableUpload is imported with import * as resumableUpload from 'gcs-resumable-upload'.

The important thing to note here is that createResumableUpload simply calls gcs-resumable-upload’s createURI.

When this function is called an instance of an Upload object gets created: https://github.com/googleapis/gcs-resumable-upload/blob/master/src/index.ts#L689

On that object’s constructor a ConfigStore is created using the packageName gcs-resumable-upload: https://github.com/googleapis/gcs-resumable-upload/blob/master/src/index.ts#L295

On actually creating the resumable upload url it is saved to this config store: https://github.com/googleapis/gcs-resumable-upload/blob/master/src/index.ts#L377

And in this scenario it is never removed (the Upload class from gcs-resumable-upload seems to have intended scenarios other than just generating the resumable URI, and in those it seems that the URI gets cleaned up [I didn’t test this]).

If you look at ConfigStore’s implementation for set you’ll see it will become extremely inefficient when the config file gets large:

https://github.com/yeoman/configstore/blob/main/index.js#L32

https://github.com/yeoman/configstore/blob/main/index.js#L79
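
The cost pattern described above can be reproduced in isolation. The sketch below is not configstore's code: NaiveJsonStore and simulate are hypothetical stand-ins that assume a store which reads, parses and rewrites the whole JSON file on every set, as described in this comment. It shows that the total bytes written grow much faster than the file itself.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical stand-in for a JSON-file-backed store: every set() reads the
// whole file, parses it, adds one key, and writes the whole file back.
class NaiveJsonStore {
    constructor(filePath) {
        this.filePath = filePath;
        this.bytesWritten = 0;
    }

    readAll() {
        try {
            return JSON.parse(fs.readFileSync(this.filePath, 'utf8'));
        } catch (err) {
            if (err.code === 'ENOENT') return {}; // first write: no file yet
            throw err;
        }
    }

    set(key, value) {
        const all = this.readAll();
        all[key] = value;
        const serialized = JSON.stringify(all);
        fs.writeFileSync(this.filePath, serialized);
        this.bytesWritten += Buffer.byteLength(serialized);
    }
}

// Stores `count` made-up URLs in a temp file and reports the final file size
// and the total number of bytes written along the way.
function simulate(count) {
    const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'naive-store-'));
    const store = new NaiveJsonStore(path.join(dir, 'store.json'));
    for (let i = 0; i < count; i++) {
        store.set(`file-${i}`, { uri: `https://example.invalid/upload/${i}` });
    }
    const finalSize = fs.statSync(store.filePath).size;
    fs.rmSync(dir, { recursive: true, force: true });
    return { finalSize, bytesWritten: store.bytesWritten };
}

const { finalSize, bytesWritten } = simulate(500);
console.log(`final file: ${finalSize} bytes, total written: ${bytesWritten} bytes`);
```

Because each call rewrites everything stored so far, the total written grows roughly with the square of the number of entries, which is consistent with the very high write volume reported earlier in this thread.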
