
Textile Buckets: Issue with Stream Upload for larger file sizes, even when HighWaterMarked.


Introduction

I’d like to use Readable Streams (or any stream) to send data to the Textile pushPath method.

  • I don’t want to write a temporary local file that I then have to clean up.
  • Streaming directly helps me avoid bad failure states.

There are three strategies in this issue; the first two are designed to work with Amazon S3, so I would expect them to work with Textile’s Buckets too.
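
For reference, the basic shape I’m aiming for looks something like this (a minimal sketch, not Slate’s actual code; the credentials and bucket name are placeholders, and method names follow the @textile/hub docs):

import { Buckets } from "@textile/hub";
import { PassThrough } from "stream";

// NOTE: placeholder credentials; Slate actually derives these from the user token.
const keyInfo = { key: "<api-key>", secret: "<api-secret>" };

const upload = async () => {
  const buckets = await Buckets.withKeyInfo(keyInfo);
  const { root } = await buckets.getOrCreate("my-bucket");

  // Any Readable can be handed to pushPath directly; nothing touches disk.
  const stream = new PassThrough();
  stream.end(Buffer.from("hello, world"));

  await buckets.pushPath(root.key, "hello.txt", stream);
};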

Current implementations

Stream implementation (broken for large files over 5 MB): https://github.com/filecoin-project/slate/blob/main/node_common/upload.js

fs.createReadStream implementation (in use now / works): https://github.com/filecoin-project/slate/blob/main/node_common/upload-fs.js

First stream implementation attempt.

This is the first attempt at getting streams to work, using the PassThrough stream constructor.


import * as LibraryManager from "~/node_common/managers/library";
import * as Utilities from "~/node_common/utilities";

import FORM from "formidable";

import { PassThrough } from "stream";

export const formMultipart = (req, res, { user }) =>
  new Promise(async (resolve, reject) => {
    // NOTE: async executor; if the awaited call below throws, the outer promise never settles.
    const f = new FORM.IncomingForm();
    const p = new PassThrough({ highWaterMark: 1024 * 1024 * 3 });
    const file = {};

    const { buckets, bucketKey } = await Utilities.getBucketAPIFromUserToken(user.data.tokens.api);

    f.keepExtensions = true;

    f.onPart = (part) => {
      if (!part.filename) {
        f.handlePart(part); // NOTE: was `form.handlePart`, but the form instance here is `f`.
        return;
      }

      file.name = part.filename;
      file.type = part.mime;

      part.on("data", function (buffer) {
        p.write(buffer);
      });

      part.on("end", function (data) {
        p.end();
      });
    };

    f.on("progress", (bytesReceived, bytesExpected) => {
      // console.log({ bytesReceived, bytesExpected });
    });

    f.parse(req, async (e) => {
      if (e) {
        return reject({});
      }

      // NOTE: `file` is always defined here; the meaningful check is the name.
      if (!file.name) {
        return reject({});
      }

      // NOTE(jim): Creates a Slate compatible Data object.
      const data = LibraryManager.createLocalDataIncomplete(file);

      let push;
      try {
        push = await buckets.pushPath(bucketKey, data.name, p);
      } catch (e) {
        return reject({});
      }

      return resolve({});
    });
  });
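
One thing worth noting in the code above: the "data" handler ignores the return value of p.write(buffer), so the highWaterMark never actually throttles anything and formidable keeps pushing regardless. A sketch of what honoring backpressure might look like (assuming formidable v1’s form.pause() / form.resume(); I haven’t verified this end to end):

    f.onPart = (part) => {
      if (!part.filename) {
        f.handlePart(part);
        return;
      }

      file.name = part.filename;
      file.type = part.mime;

      part.on("data", (buffer) => {
        // write() returns false once buffered bytes exceed highWaterMark.
        if (!p.write(buffer)) {
          f.pause(); // stop reading from the request...
          p.once("drain", () => f.resume()); // ...until pushPath catches up.
        }
      });

      part.on("end", () => p.end());
    };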

Results

  • When I upload 5 MB files, everything works as expected.
  • When I upload 25 MB files, everything works as expected.
  • When I upload 70 MB+ files, the stream (PassThrough) is constructed as expected, but I get the following error from Textile:
{
  decorator: 'SERVER_BUCKETS_VERIFY_ISSUE',
  error: true,
  message: Error: Auth expired. Consider calling withKeyInfo or withAPISig to refresh.
      at Object.<anonymous> (/Users/whiteharbor/Development/slate/node_modules/@textile/security/src/index.ts:146:32)
      at Module._compile (internal/modules/cjs/loader.js:1200:30)
      at Module._compile (/Users/whiteharbor/Development/slate/node_modules/pirates/lib/index.js:99:24)
      at Module._extensions..js (internal/modules/cjs/loader.js:1220:10)
      at Object.newLoader [as .js] (/Users/whiteharbor/Development/slate/node_modules/pirates/lib/index.js:104:7)
      at Module.load (internal/modules/cjs/loader.js:1049:32)
      at Function.Module._load (internal/modules/cjs/loader.js:937:14)
      at Module.require (internal/modules/cjs/loader.js:1089:19)
      at require (internal/modules/cjs/helpers.js:73:18)
      at Object.<anonymous> (/Users/whiteharbor/Development/slate/node_modules/@textile/context/src/index.ts:2:1)
}
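
The message itself points at a likely culprit: a 70 MB+ push takes long enough that the API signature minted when the upload starts expires mid-transfer. If getBucketAPIFromUserToken signs with a short default expiry, minting a longer-lived signature might help (a sketch using createAPISig from @textile/security, run inside an async context; keyInfo is a placeholder):

import { createAPISig } from "@textile/security";

// Hypothetical fix: mint a signature that outlives a slow multi-minute upload.
const expiration = new Date(Date.now() + 1000 * 60 * 60); // one hour from now
const sig = await createAPISig(keyInfo.secret, expiration);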

If you’re curious about the stream itself (whether the Node constructor actually gave _writableState and _readableState the correct highWaterMark value), I have checked, and it looks correct:

_writableState: WritableState {
    objectMode: false,
    highWaterMark: 3145728,
    finalCalled: false,
    needDrain: false,
    ending: true,
    ended: true,
    finished: false,
    destroyed: false,
    decodeStrings: true,
    defaultEncoding: 'utf8',
    length: 791699,
    writing: true,
    corked: 0,
    sync: false,
    bufferProcessing: false,
}
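
(Side note: since Node 9.3 the same values are exposed publicly, so there’s no need to poke at private state:)

console.log(p.writableHighWaterMark); // 3145728
console.log(p.readableHighWaterMark); // 3145728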

Thank you Textile!

