
Workaround AWS Lambda & S3: reading from ReadStream doesn't work, but Buffer does

See original GitHub issue

This is related to #337, but since the issue is readSync failing to correctly identify/read a Stream, I’ll put my problem and solution here:

In an AWS Lambda function that reads an Excel file from S3, you’d usually fetch the file via a ReadStream:

    var file = s3.getObject({Bucket: bucket, Key: key}).createReadStream();
    var workbook = XLSX.read(file);

This code throws the following:

TypeError: f.substr is not a function
at firstbyte (/var/task/node_modules/xlsx/xlsx.js:11362:41)
at Object.readSync [as read] (/var/task/node_modules/xlsx/xlsx.js:11388:14)

Line https://github.com/SheetJS/js-xlsx/blob/master/xlsx.js#L11359 is the culprit, because at this point f isn’t a String, so reading its first byte as if it were one doesn’t work. So, since reading directly from a ReadStream doesn’t open the file, collecting the stream’s contents into a Buffer first is enough:

    var file = s3.getObject({Bucket: bucket, Key: key}).createReadStream();
    var buffers = [];
    // Collect each chunk the stream emits...
    file.on('data', function(data) {
        buffers.push(data);
    });
    // ...then concatenate them into a single Buffer once the stream ends,
    // which XLSX.read can handle.
    file.on('end', function() {
        var buffer = Buffer.concat(buffers);
        var workbook = XLSX.read(buffer);
    });
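
For what it’s worth, the same buffering pattern can be wrapped in a Promise so the workbook is usable with async/await in a Lambda handler. This is just a sketch of the idea above (readWorkbookFromS3 is a hypothetical helper name, and it assumes the same AWS SDK v2 s3 client used throughout this thread):

    // Hypothetical helper: buffers the S3 object, then parses it with XLSX.read.
    function readWorkbookFromS3(s3, bucket, key) {
        return new Promise(function(resolve, reject) {
            var stream = s3.getObject({Bucket: bucket, Key: key}).createReadStream();
            var buffers = [];
            stream.on('data', function(data) { buffers.push(data); });
            stream.on('error', reject);
            stream.on('end', function() {
                // XLSX.read accepts the concatenated Buffer, unlike the raw stream
                resolve(XLSX.read(Buffer.concat(buffers)));
            });
        });
    }

Then, inside an async handler: var workbook = await readWorkbookFromS3(s3, bucket, key);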

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Reactions: 1
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

2 reactions
annjawn commented, Nov 10, 2018

That didn’t quite work for me, especially since I’m reading a .csv file with double-quoted fields and writing a resulting XLSX file to another bucket. What somewhat worked for me is below.

The CSV looks like:

"Some, value", "another value", "1.235","22","SomeMore"
"Some, value", "another value", "1.235","22","SomeMore"

    // Assumes `s3` is an initialized AWS.S3 client and that
    // ReadParams and destBucket are defined elsewhere.
    const readline = require('readline');
    const XLSX = require('xlsx');
    const aoa = [];

    const rl = readline.createInterface({
        input: s3.getObject(ReadParams).createReadStream()
    });
    rl.on('line', function(line) {
        console.log(line);
        // Strip the outer quotes and split on the "," between fields
        // (assumes every field is quoted, with no space after the comma)
        aoa.push(line.slice(1, -1).split("\",\""));
    })
    .on('close', function() {
        var ws = XLSX.utils.aoa_to_sheet(aoa);
        var wb = XLSX.utils.book_new();
        XLSX.utils.book_append_sheet(wb, ws, "Products");
        // XLSX.write with type 'buffer' returns a Buffer that can be
        // passed to putObject directly
        var buf = XLSX.write(wb, {type: 'buffer', bookType: "xlsx"});
        s3.putObject({
            Bucket: destBucket,
            Key: 'TestExcel1234.xlsx',
            Body: buf
        }, function(err, data) {
            if (err) {
                console.log("Cannot put object:", err);
            } else {
                console.log("Successfully uploaded data excel");
            }
        });
    });
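
As an aside, SheetJS can parse CSV input itself, so a possible alternative (a sketch, not tested against this exact file) is to buffer the whole object and let XLSX.read deal with the quoting instead of splitting lines by hand. It reuses the ReadParams and destBucket names from the snippet above:

    var stream = s3.getObject(ReadParams).createReadStream();
    var chunks = [];
    stream.on('data', function(data) { chunks.push(data); });
    stream.on('end', function() {
        // XLSX.read detects CSV input and handles standard quoting,
        // including commas inside quoted fields like "Some, value"
        var wb = XLSX.read(Buffer.concat(chunks), {type: 'buffer'});
        var buf = XLSX.write(wb, {type: 'buffer', bookType: 'xlsx'});
        s3.putObject({
            Bucket: destBucket,
            Key: 'TestExcel1234.xlsx',
            Body: buf
        }, function(err) {
            if (err) console.log("Cannot put object:", err);
        });
    });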
1 reaction
hqh07 commented, Feb 4, 2020

(Quoting annjawn’s comment above in full.)

Awesome! Thanks, sir.
