File writestream errors with 408 when consuming slow readstream
I am streaming data out of a slow API and into GCS. This API is paginated, and it takes 5-20 minutes to get a response per page.
The first page of results gets streamed into GCS without a problem, no matter how long the response takes. But I can never get the second page to stream through, because a 408 error is returned.
I understand that maybe keeping the write stream alive for 5-20 minutes between batches is a bit much… But still, I’d expect this lib to re-open the upload on 408. Setting a timeout config doesn’t seem to help.
I’d love some ideas on how I can work around this issue, ideally without writing anything to disk.
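One possible workaround that avoids both the long-lived request and the local disk is to upload each page as its own object as soon as it arrives, then stitch the parts together server-side. The sketch below assumes `file.save()` and `bucket.combine()` from `@google-cloud/storage` behave as documented. `FileLike`, `BucketLike`, `uploadPagesThenCombine` and the `.part-N` naming are hypothetical names introduced for illustration. A compose request accepts at most 32 source objects, so longer runs would need to combine in several rounds, which this sketch does not do.

```typescript
// Minimal structural types covering only the calls this sketch makes.
// Assumption: a real Bucket/File from @google-cloud/storage fits these shapes.
interface FileLike {
  save(data: string): Promise<void>;
  delete(): Promise<unknown>;
}

interface BucketLike {
  file(name: string): FileLike;
  combine(sources: FileLike[], destination: FileLike): Promise<unknown>;
}

// Uploads every page as a separate short-lived request, then composes the
// parts into `destName` and deletes the temporary part objects.
// Resolves with the number of parts that were uploaded.
async function uploadPagesThenCombine(
  bucket: BucketLike,
  destName: string,
  pages: AsyncIterable<string>,
): Promise<number> {
  const parts: FileLike[] = [];
  for await (const page of pages) {
    // Hypothetical naming scheme for the temporary objects.
    const part = bucket.file(`${destName}.part-${parts.length}`);
    // One request per page, so no connection sits idle between pages.
    await part.save(page);
    parts.push(part);
  }
  if (parts.length === 0) return 0;
  await bucket.combine(parts, bucket.file(destName));
  await Promise.all(parts.map((part) => part.delete()));
  return parts.length;
}
```

With the real client this should be callable as `uploadPagesThenCombine(storage.bucket(BUCKET), FILE, getDataFromSlowAPI())`, using the names from the reproduction code. The trade-off is a few temporary objects in the bucket until the combine step finishes.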
Environment details
- OS: MacOS 11.16
- Node.js version: 12.16.1
- npm version: 6.14.11
- @google-cloud/storage version: 5.14.3
Steps to reproduce
- Create a write stream with { resumable: false }
- Create a read stream that pushes some data quickly, and then waits for 10-20 minutes, then pushes more data.
- Get slapped with a 408
Code to reproduce
import * as stream from 'stream';
import { Storage } from '@google-cloud/storage';

// Put your bucket here...
const BUCKET = 'PUT_BUCKET_HERE';
const FILE = 'tmp.txt';

// we set this to false whenever we hit an error,
// so that we don't wait for no reason...
let shouldPause = true;

async function pause(minutes = 20) {
  console.log('Pausing for', minutes, 'minutes...');
  while (minutes > 0 && shouldPause) {
    console.log('Remaining:', minutes);
    await new Promise(resolve => setTimeout(resolve, 1000 * 60));
    minutes = minutes - 1;
  }
}

async function* getDataFromSlowAPI() {
  for (const page of [0, 1, 2]) {
    console.log('--> Getting page:', page);
    if (page > 0) await pause();
    console.log('--> Received API response for page:', page);
    yield page.toString();
  }
}

function main() {
  const storage = new Storage();
  const readstream = stream.Readable.from(getDataFromSlowAPI());
  const file = storage.bucket(BUCKET).file(FILE);
  const writestream = file.createWriteStream({ resumable: false });
  readstream
    .pipe(writestream)
    .on('error', function (err) {
      // 408 after a while...
      console.error('stream error:', err);
      shouldPause = false;
    })
    .on('finish', function () {
      console.log('stream finished');
    });
}

main();
The 408 error usually happens after about 6 minutes:
ApiError: Multiple errors occurred during the request. Please see the `errors` array for complete details.
1. Request Timeout
2. <!DOCTYPE html>
<html lang=en>
<meta charset=utf-8>
<meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
<title>Error 408 (Request Timeout)!!1</title>
<style>
*{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}
</style>
<a href=//www.google.com/><span id=logo aria-label=Google></span></a>
<p><b>408.</b> <ins>That’s an error.</ins>
<p>Your client has taken too long to issue its request. <ins>That’s all we know.</ins>
at new ApiError (.../@google-cloud/storage/node_modules/@google-cloud/common/build/src/util.js:73:15)
at Util.parseHttpRespMessage (.../@google-cloud/storage/node_modules/@google-cloud/common/build/src/util.js:175:41)
at Util.handleResp (.../@google-cloud/storage/node_modules/@google-cloud/common/build/src/util.js:149:76)
at .../@google-cloud/storage/node_modules/@google-cloud/common/build/src/util.js:477:22
at onResponse (.../@google-cloud/storage/node_modules/retry-request/index.js:228:7)
at .../@google-cloud/storage/node_modules/teeny-request/src/index.ts:244:13
at processTicksAndRejections (internal/process/task_queues.js:97:5) {
code: 408,
errors: [],
response: PassThrough {
_readableState: ReadableState {
objectMode: false,
highWaterMark: 16384,
buffer: BufferList { head: null, tail: null, length: 0 },
length: 0,
pipes: null,
pipesCount: 0,
flowing: true,
ended: true,
endEmitted: true,
reading: false,
sync: false,
needReadable: false,
emittedReadable: false,
readableListening: false,
resumeScheduled: false,
emitClose: true,
autoDestroy: false,
destroyed: false,
defaultEncoding: 'utf8',
awaitDrainWriters: null,
multiAwaitDrain: false,
readingMore: false,
decoder: null,
encoding: null,
[Symbol(kPaused)]: false
},
readable: false,
_events: [Object: null prototype] {
prefinish: [Function: prefinish],
error: [Array],
data: [Function],
end: [Function]
},
_eventsCount: 4,
_maxListeners: undefined,
_writableState: WritableState {
objectMode: false,
highWaterMark: 16384,
finalCalled: false,
needDrain: false,
ending: true,
ended: true,
finished: true,
destroyed: false,
decodeStrings: true,
defaultEncoding: 'utf8',
length: 0,
writing: false,
corked: 0,
sync: false,
bufferProcessing: false,
onwrite: [Function: bound onwrite],
writecb: null,
writelen: 0,
afterWriteTickInfo: null,
bufferedRequest: null,
lastBufferedRequest: null,
pendingcb: 0,
prefinished: true,
errorEmitted: false,
emitClose: true,
autoDestroy: false,
bufferedRequestCount: 0,
corkedRequestsFree: [Object]
},
writable: false,
allowHalfOpen: true,
_transformState: {
afterTransform: [Function: bound afterTransform],
needTransform: false,
transforming: false,
writecb: null,
writechunk: null,
writeencoding: 'buffer'
},
statusCode: 408,
statusMessage: 'Request Timeout',
request: {
agent: [Agent],
headers: [Object],
href: 'https://storage.googleapis.com/upload/storage/v1/b/[redacted]/o?uploadType=multipart&name=tmp.txt'
},
body: '<!DOCTYPE html>\n' +
'<html lang=en>\n' +
' <meta charset=utf-8>\n' +
' <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">\n' +
' <title>Error 408 (Request Timeout)!!1</title>\n' +
' <style>\n' +
' *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}\n' +
' </style>\n' +
' <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\n' +
' <p><b>408.</b> <ins>That’s an error.</ins>\n' +
' <p>Your client has taken too long to issue its request. <ins>That’s all we know.</ins>\n',
headers: {
'alt-svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"',
connection: 'close',
'content-length': '1557',
'content-type': 'text/html; charset=UTF-8',
date: 'Mon, 27 Sep 2021 23:46:52 GMT',
'referrer-policy': 'no-referrer'
},
toJSON: [Function: toJSON],
[Symbol(kCapture)]: false
}
}
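The `code: 408` field on the error above is enough to tell a timeout apart from other failures, so a caller that uploads in small, repeatable units could retry only those. This is a minimal sketch, assuming the rejected error carries a numeric `code` as shown in the dump. `retryOn408`, `maxAttempts` and `baseDelayMs` are hypothetical names, not part of `@google-cloud/storage`. It only helps for operations that can be re-run from the start, which a half-consumed stream cannot.

```typescript
// Retries `operation` when it rejects with an error whose `code` is 408,
// waiting a little longer before each new attempt.
// Any other error, or a 408 on the last attempt, is rethrown unchanged.
async function retryOn408<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      const code = (err as { code?: unknown } | null)?.code;
      if (code !== 408 || attempt >= maxAttempts) throw err;
      // Linear backoff: baseDelayMs, then 2 * baseDelayMs, and so on.
      await new Promise<void>((resolve) => setTimeout(resolve, baseDelayMs * attempt));
    }
  }
}
```

A buffered upload such as `file.save(...)` could be wrapped this way, for example `retryOn408(() => file.save(pageData))`, where `pageData` stands for one page already held in memory.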
Thanks for the information @surjikal. I am going to check around with some other people on the team and see if they have encountered anything similar in the libraries for other languages.
Very cool, will check it out!