Piping data but only on 200 OK
I want to download/upload large files with superagent. `.pipe()` conveniently streams data, but it inconveniently ignores all HTTP errors. I want to separate HTTP 404, 500, 403, etc. as errors to stderr instead of writing their error pages out as if they were my content.
I’m having trouble wrapping my head around why this isn’t allowed:
```javascript
// get.js
let request = require('superagent');
var req = request.get(process.argv[2])
req.buffer(false)
req.then(res => {
  console.log(res.status)
  res.pipe(process.stdout)
})
.catch(err => {
  console.log('error!')
  console.log(err.status, err.message)
})
```
```
$ node get.js https://google.com
(node:44322) [DEP0066] DeprecationWarning: OutgoingMessage.prototype._headers is deprecated
(Use `node --trace-deprecation ...` to show where the warning was created)
200
error!
undefined .end() was called twice. This is not supported in superagent
$ node get.js https://google.com/404
error!
404 Not Found
```
I get the exact same output if I pipe `req` instead of `res`, as suggested in https://github.com/visionmedia/superagent/issues/1188 and in the docs:
```diff
  req.buffer(false)
  req.then(res => {
    console.log(res.status, res.message)
-   res.pipe(process.stdout)
+   req.pipe(process.stdout)
  })
  .catch(err => {
    console.log('error!')
```
I understand that `.pipe()` and `.then()` are incompatible because `.then()` triggers the parsing system (https://github.com/visionmedia/superagent/issues/1187#issuecomment-281995106) – but https://github.com/visionmedia/superagent/issues/1187#issuecomment-281995580 makes it sound like `.buffer(false)` should sidestep the parsing system.
I am wondering how to download data in a stream without buffering it all while also looking at the headers.
There is this comment (https://github.com/visionmedia/superagent/issues/1187#issuecomment-281995580):

> It is in a weird place that only your parser sees. #950

but I’m having trouble understanding that. https://github.com/visionmedia/superagent/issues/950#issuecomment-282112032 offers

> I think buffered responses should be the default, and unbuffered responses should be only opt-in.

but that doesn’t explain to me how to get the body content.
I’m trying to achieve what I can do in python-requests: https://stackoverflow.com/questions/16694907/download-large-file-in-python-with-requests/16696317#16696317
```python
import requests

def download_file(url):
    local_filename = url.split('/')[-1]
    # NOTE the stream=True parameter below
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(local_filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=8192):
                # If you have a chunk-encoded response, uncomment the if
                # and set the chunk_size parameter to None.
                # if chunk:
                f.write(chunk)
    return local_filename
```
`r.raise_for_status()` checks `r.status_code` – which it has, because at that point the HTTP headers have been parsed – and raises an exception on 4xx and 5xx responses. But the rest of the data hasn’t been read off the socket at that point, so it won’t crash on e.g. 2GB files.
–
I’m putting this out there in case you, or any user of the library, has a better clue how to do this than I’ve found so far. I don’t expect you to go out of your way to solve my problem for me. Thanks for your work on superagent, and I hope you are somewhere with friends and family during the lockdowns.
Issue Analytics
- Created 3 years ago
- Comments: 5
Top GitHub Comments
Third time’s the charm: it turns out the last version works perfectly, except for `application/json`, because of this hardcoded exception just for JSON: https://github.com/visionmedia/superagent/blob/2fcea621c69e3cc779bc59f5f3e8677c2cce8f99/src/node/index.js#L1084-L1087

This circumvents that and is just as good:
Nevermind, my solution doesn’t work. It works on some files, but if `Content-Type: application/octet-stream` (or presumably one of the other types in https://github.com/visionmedia/superagent/blob/2fcea621c69e3cc779bc59f5f3e8677c2cce8f99/src/node/parsers/index.js#L1-L9), then my `source.on('data', ...)` is ignored and nothing comes out. I expected that setting `.buffer(false)` would overrule the mimetype detection, but it doesn’t; in fact, the mimetype detection overrules it at `parser = exports.parse[mime]`: https://github.com/visionmedia/superagent/blob/2fcea621c69e3cc779bc59f5f3e8677c2cce8f99/src/node/index.js#L1055-L1082

I tried to circumvent this by setting an explicit parser that did nothing, but that didn’t work and I don’t know why. The mimetype detection is strong and works even when I work against it.
Anyway, I figured out the reliable thing to do. As mentioned in https://github.com/visionmedia/superagent/issues/1161#issuecomment-361930999 and by @julien-f, you can replace `.then()` with `req.on('response')` to catch the response after the headers are parsed but before the body is processed (but this only works if you also set `.buffer(false)`), and in there you can use `.on('data')` as above. Unfortunately, this event fires before the automatic error-checking kicks in, but I dug that out and patched it in too: