Piping data but only on 200 OK

I want to download/upload large files with superagent. .pipe() conveniently streams data, but it inconveniently ignores all HTTP errors. I want to separate HTTP 404, 500, 403, etc as errors to stderr instead of writing their error pages out as if they were my content.

I’m having trouble wrapping my head around why this isn’t allowed:

// get.js
let request = require('superagent');

var req = request.get(process.argv[2])
req.buffer(false)
req.then(res => {
   console.log(res.status)
   res.pipe(process.stdout)
 })
 .catch(err => {
   console.log('error!')
   console.log(err.status, err.message)
 })

$ node get.js https://google.com
(node:44322) [DEP0066] DeprecationWarning: OutgoingMessage.prototype._headers is deprecated
(Use `node --trace-deprecation ...` to show where the warning was created)
200
error!
undefined .end() was called twice. This is not supported in superagent
$ node get.js https://google.com/404
error!
404 Not Found

I get the exact same output if I pipe from req instead of res, as suggested in https://github.com/visionmedia/superagent/issues/1188 (and in the docs):

 req.buffer(false)
 req.then(res => {
    console.log(res.status, res.message)
-   res.pipe(process.stdout)
+   req.pipe(process.stdout)
  })
  .catch(err => {
    console.log('error!')

I understand that .pipe() and .then() are incompatible because .then() triggers the parsing system (https://github.com/visionmedia/superagent/issues/1187#issuecomment-281995106) – but https://github.com/visionmedia/superagent/issues/1187#issuecomment-281995580 makes it sound like .buffer(false) should sidestep the parsing system.

I am wondering how to download data in a stream without buffering it all while also looking at the headers.

There is this comment (https://github.com/visionmedia/superagent/issues/1187#issuecomment-281995580):

"It is in a weird place that only your parser sees. #950"

but I’m having trouble understanding that. https://github.com/visionmedia/superagent/issues/950#issuecomment-282112032 offers

"I think buffered responses should be the default, and unbuffered responses should be only opt-in."

but that doesn’t explain to me how to get the body content.


I’m trying to achieve what I can do in python-requests: https://stackoverflow.com/questions/16694907/download-large-file-in-python-with-requests/16696317#16696317

import requests

def download_file(url):
    local_filename = url.split('/')[-1]
    # NOTE the stream=True parameter below
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(local_filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=8192):
                # If you have chunk encoded response uncomment if
                # and set chunk_size parameter to None.
                #if chunk:
                f.write(chunk)
    return local_filename

r.raise_for_status() checks r.status_code (which is available because the HTTP headers have been parsed by that point) and raises an exception on 4xx and 5xx responses. But the rest of the data hasn’t been read off the socket at that point, so it won’t crash on e.g. 2GB files.
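
For comparison, the same check-then-stream order can be sketched with Node's built-in http and https modules instead of superagent. This is only an illustration under stated assumptions: the helper name getStreamIfOk is hypothetical, anything outside 2xx is treated as an error, and redirects are not followed (so a URL that answers 301 is rejected, unlike superagent's default behaviour).

```javascript
const http = require('http');
const https = require('https');

// Hypothetical helper: resolves with the (still unread) response stream on 2xx,
// rejects with an Error carrying .status otherwise.
function getStreamIfOk(url) {
  const client = url.startsWith('https:') ? https : http;
  return new Promise((resolve, reject) => {
    client
      .get(url, res => {
        // The headers are parsed at this point; the body has not been consumed yet.
        if (res.statusCode < 200 || res.statusCode >= 300) {
          res.resume(); // discard the error page so the socket is released
          const err = new Error(`HTTP ${res.statusCode} ${res.statusMessage}`);
          err.status = res.statusCode;
          reject(err);
          return;
        }
        resolve(res);
      })
      .on('error', reject);
  });
}

// Possible usage (left as a comment so that loading this file has no side effects):
// getStreamIfOk(process.argv[2])
//   .then(res => res.pipe(process.stdout))
//   .catch(err => console.error('error!', err.status, err.message));
```

Because the promise settles as soon as the headers arrive, the caller decides what to do with the body, and a large download is never held in memory.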

I’m putting this out there in case you, or any user of the library, has a better clue how to do this than I’ve found so far. I don’t expect you to go out of your way to solve my problem for me. Thanks for your work on superagent, and I hope you are somewhere with friends and family during the lockdowns.

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 5

Top GitHub Comments

2 reactions
kousu commented, Jun 30, 2020

Third time’s the charm: it turns out the last version works perfectly, except for application/json, because of this hardcoded exception just for JSON:

https://github.com/visionmedia/superagent/blob/2fcea621c69e3cc779bc59f5f3e8677c2cce8f99/src/node/index.js#L1084-L1087

This circumvents that and is just as good:

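// get.js
const request = require('superagent');

// Forward each chunk to the sink. This ignores backpressure and never ends the sink.
const pipe = (source, sink) => {
  source.on('data', chunk => {
    console.error('got some data of len ', chunk.length); // DEBUG
    sink.write(chunk);
  });
};

const req = request.get(process.argv[2]);
req.buffer(false);
// Custom parser: stream the body ourselves instead of using a built-in parser.
req.parse((res, cb) => {
  // Here res is Node's raw response, so the status lives in res.statusCode.
  if (res.statusCode == 200) {
    console.error(res.statusCode);
    pipe(res, process.stdout);
    res.on('end', () => { cb(null, undefined, null); });
  }
});
// .catch() starts the request; HTTP errors are reported here.
req.catch(err => {
  console.error('error!');
  console.error(err.status, err.message);
});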
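
The hand-written pipe helper above forwards chunks without checking the return value of sink.write and never signals completion. A minimal sketch of an alternative, assuming the source is an ordinary Node readable stream (as the raw res passed to the parse callback appears to be), is to let stream.pipeline from Node core do the forwarding. The name pipeTo is hypothetical.

```javascript
const { pipeline } = require('stream');

// Hypothetical replacement for the pipe() helper.
// pipeline() handles backpressure, ends the sink when the source ends,
// and reports an error from either stream through a single callback.
function pipeTo(source, sink) {
  return new Promise((resolve, reject) => {
    pipeline(source, sink, err => (err ? reject(err) : resolve()));
  });
}

// Possible usage inside the parse callback (assumption: res is the raw response):
// pipeTo(res, require('fs').createWriteStream('download.bin'))
//   .then(() => cb(null, undefined, null), cb);
```

Note that pipeline ends its destination, which is the desired behaviour for a file but may need checking on your Node version if the sink is process.stdout.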

1 reaction
kousu commented, Jun 30, 2020

Never mind, my solution doesn’t work. It works on some files, but if the response has Content-Type: application/octet-stream (or presumably one of the other types listed in https://github.com/visionmedia/superagent/blob/2fcea621c69e3cc779bc59f5f3e8677c2cce8f99/src/node/parsers/index.js#L1-L9), then my source.on('data', ...) is ignored and nothing comes out. I expected that setting .buffer(false) would override the mimetype detection, but it doesn’t; in fact, the mimetype detection overrides it at parser = exports.parse[mime]:

https://github.com/visionmedia/superagent/blob/2fcea621c69e3cc779bc59f5f3e8677c2cce8f99/src/node/index.js#L1055-L1082

I tried to circumvent this by setting an explicit parser that did nothing:

req.parse( (res,cb) => null ) 

but this didn’t work and I don’t know why. The mimetype detection is strong and works even when I work against it.


Anyway, I figured out what the reliable thing to do is. As mentioned in https://github.com/visionmedia/superagent/issues/1161#issuecomment-361930999 (and by @julien-f), you can replace .then() with req.on('response') to catch the response after the headers are parsed but before the body is processed. This only works if you also set .buffer(false). Inside that handler you can use .on('data') as above. Unfortunately, this event fires before the automatic error checking kicks in, but I dug that check out and patched it in too:

// get.js
const request = require('superagent');

// Forward each chunk to the sink (no backpressure handling).
const pipe = (source, sink) => {
  source.on('data', chunk => {
    console.error('got some data of len ', chunk.length); // DEBUG
    sink.write(chunk);
  });
};

const req = request.get(process.argv[2]);
req.buffer(false);
req
  .on('response', res => {
    // Fires after the headers are parsed but before the body is processed.
    if (request.Request.prototype._isResponseOK(res)) { // this helper is in a funny place
      console.error(res.status);
      pipe(res, process.stdout);
    }
  })
  .catch(err => {
    // .catch() also starts the request; HTTP errors are reported here.
    console.error('error!');
    console.error(err.status, err.message);
  });
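
The _isResponseOK helper used above is underscore-prefixed, so it is presumably internal and could change between releases. A minimal sketch of an explicit stand-in, assuming the usual rule that only 2xx statuses count as success, could look like this. The name isResponseOk is hypothetical and is not part of superagent.

```javascript
// Hypothetical stand-in for the private _isResponseOK helper.
// Assumption: a response is OK when its status is in the 2xx range.
function isResponseOk(res) {
  return Boolean(res) && res.status >= 200 && res.status < 300;
}

// Possible usage in the 'response' handler shown above:
// req.on('response', res => {
//   if (isResponseOk(res)) pipe(res, process.stdout);
// });
```

Spelling the rule out also makes it easy to tighten, for example to accept only a status of exactly 200 as the issue title asks.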