Using node streams with highland
I often need to use node-csv and JSONStream for processing large files. What is the best way to use these node-style streams with highland so that back-pressure is managed properly?
highlandStream.pipe(csvStream)
returns the destination stream which isn’t a highland stream so I can’t continue chaining.
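For reference, the chaining problem comes from Node's own `pipe()` contract rather than from highland: `pipe()` always hands back the destination stream. A minimal sketch using only Node's core `stream` module (the `PassThrough` streams are stand-ins, not the csv or highland streams from the question):

```javascript
const { PassThrough } = require('stream');

// Stand-ins for the source and destination streams.
const source = new PassThrough();
const destination = new PassThrough();

// pipe() returns its argument, so any further chaining happens on
// the destination stream's own API, not on the source's.
const returned = source.pipe(destination);
const returnsDestination = returned === destination;

console.log(returnsDestination);
```

Because the returned object is the plain Node stream, any highland methods are only available again once that stream is wrapped.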
I was finally able to get it to work, but it wasn’t easy:
// Use _(source) to create a generator
function getFeatures(filename){
  var _push, _next;
  // Setup JSONStream that generates many events
  var featureStream = fs.createReadStream(path.join(sourceDir, filename))
    .pipe(jsonStream.parse(['features',true]))
    .on('data', function(feature){
      // Pause the stream until the generator is called again
      // to manage back-pressure properly
      featureStream.pause();
      _push(null, feature);
      _next();
    })
    .on('end', function(){
      _push(null, _.nil);
      _next();
    });
  return _(function(push, next){
    _push = push;
    _next = next;
    // Resume the stream to get the next data event from the json stream
    featureStream.resume();
  });
}
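The pause-and-resume idea above (read one item, wait for the consumer, then read the next) can also be sketched without highland, using the async iterator that Node's readable streams expose. This is an alternative technique, not the highland API; `consumeInOrder` is a hypothetical helper and `Readable.from` stands in for the parsed JSONStream output:

```javascript
const { Readable } = require('stream');

// Hypothetical helper: pulls items one at a time and waits for `handle`
// to finish before the next item is read from the stream.
async function consumeInOrder(readable, handle) {
  for await (const item of readable) {
    await handle(item);
  }
}

// Stand-in for the parsed feature stream from the question.
const features = Readable.from([{ id: 1 }, { id: 2 }, { id: 3 }]);

const seen = [];
const done = consumeInOrder(features, async (feature) => {
  // Simulate slow downstream work.
  await new Promise((resolve) => setTimeout(resolve, 10));
  seen.push(feature.id);
});

done.then(() => console.log(seen));
```

The loop body plays the role of the generator callback: nothing further is pulled from the source until the current item has been handled.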
Besides being a little difficult to set up, it can only be used at the beginning of a stream. If I want to process multiple files this way, I have to concoct another hairy beast that lets each event spawn a new stream and only thunks when that new stream is done.
_(filenames).consume(function(error, filename, outerPush, outerNext){
  if(filename === _.nil){
    outerPush(null, _.nil);
    outerNext();
  } else {
    getFeatures(filename)
      .consume(function(error, feature, innerPush, innerNext){
        if(feature === _.nil){
          innerPush(null, _.nil);
          innerNext();
          // Push the filename out so that we can thunk
          // and get the data moving
          outerPush(null, filename);
          // Let the outer stream know we're done
          outerNext();
        } else {
          innerPush(null, feature);
          innerNext();
        }
      })
      .each(function(feature){
        // Need to call this to thunk and get data moving
      });
  }
})
.each(function(filename){
  // Need to call this to thunk and get data moving
});
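The nested pattern above (each filename opens an inner stream, and the outer stream advances only when the inner one ends) can be sketched in plain Node with an async generator. This is an illustration of the idea rather than a highland solution; `openFeatures`, `allFeatures`, and `collect` are hypothetical names, and `Readable.from` stands in for the real per-file feature stream:

```javascript
const { Readable } = require('stream');

// Hypothetical stand-in for getFeatures(filename).
function openFeatures(filename) {
  return Readable.from([`${filename}#1`, `${filename}#2`]);
}

// Yields every feature of every file, one file at a time. The next
// file is opened only after the previous stream has ended.
async function* allFeatures(filenames) {
  for (const filename of filenames) {
    yield* openFeatures(filename);
  }
}

// Hypothetical helper that drains an async iterable into an array.
async function collect(iterable) {
  const out = [];
  for await (const item of iterable) {
    out.push(item);
  }
  return out;
}

const result = collect(allFeatures(['a.json', 'b.json']));
result.then((items) => console.log(items));
```

If a single stream object is needed downstream, `Readable.from(allFeatures(filenames))` turns the generator back into a Node readable.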
Is there a better way to handle these situations with the current highland API?
Either way, highland is still making our lives a lot easier. Thanks for creating it.
Issue Analytics
- Created 10 years ago
- Comments:8 (4 by maintainers)
Top GitHub Comments
@justincy yes, you’ll get buffers from fs.createReadStream. If you want strings, convert them by calling .toString() on the buffers:
Oh, right, thanks.
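As an illustration of the `.toString()` suggestion in the comment above, here is a small sketch using only Node core modules. The sample data is made up, and the `StringDecoder` part is an extra precaution not mentioned in the thread: it covers the case where a multi-byte character is split across two chunks.

```javascript
const { StringDecoder } = require('string_decoder');

// A chunk as fs.createReadStream would emit it when no encoding is set.
const chunk = Buffer.from('id,name\n1,Ada\n', 'utf8');
const text = chunk.toString('utf8');

// A multi-byte character can be split across two chunks; StringDecoder
// holds back the incomplete bytes instead of emitting a broken character.
const firstHalf = Buffer.from([0xe2, 0x82]); // first two bytes of '€'
const secondHalf = Buffer.from([0xac]);      // last byte of '€'
const decoder = new StringDecoder('utf8');
const joined =
  decoder.write(firstHalf) + decoder.write(secondHalf) + decoder.end();

console.log(JSON.stringify(text), joined);
```

Passing `{ encoding: 'utf8' }` as the options argument to `fs.createReadStream` is another way to receive strings instead of buffers.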