Feature request: .getValueAsStream()
I often shoehorn a humungous list into a single levelup value, yet only need to read the first few elements when it is get-ed. At the moment I have to load the whole value into memory as an Array, before getting the tiny part that I want from the beginning. It would be great if it was possible to store them as strings/blobs, and then stream these large values with something like .getValueAsStream().
Here is somebody else describing this issue, together with a streamy solution for levelDB: http://codeofrob.com/entries/streaming-large-values-from-leveldb.html
Obviously you wouldn’t be able to make a “real” value stream for every *-down, that actually reads a stream of bytes without caching the whole thing in memory. But you could definitely do it for a lot of the major stores, and it would be a really powerful feature.
Thoughts?
Issue Analytics
- Created 7 years ago
- Comments:14 (2 by maintainers)
Top GitHub Comments
@fergiemcdowall, here’s an untested example of what I mean by my explanation above: https://gist.github.com/chjj/bc6b30e228f7af93fe4cbd1528a0b71c
That’s the way I would do it (and I more-or-less do it that way in many cases – it works well). I can promise you this method will be faster than adding a bunch of complex behavior to leveldown and a new stream object to levelup.
The author of the blog post you mentioned is hesitant about putting large values onto a memory-managed heap. Luckily, Buffer objects only take up 80 bytes on the JS heap, regardless of their size.
So, this behavior can be easily layered on top of levelup/down without putting heat on the GC.
[varint-size][value1][varint-size][value2][etc...]
This method is actually superior because it allows arbitrary access to any value in the list (i.e. you can skip over values), as opposed to a stream which would just spit all values out at you in order.
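For anyone who wants to see the layout above in code, here is a rough sketch, not the code from the gist. It assumes an unsigned LEB128-style varint for the size prefix, and every function name in it (encodeVarint, decodeVarint, encodeList, readItems) is made up for illustration; none of them are part of levelup or leveldown. The Buffer passed to readItems stands in for whatever a normal get returns.

```javascript
// Hypothetical helpers; none of these names come from levelup/leveldown.

// Encode an unsigned integer as a LEB128-style varint (7 bits per byte,
// high bit set on every byte except the last).
function encodeVarint(n) {
  const bytes = [];
  while (n >= 0x80) {
    bytes.push((n % 128) | 0x80);
    n = Math.floor(n / 128);
  }
  bytes.push(n);
  return Buffer.from(bytes);
}

// Decode a varint starting at `offset`.
// Returns the decoded value and how many bytes it occupied.
function decodeVarint(buf, offset) {
  let value = 0;
  let multiplier = 1;
  let pos = offset;
  while (true) {
    if (pos >= buf.length) throw new RangeError('truncated varint');
    const byte = buf[pos++];
    value += (byte & 0x7f) * multiplier;
    if ((byte & 0x80) === 0) break;
    multiplier *= 128;
  }
  return { value, bytesRead: pos - offset };
}

// Pack a list of Buffers as [varint-size][value1][varint-size][value2]...
function encodeList(values) {
  const parts = [];
  for (const v of values) {
    parts.push(encodeVarint(v.length), v);
  }
  return Buffer.concat(parts);
}

// Return up to `count` items starting at item index `skip`.
// Skipped items are stepped over using their size prefix only.
function readItems(buf, skip, count) {
  const out = [];
  let pos = 0;
  let index = 0;
  while (pos < buf.length && out.length < count) {
    const { value: size, bytesRead } = decodeVarint(buf, pos);
    pos += bytesRead;
    if (pos + size > buf.length) throw new RangeError('truncated value');
    if (index >= skip) out.push(buf.subarray(pos, pos + size)); // view, no copy
    pos += size;
    index++;
  }
  return out;
}

const packed = encodeList(
  ['alpha', 'beta', 'gamma', 'delta'].map((s) => Buffer.from(s))
);
const firstTwo = readItems(packed, 0, 2).map((b) => b.toString());
const third = readItems(packed, 2, 1).map((b) => b.toString());
```

Note that skipping still walks the size prefixes one by one, so reaching item k costs k small reads, but the skipped values themselves are never decoded or copied.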
Anyway, I really don’t think levelup should be implementing binary serialization formats for the users.
And this behavior most certainly shouldn’t be in leveldown: I’m not sure what the blog post author was getting at there. Copying data that is already in memory from one place to another in small chunks is a new level of pointless – especially in node.js since we have Buffer objects. It would only add overhead. Maybe C# lacks something similar to node.js Buffers which are stored off the vm heap, but I doubt it. I’m guessing the author just hasn’t thought this through well enough. I’ve never seen any serious project that uses a key value store do this at the iterator level.
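To illustrate the point about copying, here is a small sketch of how a Node.js Buffer can be sliced without moving any bytes. The 1 MiB zero-filled Buffer is only a stand-in for a large value that a get has already loaded; the variable names are arbitrary.

```javascript
// Stand-in for a large value already returned by a get(); 1 MiB of zeros.
const value = Buffer.alloc(1024 * 1024);
value.write('head', 0);

// subarray() returns a view over the same underlying memory; nothing is copied.
const head = value.subarray(0, 4);
const sharesMemory = head.buffer === value.buffer;

// A write through the view shows up in the original, since storage is shared.
head[0] = 0x48; // 'H'
const firstByte = value[0];

// By contrast, Buffer.from(buffer) allocates new memory and copies into it.
const copy = Buffer.from(head);
copy[0] = 0x00;
const stillFirstByte = value[0]; // the original is unaffected by the copy
```

So reading "the first few elements" of a value that is already in memory is a matter of taking views, and chunking it through a stream would only add copies on top.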