Feature request: .getValueAsStream()

I often shoehorn a humungous list into a single levelup value, yet only need to read the first few elements when it is get-ed. At the moment I have to load the whole value into memory as an Array, before getting the tiny part that I want from the beginning. It would be great if it was possible to store them as strings/blobs, and then stream these large values with something like .getValueAsStream().

Here is somebody else describing this issue, together with a streamy solution for levelDB: http://codeofrob.com/entries/streaming-large-values-from-leveldb.html

Obviously you wouldn’t be able to make a “real” value stream for every *-down, that actually reads a stream of bytes without caching the whole thing in memory. But you could definitely do it for a lot of the major stores, and it would be a really powerful feature.

Thoughts?

Issue Analytics

Created 7 years ago
Comments: 14 (2 by maintainers)

Top GitHub Comments

2 reactions

chjj commented, Sep 3, 2016

@fergiemcdowall, here’s an untested example of what I mean by my explanation above: https://gist.github.com/chjj/bc6b30e228f7af93fe4cbd1528a0b71c

That’s the way I would do it (and I more-or-less do it that way in many cases – it works well). I can promise you this method will be faster than adding a bunch of complex behavior to leveldown and a new stream object to levelup.

1 reaction

chjj commented, Sep 3, 2016

The author of the blog post you mentioned is hesitant about putting large values onto a memory managed heap. Luckily, Buffer objects only take up 80 bytes on the JS heap, regardless of their size.

So, this behavior can be easily layered on top of levelup/down without putting heat on the GC.
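To illustrate why slicing a large Buffer is cheap, here is a minimal sketch. It assumes the value has already been read into a Buffer; the 1 MiB size and the variable names are made up for the example. `subarray()` returns a view onto the same memory rather than a copy, so taking a small piece of a large value allocates no new payload bytes.

```javascript
// Stand-in for a large value returned by the store as a Buffer (size is arbitrary).
const value = Buffer.alloc(1024 * 1024, 0x61); // 1 MiB filled with 'a'

// subarray() returns a view onto the same underlying memory; no bytes are copied.
const head = value.subarray(0, 16);

// Writing through the view changes the original, which shows the memory is shared.
head[0] = 0x62;

console.log(head.length);       // 16
console.log(value[0] === 0x62); // true
```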

Store large value list as serialized binary: [varint-size][value1][varint-size][value2][etc...].
Get record as a Buffer object. Iterate over and slice out only the values you need. Deserialize them.
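The two steps above can be sketched roughly as follows. This is not the code from the linked gist. It is a minimal illustration that assumes the values are UTF-8 strings and that "varint" means an unsigned LEB128 integer (7 bits per byte). All function names here are hypothetical helpers, not part of levelup or leveldown.

```javascript
// Hypothetical helpers; nothing here is part of levelup or leveldown.

// Encode a non-negative integer as an unsigned LEB128 varint (7 bits per byte).
function encodeVarint(n) {
  const bytes = [];
  while (n >= 0x80) {
    bytes.push((n % 128) | 0x80); // low 7 bits, with the continuation bit set
    n = Math.floor(n / 128);
  }
  bytes.push(n);
  return Buffer.from(bytes);
}

// Decode a varint starting at `offset`; returns the value and the offset just past it.
function decodeVarint(buf, offset) {
  let value = 0;
  let scale = 1;
  while (true) {
    if (offset >= buf.length) throw new RangeError('truncated varint');
    const byte = buf[offset++];
    value += (byte & 0x7f) * scale;
    if (byte < 0x80) return { value, offset };
    scale *= 128;
  }
}

// Serialize a list of strings as [varint-size][value1][varint-size][value2]...
function encodeList(items) {
  const parts = [];
  for (const item of items) {
    const body = Buffer.from(item, 'utf8');
    parts.push(encodeVarint(body.length), body);
  }
  return Buffer.concat(parts);
}

// Deserialize only the first `count` items; later items are never decoded.
function readFirst(buf, count) {
  const out = [];
  let offset = 0;
  while (out.length < count && offset < buf.length) {
    const size = decodeVarint(buf, offset);
    const end = size.offset + size.value;
    if (end > buf.length) throw new RangeError('truncated value');
    out.push(buf.toString('utf8', size.offset, end));
    offset = end;
  }
  return out;
}

const record = encodeList(['alpha', 'beta', 'gamma', 'delta']);
console.log(readFirst(record, 2)); // [ 'alpha', 'beta' ]
```

In practice `record` would be what gets stored under a single key, and `readFirst` would run on the Buffer that comes back from the read.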

This method is actually superior because it allows arbitrary access to any value in the list (i.e. you can skip over values), as opposed to a stream which would just spit all values out at you in order.
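As a sketch of the "skip over values" idea, the hypothetical `readAt` helper below hops from one size prefix to the next and decodes only the item at the requested index. It assumes the same layout as above (unsigned LEB128 size prefixes, UTF-8 string values). The sample record is built by hand for brevity.

```javascript
// Hypothetical helper: read an unsigned LEB128 varint at `offset`.
function readVarint(buf, offset) {
  let value = 0;
  let scale = 1;
  let byte;
  do {
    if (offset >= buf.length) throw new RangeError('truncated varint');
    byte = buf[offset++];
    value += (byte & 0x7f) * scale;
    scale *= 128;
  } while (byte >= 0x80);
  return { value, offset };
}

// Return the item at `index`, hopping over earlier items via their size prefixes.
function readAt(buf, index) {
  let offset = 0;
  for (let i = 0; offset < buf.length; i++) {
    const size = readVarint(buf, offset);
    const end = size.offset + size.value;
    if (end > buf.length) throw new RangeError('truncated value');
    if (i === index) return buf.toString('utf8', size.offset, end);
    offset = end; // skip this payload without decoding it
  }
  return undefined; // index is past the end of the list
}

// Sample record; each value is under 128 bytes, so its varint size is a single byte.
const record = Buffer.concat(
  ['alpha', 'beta', 'gamma', 'delta'].flatMap((s) => {
    const body = Buffer.from(s, 'utf8');
    return [Buffer.from([body.length]), body];
  })
);

console.log(readAt(record, 2)); // gamma
```

Reaching item `index` still means reading every earlier size prefix, but the skipped payloads are never decoded or copied.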

Anyway, I really don’t think levelup should be implementing binary serialization formats for the users.

And this behavior most certainly shouldn’t be in leveldown: I’m not sure what the blog post author was getting at there. Copying data that is already in memory from one place to another in small chunks is a new level of pointless – especially in node.js since we have Buffer objects. It would only add overhead. Maybe C# lacks something similar to node.js Buffers which are stored off the vm heap, but I doubt it. I’m guessing the author just hasn’t thought this through well enough. I’ve never seen any serious project that uses a key value store do this at the iterator level.