What if number of DBs is unknown?
Big fan of the project. Thanks for making such a great persistence solution!
Question: Normally when you create an Environment you must specify the maximum number of DBs up front, e.g.:

final Env<ByteBuffer> env = create()
    // LMDB also needs to know how large our DB might be. Over-estimating is OK.
    .setMapSize(10_485_760)
    // LMDB also needs to know how many DBs (Dbi) we want to store in this Env.
    .setMaxDbs(1)
    .open(path);
We’re leaning towards a model (the data is very unstructured) that calls for creating a Dbi for an unknown number of entities. Entities are identified by UUIDs, and there might be one persisted… or millions.

The simplest thing to do might be to call setMaxDbs(Integer.MAX_VALUE) (or some other very high number). Is this OK? What might be the impact on performance of passing a very high number to maxDbs?
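For concreteness, the model we’re considering would look roughly like this (a sketch only; dbiFor is a hypothetical helper):

import java.nio.ByteBuffer;
import java.util.UUID;
import org.lmdbjava.Dbi;
import org.lmdbjava.DbiFlags;
import org.lmdbjava.Env;

public final class PerEntityDbis {
    // Hypothetical helper: one named Dbi per entity, created on demand.
    // With an unknown entity count, maxDbs would have to be set very high up front.
    static Dbi<ByteBuffer> dbiFor(final Env<ByteBuffer> env, final UUID entityId) {
        return env.openDbi(entityId.toString(), DbiFlags.MDB_CREATE);
    }
}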
Issue Analytics
- Created: 6 years ago
- Comments: 5 (3 by maintainers)
Top GitHub Comments
Don’t be. Just preallocate a reusable key buffer and use offsets. For example, if you needed an integer prefix and then a UUID, you’d need a key buffer of 20 bytes: at offset 0 you’d write the integer, and at offset 4 you’d write the UUID. This is incredibly inexpensive, as the entire 20 bytes will sit in the same cache line (on that note, you should allocate a 64-byte buffer but only use the first 20 bytes), and you can use a bounds-check-disabled buffer (e.g. Agrona’s UnsafeBuffer) if you were really concerned. As an aside, Agrona has static methods that hide the cache-line details (e.g. BufferUtil.allocateDirectAligned can be used with org.agrona.BitUtil.CACHE_LINE_LENGTH).

If your keys are somewhat complex, I use and recommend Simple Binary Encoding (SBE). SBE generates encoder and decoder flyweights that can also handle schema changes, recurring fields, user-defined types / enums, etc. It shields you from needing to keep track of field offsets and imposes no overhead compared with hand-written code. SBE is overkill if you have simple needs, though. We use our own DAO code generator (which uses the offset approach for automatically-encoded primary keys and index keys) and SBE for the value encoding.
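(A minimal sketch of that offset approach, assuming a 4-byte int prefix plus a 16-byte UUID, written big-endian so the keys sort lexicographically; KeyEncoder, KEY_LENGTH and encodeKey are illustrative names.)

import static org.agrona.BitUtil.CACHE_LINE_LENGTH;

import java.nio.ByteOrder;
import java.util.UUID;
import org.agrona.BufferUtil;
import org.agrona.concurrent.UnsafeBuffer;

public final class KeyEncoder {
    static final int KEY_LENGTH = 20; // 4-byte int prefix + 16-byte UUID

    // Preallocated, cache-line-aligned, and reused for every key we build.
    private final UnsafeBuffer key =
        new UnsafeBuffer(BufferUtil.allocateDirectAligned(CACHE_LINE_LENGTH, CACHE_LINE_LENGTH));

    // Writes the composite key into the reusable buffer; only the first
    // KEY_LENGTH bytes are meaningful.
    void encodeKey(final int prefix, final UUID id) {
        key.putInt(0, prefix, ByteOrder.BIG_ENDIAN);
        key.putLong(4, id.getMostSignificantBits(), ByteOrder.BIG_ENDIAN);
        key.putLong(12, id.getLeastSignificantBits(), ByteOrder.BIG_ENDIAN);
    }
}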
With our generated DAO pattern we’ve found that the transaction capabilities of LMDB make it quite simple to maintain multiple index tables. A typical generated update method basically takes the shape sketched below.
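(A hand-written sketch of that shape — assuming table and index are Dbi handles and indexKeyOf is a hypothetical helper that derives the index key from a stored value.)

import java.nio.ByteBuffer;
import org.lmdbjava.Dbi;
import org.lmdbjava.Env;
import org.lmdbjava.Txn;

final class UpdateSketch {
    // Atomically updates the primary record and its secondary index entry.
    static void update(final Env<ByteBuffer> env,
                       final Dbi<ByteBuffer> table,
                       final Dbi<ByteBuffer> index,
                       final ByteBuffer pk,
                       final ByteBuffer newValue) {
        try (Txn<ByteBuffer> txn = env.txnWrite()) {
            final ByteBuffer oldValue = table.get(txn, pk);
            if (oldValue != null) {
                index.delete(txn, indexKeyOf(oldValue)); // drop the stale index record
            }
            table.put(txn, pk, newValue);                // upsert the primary record
            index.put(txn, indexKeyOf(newValue), pk);    // insert the fresh index record
            txn.commit();                                // all-or-nothing
        }
    }

    // Hypothetical: derive the secondary index key from a stored value.
    static ByteBuffer indexKeyOf(final ByteBuffer value) {
        throw new UnsupportedOperationException("application-specific");
    }
}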
Sure, that’s somewhat inefficient for writes, because often we drop an index record and then insert exactly the same index record again, but the trade-off there is that we’re using a code generator, so we want the generator to be correct and maintainable. Furthermore, to be using LMDB at all means your use case is read-optimised and writes are slower; if you need to be write-optimised and care less about reads, you’re probably better off with a log-structured merge type system such as LevelDB or one of its derivatives.
I share this because it sounds like you’re trying to write a “generic” store, so having a way to differentiate your “index” keys from your “primary” keys is generally necessary. To finish off my DAO example, the sketch below shows the shape of how we do that.
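(Again, a purely illustrative sketch: the @PrimaryKey and @IndexKey annotations and the TradeDao / TradeFlyweight names are hypothetical stand-ins for the in-house generator’s own types.)

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.UUID;

// Hypothetical annotations; the real ones belong to the in-house DAO generator.
@Retention(RetentionPolicy.RUNTIME) @interface PrimaryKey { }
@Retention(RetentionPolicy.RUNTIME) @interface IndexKey { int table(); }

// Placeholder for the SBE-generated flyweight type.
interface TradeFlyweight { }

// The annotated DAO interface; its implementation is generated at runtime.
interface TradeDao {
    @PrimaryKey
    UUID pk(TradeFlyweight fw);   // primary key read from the flyweight

    @IndexKey(table = 2)          // index records carry a distinct table prefix
    long key(TradeFlyweight fw);  // secondary index key read from the flyweight
}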
Note the fw attribute is an SBE-generated flyweight, and the pk and key fields are found in the flyweight. Everything else is handled via runtime code generation. Our pattern is: edit an SBE XML file, generate, write an annotated DAO interface, then use the annotated DAO interface with its runtime implementation. It’s very rare to need to deal with LmdbJava APIs directly from domain code (the exception is our time series data, as that’s 99% of our data volume and has extremely challenging storage and performance requirements).

@krisskross noted the fundamental piece: a Txn cannot cross Envs. You can, if desired, also have different Envs in the same directory (albeit with different physical data and lock files) if you use MDB_NOSUBDIR.
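(A sketch of the latter, with illustrative paths and sizes.)

import java.io.File;
import java.nio.ByteBuffer;
import org.lmdbjava.Env;
import org.lmdbjava.EnvFlags;

final class TwoEnvs {
    static void open(final File dir) {
        // Two Envs sharing one directory; MDB_NOSUBDIR makes each path a plain
        // data file (with a companion lock file) rather than a subdirectory.
        final Env<ByteBuffer> envA = Env.create()
            .setMapSize(10_485_760).setMaxDbs(1)
            .open(new File(dir, "a.mdb"), EnvFlags.MDB_NOSUBDIR);
        final Env<ByteBuffer> envB = Env.create()
            .setMapSize(10_485_760).setMaxDbs(1)
            .open(new File(dir, "b.mdb"), EnvFlags.MDB_NOSUBDIR);
        // A Txn belongs to exactly one Env, so work spanning envA and envB
        // needs two separate transactions.
    }
}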
@benalexau Thanks very much. You’ve given us a lot to think about. Definitely like the idea of a metamodel that links primary keys and indexes.