Write Performance
Hello all,
I fear that I’m not using the library correctly. Based on a benchmark review, I should be seeing more writes per second.
I’m currently writing at 40,000 vals/sec.
I have tried batching the data and using the Agrona DirectBuffers, but I see no performance increase.
Below is the code I’m using for the performance test.
import org.lmdbjava.Cursor;
import org.lmdbjava.Dbi;
import org.lmdbjava.DbiFlags;
import org.lmdbjava.Env;
import org.lmdbjava.EnvFlags;
import org.lmdbjava.Txn;

import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;

import static java.nio.charset.StandardCharsets.UTF_8;

public class EfficentTest {
    public static void main(String[] args) throws IOException {
        final File path = new File("/foo");
        if (!path.mkdirs() && !path.exists()) {
            throw new IOException("Unable to create: " + path);
        }
        final Env<ByteBuffer> env = Env.create()
                .setMapSize(1L << 31)
                .setMaxDbs(2)
                .open(path, EnvFlags.MDB_NOSYNC);
        final Dbi<ByteBuffer> names = env.openDbi("names", DbiFlags.MDB_CREATE);
        final ByteBuffer key = ByteBuffer.allocateDirect(4);
        final ByteBuffer val = ByteBuffer.allocateDirect(1024);
        final long t0 = System.currentTimeMillis();
        long tn = t0;
        for (int i = 0; i < 1_000_000; i++) {
            // One read-write transaction (and therefore one commit) per insert.
            try (Txn<ByteBuffer> txn = env.txnWrite()) {
                final Cursor<ByteBuffer> c = names.openCursor(txn);
                key.putInt(0, i);
                val.clear();
                val.put("Hello world".getBytes(UTF_8)).flip();
                c.put(key, val);
                c.close(); // close the cursor before committing the transaction
                txn.commit();
            }
            if (i % 1000 == 0) {
                final long taken = System.currentTimeMillis() - tn;
                System.out.printf("Inserted: %d rows at %,.2f vals/sec%n", i, (1000 * 1000D) / taken);
                tn = System.currentTimeMillis();
            }
        }
        final long t1 = System.currentTimeMillis();
        System.out.printf("Time to load db: %,dms%n", t1 - t0);
        env.close();
    }
}
Any help would be much appreciated.
Top GitHub Comments
A transaction can only be reused if it’s a read-only transaction. A read-write transaction can only be committed once. You need to decide whether your use case can be modelled as a single read-write transaction (for performance) or whether you need to use lots of read-write transactions.
In most of my LMDB workloads I have a single read-write transaction that runs for long periods (e.g. in some cases 24 hours, as that’s how often the underlying input files roll over). This minimises wasted space (a read transaction concurrent with a write transaction will cause file growth) while maximising throughput. It’s not for everyone, though. My usage works because the underlying input files can be used to regenerate the LMDB database from scratch when something goes wrong (OS crashes, Java-side bugs, invalid input files, etc.) and there is only a single JVM application accessing each file (so the DAO or service holds onto the single read-write Txn instance and routes all read and write operations through it).
What is best depends on each application and the trade-offs you’re willing to live with. If you are using LMDB as a system of record, you’ll probably want individual transactions per logical change. But if you are using LMDB as a very high-performance durable sorted map, chances are you’ll use fewer transactions and instead accept a recovery strategy (rebuild from whatever your data input is, discard data since the last commit, etc.).
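To make the read-only reuse mentioned above concrete: lmdbjava exposes it through Txn.reset() and Txn.renew(). The following is a minimal sketch, not code from this thread; it assumes the same environment layout and directory as the test program above, and a key that may or may not already exist.

import org.lmdbjava.Dbi;
import org.lmdbjava.DbiFlags;
import org.lmdbjava.Env;
import org.lmdbjava.Txn;
import java.io.File;
import java.nio.ByteBuffer;

public class ReadTxnReuse {
    public static void main(String[] args) {
        // Same environment layout as the test program above (assumption).
        final Env<ByteBuffer> env = Env.create()
                .setMapSize(1L << 31)
                .setMaxDbs(2)
                .open(new File("/foo"));
        final Dbi<ByteBuffer> names = env.openDbi("names", DbiFlags.MDB_CREATE);
        final ByteBuffer key = ByteBuffer.allocateDirect(4);
        key.putInt(42).flip(); // an arbitrary key for illustration

        // A read-only transaction can be parked with reset() and revived with
        // renew(), so a single Txn instance can serve many reads.
        final Txn<ByteBuffer> rtx = env.txnRead();
        names.get(rtx, key); // read against the current snapshot
        rtx.reset();         // release the snapshot but keep the handle
        rtx.renew();         // acquire a fresh snapshot on the same handle
        names.get(rtx, key); // read again without allocating a new Txn
        rtx.close();

        env.close();
    }
}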
@scottazord I ran your code locally and made some slight adaptations:
This gives:
It’s always best to call Txn.commit() as infrequently as possible. So let’s make some minimal changes to see the impact of doing that:
This gives:
Both tests resulted in the same sized database directory.
Of course a suitable batch size depends on your use case. But you gain more than an order of magnitude throughput in this simple test (177K tps to 2.5M tps).
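The maintainer’s exact diff is not shown above, but a batched variant along these lines illustrates the idea. This is a sketch only; the 1,000-row batch size is an assumption, not the maintainer’s figure.

import org.lmdbjava.Cursor;
import org.lmdbjava.Dbi;
import org.lmdbjava.DbiFlags;
import org.lmdbjava.Env;
import org.lmdbjava.EnvFlags;
import org.lmdbjava.Txn;
import java.io.File;
import java.nio.ByteBuffer;
import static java.nio.charset.StandardCharsets.UTF_8;

public class BatchedTest {
    public static void main(String[] args) {
        final File path = new File("/foo"); // same directory as the original test
        path.mkdirs();
        final Env<ByteBuffer> env = Env.create()
                .setMapSize(1L << 31)
                .setMaxDbs(2)
                .open(path, EnvFlags.MDB_NOSYNC);
        final Dbi<ByteBuffer> names = env.openDbi("names", DbiFlags.MDB_CREATE);
        final ByteBuffer key = ByteBuffer.allocateDirect(4);
        final ByteBuffer val = ByteBuffer.allocateDirect(1024);
        final int batch = 1_000; // assumed batch size; tune for your workload
        final long t0 = System.currentTimeMillis();
        for (int i = 0; i < 1_000_000; i += batch) {
            // One read-write transaction covers a whole batch of puts.
            try (Txn<ByteBuffer> txn = env.txnWrite()) {
                final Cursor<ByteBuffer> c = names.openCursor(txn);
                for (int j = i; j < i + batch; j++) {
                    key.putInt(0, j);
                    val.clear();
                    val.put("Hello world".getBytes(UTF_8)).flip();
                    c.put(key, val);
                }
                c.close();
                txn.commit(); // one commit amortised over the whole batch
            }
        }
        System.out.printf("Time to load db: %,dms%n", System.currentTimeMillis() - t0);
        env.close();
    }
}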
As an aside, a test like this one will result in the LMDB C library detecting that you are inserting values in increasing key order, at which point it performs some page size and B+ Tree optimisations. So when writing benchmarks, please be sure to test on a dataset that is representative of your particular use case. To illustrate how much this affects the results: when I instead started the keys at 1,000,000 and decremented to 0, throughput dropped to 2.0M TPS and the final database was 50% larger. Again, just be sure to test with data representative of your use case.
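For concreteness, the descending-key run described above amounts to little more than reversing the loop in the batched sketch. Again, this is an illustration, not the maintainer’s actual code; it reuses the batch, env, names, key and val declarations from the sketch above.

// Same batched loop as above, but inserting keys in descending order so LMDB
// cannot apply its sorted-insert optimisations for increasing keys.
for (int i = 1_000_000; i > 0; i -= batch) {
    try (Txn<ByteBuffer> txn = env.txnWrite()) {
        final Cursor<ByteBuffer> c = names.openCursor(txn);
        for (int j = i; j > i - batch; j--) {
            key.putInt(0, j);
            val.clear();
            val.put("Hello world".getBytes(UTF_8)).flip();
            c.put(key, val);
        }
        c.close();
        txn.commit();
    }
}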