[py-lmdb] Re: Multiple environments to avoid write lock?

  • From: Jonatan Heyman <jonatan@xxxxxxxxxxx>
  • To: py-lmdb@xxxxxxxxxxxxx
  • Date: Wed, 4 Feb 2015 12:32:50 +0100

Hi David!

Thanks a lot for your input! I realise that I should have included more
info in my original e-mail.

The app is a service that will back a web app. Reads will be performed on
page requests, but (almost) all writes will be performed in background
jobs. My app uses two main databases that will see the most reads and
writes.
Both use int64s as keys, and lists of int64s as values.

The first DB maps object IDs to a list of user IDs which are associated
with that object. The user ID list ranges in length from zero to a couple
of thousand, and averages about 120. The total number of unique keys in
this DB is currently a little over 3 million (so around 360 million user
ID values), but it is expected to grow 20x in the near future.

The second DB maps user IDs to the list of object IDs that they are
associated with. The total number of unique keys in this DB is currently
about 70 million. The average number of values per key is about 5.
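
Schematically, the packing looks something like this (a simplified
sketch; the exact serialization details are illustrative only):

    import struct

    def pack_id(id64):
        # One 8-byte little-endian int64 as the key.
        return struct.pack('<q', id64)

    def pack_ids(ids):
        # Concatenated int64s as the value: len(ids) * 8 bytes.
        return struct.pack('<%dq' % len(ids), *ids)

    def unpack_ids(buf):
        # Inverse of pack_ids().
        return list(struct.unpack('<%dq' % (len(buf) // 8), buf))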

The typical write scenario is where my service receives an object ID and a
list of user IDs for that object. It will then construct one list of
(object_id, user_id) tuples, and one list of (user_id, object_id) tuples,
and then use putmulti() to write this data to the two databases. The
service will also do this for existing object IDs, in which case it will
also remove user IDs and object IDs (that were previously present, but no
longer exist in the user ID list that was passed in) from the values in
both databases.
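
In code, the happy path looks roughly like this (a sketch of the
insert-only case, using the packing helpers above; the read-modify-write
needed to merge and remove IDs for existing objects is omitted, and the
path and database names are placeholders):

    import lmdb

    env = lmdb.open('/path/to/data', map_size=2 ** 40, max_dbs=2)
    obj_to_users = env.open_db(b'obj_to_users')
    user_to_objs = env.open_db(b'user_to_objs')

    def write_object(object_id, user_ids):
        with env.begin(write=True) as txn:
            # One record per object: the whole user ID list as the value.
            txn.put(pack_id(object_id), pack_ids(user_ids), db=obj_to_users)
            # putmulti() takes an iterable of (key, value) pairs.
            # NOTE: this overwrites any existing list for a user; the
            # merge logic is omitted in this sketch.
            curs = txn.cursor(db=user_to_objs)
            curs.putmulti([(pack_id(uid), pack_ids([object_id]))
                           for uid in user_ids])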

In the typical read scenario the service will read all user IDs for an
object ID, and then it will retrieve the list of object IDs for each of
these user IDs.
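
That is, a single read transaction along these lines (sketch, using the
same placeholder names as above):

    def read_object(object_id):
        with env.begin() as txn:
            raw = txn.get(pack_id(object_id), db=obj_to_users)
            user_ids = unpack_ids(raw) if raw else []
            # Fetch the object list for every associated user.
            return {uid: unpack_ids(txn.get(pack_id(uid), b'',
                                            db=user_to_objs))
                    for uid in user_ids}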

The reads and writes will be pretty random, and there will be many more
reads than writes.

I'm not sure if I'll need synchronous commits. Losing the last minute of
data in case of a crash would be acceptable, but getting a corrupt
database would not. The documentation
(https://lmdb.readthedocs.org/en/release/#environment-class) seems to
suggest that using sync=False would be okay as long as I also use
writemap=False, and use a filesystem that preserves write order. Which
filesystems preserve write order? How about ext4?
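
Concretely, I'm considering opening the environment along these lines (a
sketch; the path and sizes are placeholders):

    import lmdb

    env = lmdb.open('/path/to/data',
                    map_size=2 ** 40,   # maximum size the map may grow to
                    max_dbs=2,
                    sync=False,         # no fsync on every commit
                    metasync=False,     # skip the meta page fsync too
                    writemap=False)     # keep the integrity guarantee

    # env.sync(True) can still be called periodically (e.g. once a
    # minute) to bound how much data a crash can lose.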

To me, it seems like the only drawback of using separate environments
would be that the two DBs might end up in an inconsistent state after a
crash. I might also get inconsistent backup dumps. This I could probably
live with, if it would greatly increase my write throughput.
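
In other words, something like this (hypothetical sketch; paths are
placeholders):

    import lmdb

    # One Environment per database: each file gets its own writer lock,
    # so two background writers can commit in parallel, at the cost of
    # no cross-environment atomicity (the crash-consistency drawback
    # mentioned above).
    obj_env = lmdb.open('/data/obj_to_users.lmdb', map_size=2 ** 40)
    user_env = lmdb.open('/data/user_to_objs.lmdb', map_size=2 ** 40)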

In an ideal world I would be able to write extensive load tests that I
could use for profiling different options, but as the only developer on a
project with scaling pains, I'm aiming at "profiling it in production (TM)"
(running side by side with the current system).

Again, thanks a lot for your input. I've tried to provide as much detail
as possible about the system, and if you have any additional thoughts,
I'd greatly appreciate them!


On Tue, Feb 3, 2015 at 9:03 PM, David Wilson <dw@xxxxxxxx> wrote:

> Hi Jonatan!
> There's not quite enough info to work with here, and probably it's
> better if you just profile how your app behaves first. For example,
> * if you're writing all those keys from a sequential range, then
>   that will perform much better than completely random keys. If
>   your write txns are fast enough, then probably it's not worth the
>   effort of splitting your data into separate environments. Often
>   sequential write is faster by a factor of 2 or more.
> * Are your keys/values huge or tiny? To get an intuitive feel for
>   performance, it's sometimes helpful to think of txn size in terms of
>   total MB read/written; this is especially true when the entire DB is
>   cached or on very fast media. Writing 100,000 8-byte-key + 8-byte-value
>   pairs (1.52MiB) is going to be much faster than, say, 512-byte values
>   (49.59MiB).
> * if you're overwriting existing keys during the write txn, then that
>   changes things a lot too, since DB growth will be bounded.
> * Similarly, the presence/absence of MDB_NOSYNC (if suitable for your app)
>   drastically changes write performance characteristics.
> * How many concurrent writers do you need? Is it a web app? Another
>   option is to push your large writer off to a task queue.
> * Perhaps tweak examples/dirtybench.py from the Git repository to create
>   keys/values roughly of your size. This will produce some nice example
>   numbers for you to work from for your hardware/disk combination.
> Drop me a line if you need some more ideas, but probably it's best for
> you to just write some throwaway code to profile your specific use case.
> David
> On Tue, Feb 03, 2015 at 08:50:43AM -0800, Jonatan Heyman wrote:
> > Hi!
> >
> > I'm new to LMDB and python-lmdb, but so far I find it really promising
> > and well documented.
> >
> > I'm writing an app where I plan to use LMDB to store records in three
> > different named databases. One action in my app might result in
> > hundreds or thousands of read/writes to these databases. If I've
> > understood LMDB correctly, there is a global write lock for each
> > Environment. Would it then make sense for me to store my different
> > databases in three different Environments, in order to decrease the
> > time spent waiting for write locks?
> >
> > Best,
> > Jonatan
