[py-lmdb] Re: py-lmdb write performance

  • From: "Dinesh Vadhia" <dineshvadhia@xxxxxxxxxxx>
  • To: <py-lmdb@xxxxxxxxxxxxx>
  • Date: Tue, 27 May 2014 05:26:56 -0700

Hi David

That pretty much explains why it hasn't been working!

When you say "Instead you should probably wrap the database in an application-layer protocol that hides the file access behind a single small request/response roundtrip." - does such a layer exist or will a custom one have to be written?

Best ...
Dinesh

--------------------------------------------------
From: "David Wilson" <dw@xxxxxxxx>
Sent: Tuesday, May 27, 2014 4:56 AM
To: <py-lmdb@xxxxxxxxxxxxx>
Subject: [py-lmdb] Re: py-lmdb write performance

Hey Dinesh,

It is almost certainly unsafe to use LMDB over the network from multiple
clients, and even if it were safe, the performance is going to suck.

* The lock file will record incorrect information, assuming it
 does not become corrupt

* Each random page-in will involve at least 1 roundtrip and at least 4
 frames (1 tx, 3 rx), unless you've raised the MTU for your network
 segment.

* Each random page-out will involve similar numbers
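
The "3 rx" figure can be reproduced with a little arithmetic. This is only a
rough sketch: the 4096-byte page, the 1500-byte MTU and the 40 bytes of
IPv4 + TCP headers are all assumptions, and any overhead added by the network
filesystem protocol itself is ignored.

```python
import math

PAGE_SIZE = 4096   # assumed: a common OS page size
MTU = 1500         # assumed: standard Ethernet MTU
HEADERS = 40       # assumed: IPv4 + TCP headers, no options


def frames_for(payload_bytes, mtu=MTU, headers=HEADERS):
    """Frames needed to carry payload_bytes when each frame holds mtu - headers."""
    return math.ceil(payload_bytes / (mtu - headers))


rx_frames = frames_for(PAGE_SIZE)               # standard MTU
jumbo_frames = frames_for(PAGE_SIZE, mtu=9000)  # hypothetical jumbo-frame segment
print(rx_frames, jumbo_frames)
```

With a larger MTU the whole page fits in one frame, but the roundtrip itself
does not go away.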

More generally, Ethernet has vastly higher latency than RAM (roughly
500usec vs. 0.1usec) and vastly lower bandwidth (1Gbit/sec vs.
320+Gbit/sec).
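
To put the latency figures in perspective, here is a back-of-the-envelope
calculation. The page count is a made-up workload, and it assumes the worst
case where every page access pays one full latency with nothing cached or
overlapped.

```python
# Latency figures quoted above, in microseconds.
ETHERNET_LATENCY_US = 500.0
RAM_LATENCY_US = 0.1


def total_latency_ms(page_accesses, latency_us):
    """Time spent waiting if every page access pays one full latency."""
    return page_accesses * latency_us / 1000.0


pages = 1000  # hypothetical number of random page accesses
over_network_ms = total_latency_ms(pages, ETHERNET_LATENCY_US)
in_ram_ms = total_latency_ms(pages, RAM_LATENCY_US)
slowdown = over_network_ms / in_ram_ms
print(over_network_ms, in_ram_ms, slowdown)
```

The ratio does not depend on the page count; it is simply the ratio of the two
latencies.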

Accessing the raw database file over a network may incur this penalty
for every page of the file that is accessed. Instead you should
probably wrap the database in an application-layer protocol that hides
the file access behind a single small request/response roundtrip.
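
A minimal sketch of what such a wrapper could look like, using only the Python
standard library for the network part. Everything here is hypothetical: the
GET/PUT line protocol, the class names and the hex encoding are invented for
illustration, and keys and values are assumed to be non-empty. DictStore is an
in-memory stand-in so the sketch runs without LMDB installed; LmdbStore shows
where py-lmdb would slot in, with the database file living on the server's
local disk.

```python
import socket
import socketserver
import threading


class DictStore:
    """In-memory stand-in so the sketch runs without LMDB installed."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def put(self, key, value):
        with self._lock:
            self._data[key] = value


class LmdbStore:
    """Same interface backed by py-lmdb on the server's local disk."""

    def __init__(self, path):
        import lmdb  # imported lazily so DictStore works without it
        self._env = lmdb.open(path)

    def get(self, key):
        with self._env.begin() as txn:
            return txn.get(key)

    def put(self, key, value):
        with self._env.begin(write=True) as txn:
            txn.put(key, value)


class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        # One request line in, one reply line out: a single small roundtrip.
        for line in self.rfile:
            parts = line.split()
            store = self.server.store
            if len(parts) == 2 and parts[0] == b"GET":
                value = store.get(bytes.fromhex(parts[1].decode()))
                reply = b"MISSING" if value is None else b"OK " + value.hex().encode()
            elif len(parts) == 3 and parts[0] == b"PUT":
                store.put(bytes.fromhex(parts[1].decode()),
                          bytes.fromhex(parts[2].decode()))
                reply = b"OK"
            else:
                reply = b"ERR"
            self.wfile.write(reply + b"\n")


class KVServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address, store):
        super().__init__(address, Handler)
        self.store = store


class KVClient:
    def __init__(self, host, port):
        self._sock = socket.create_connection((host, port))
        self._file = self._sock.makefile("rwb")

    def _call(self, line):
        self._file.write(line + b"\n")
        self._file.flush()
        return self._file.readline().split()

    def get(self, key):
        reply = self._call(b"GET " + key.hex().encode())
        return bytes.fromhex(reply[1].decode()) if reply[0] == b"OK" else None

    def put(self, key, value):
        return self._call(b"PUT " + key.hex().encode() + b" " + value.hex().encode()) == [b"OK"]

    def close(self):
        self._file.close()
        self._sock.close()


def demo():
    server = KVServer(("127.0.0.1", 0), DictStore())  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    client = KVClient(*server.server_address)
    client.put(b"a", b"b")
    result = (client.get(b"a"), client.get(b"nope"))
    client.close()
    server.shutdown()
    server.server_close()
    return result


if __name__ == "__main__":
    print(demo())
```

Only the server process ever opens the database, so the lock file and the
memory map stay local to one machine, and each client operation costs one
request/response instead of one roundtrip per page.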


David

On Tue, May 27, 2014 at 04:07:55AM -0700, Dinesh Vadhia wrote:
Hi! The write performance from machines on a fast network cluster to a file
system is not that great.  The write code is:

        with env.begin(write=True) as txn:
            txn.put(b'a', b'b')  # py-lmdb keys and values are bytes

The hardware and network will impact performance but is it also because lmdb is
not geared for distributed computing?

Best ...
Dinesh



