[py-lmdb] Fw: Re: py-lmdb write performance

  • From: "Dinesh Vadhia" <dineshvadhia@xxxxxxxxxxx>
  • To: <py-lmdb@xxxxxxxxxxxxx>
  • Date: Thu, 29 May 2014 02:15:06 -0700

Just noticed that on using create_env or open_env for the first time:

- on Windows, the data.mdb is created with size=map_size
- on Linux, data.mdb is created with size ~12K irrespective of map_size. Once db population starts, map_size disk space is allocated.

If so, then why is it taking 2.5 hours to write ~1gb of data?
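One way to check what is actually happening with data.mdb on each platform is to compare the file's apparent size with the space the filesystem has really allocated for it. The sketch below is only an illustration using the standard library: the helper name `file_sizes` is made up, and `st_blocks` is a POSIX field (assumed here to be in 512-byte units) that is not available on Windows, so the allocated size comes back as None there.

```python
import os
import tempfile

def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes or None) for a file."""
    st = os.stat(path)
    apparent = st.st_size
    # st_blocks is POSIX-only; assumed to count 512-byte units.
    blocks = getattr(st, "st_blocks", None)
    allocated = blocks * 512 if blocks is not None else None
    return apparent, allocated

# Demo on a throwaway file: extend it to 1 MiB without writing any data.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bin")
    with open(demo_path, "wb") as f:
        f.truncate(1024 * 1024)
    apparent, allocated = file_sizes(demo_path)
    print("apparent:", apparent, "allocated:", allocated)
```

Pointing `file_sizes` at the real data.mdb before and after the load would show whether the file is sparse (allocated much smaller than apparent) or fully allocated.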

From: "Dinesh Vadhia" <dineshvadhia@xxxxxxxxxxx>
Sent: Wednesday, May 28, 2014 10:15 AM
To: <py-lmdb@xxxxxxxxxxxxx>
Subject: Re: [py-lmdb] Re: py-lmdb write performance

Looks like the (CentOS-based) cluster is not creating the lmdb db with the correct map_size. A small map_size works fine, but larger ones (e.g. > 30 GB) create a 12K db! Not sure what is going on, but the admins are looking into it. Not sure yet whether it is an lmdb or an OS problem.

From: "David Wilson" <dw@xxxxxxxx>
Sent: Wednesday, May 28, 2014 5:57 AM
To: <py-lmdb@xxxxxxxxxxxxx>
Subject: [py-lmdb] Re: py-lmdb write performance

That's about 120 KB/sec, which doesn't sound right.
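For reference, the rough arithmetic behind that figure can be sketched as follows. It assumes "~1gb" means 1 GiB (1024**3 bytes) and that the run took exactly 2.5 hours; the function name is just for illustration.

```python
def throughput_kib_per_sec(total_bytes, seconds):
    """Average write rate in KiB per second."""
    return total_bytes / seconds / 1024

# Assumed inputs: 1 GiB written over 2.5 hours.
rate = throughput_kib_per_sec(1024 ** 3, 2.5 * 3600)
print(f"{rate:.1f} KiB/s")
```

With decimal units (10**9 bytes, 1000-byte kB) the result is a little lower, roughly 111 kB/s, so either way it is on the order of 100-120 KB/sec.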

On Wed, May 28, 2014 at 05:48:47AM -0700, Dinesh Vadhia wrote:
Generated sorted keys to create local dictionaries on each cluster machine;
next, one machine merges each dictionary in sorted order into the db. But it
still takes ~2.5 hours to write ~1gb of data with append=True.
That doesn't sound right, does it?
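The merge step described above can be sketched with the standard library's heapq.merge, which lazily combines several already-sorted inputs into one sorted stream. This is only a guess at the shape of the pipeline, not the poster's actual code: `merge_sorted_items`, `node_a` and `node_b` are made-up names, and the per-machine dictionaries are assumed to be available as lists of (key, value) byte pairs sorted by key.

```python
import heapq

def merge_sorted_items(*sources):
    """Yield (key, value) pairs from several key-sorted iterables in global key order."""
    return heapq.merge(*sources, key=lambda kv: kv[0])

# Hypothetical per-machine outputs, each already sorted by key.
node_a = [(b"apple", b"1"), (b"cherry", b"3")]
node_b = [(b"banana", b"2"), (b"date", b"4")]

merged = list(merge_sorted_items(node_a, node_b))
for key, value in merged:
    print(key, value)
```

The merged stream is what would then be fed to the put calls with append=True. As far as I understand it, append mode expects keys to arrive in strictly increasing order, so keys duplicated across machines would need to be resolved before writing.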
