Hi David

Do you remember from a few months back I was trying to get LMDB to work across network attached storage and got into a right mucking fuddle? Anyway, moving on ...

On a multi-core machine and/or a cluster of multi-core machines with network attached storage, would it be possible for each core to run an in-memory-only LMDB db instance? The data could be loaded into each core's LMDB db at startup. Alternatively, pre-create the LMDB db for each core and store it on the network storage; at startup, load the db into each core's memory. Each per-core LMDB db would have at most 1m records. Hope this makes sense!

Best ...
Din

--------------------------------------------------
From: "David Wilson" <dw@xxxxxxxx>
Sent: Thursday, May 29, 2014 8:35 AM
To: <py-lmdb@xxxxxxxxxxxxx>
Subject: [py-lmdb] Re: py-lmdb write performance

On Thu, May 29, 2014 at 07:46:29AM -0700, Dinesh Vadhia wrote:

> - Next, one machine writes each dictionary data to lmdb on filesystem
>   across network which takes ~2.5 hours per dictionary.

It sounds like you are still opening an LMDB database over a networked filesystem. Just to make this clear: mounting an NFS / SMB / CIFS / Ceph filesystem then calling "lmdb.open(/path/to/that/filesystem)" is slow and fundamentally broken, you should never do it. If you are experiencing slowness in this configuration, it is because this configuration is slow and fundamentally broken.

As previously discussed, you should stream the database over the network using some alternative means to the machine that will open the LMDB database on its local disk. Alternatively you could export the volume over e.g. iSCSI or ATAoE, neither of which suffer the caching and coherency problems of NFS.

> Attached are output for dirtybench.py from Windows and Linux.

Are you experiencing the problem on Linux or Windows? I cannot tell if your host environment is Windows or Linux - I asked for dirtybench output only from the slow environment. I really cannot help if you do not answer my questions or pay attention to my responses.

>> * Are you still using a network filesystem? We already know that is broken
>> * What OS?
>> * What filesystem?
>> * What host machine?
>> * Does your job start fast, and then slow down? If so, is your dataset larger than RAM?
>> * Are there any other users of the machine that might cause it to be slow?
>> * How large are your transactions? (how many records / how many GB).
>> * Have you tried splitting your writes into smaller txns?

You did not answer these questions
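
For illustration only, here is a minimal sketch of the "copy it to local disk first, then open it there" advice quoted above. It is not code from this thread. The helper names stage_locally and open_local_copy are hypothetical, and the sketch assumes the environment was created with py-lmdb's default directory layout (a data.mdb file inside a directory) and that nothing is writing to the database while it is being copied. Pointing local_dir at a RAM-backed filesystem (for example /dev/shm on Linux) might be one way to approximate the in-memory setup asked about at the top, but that is an untested assumption.

```python
import os
import shutil


def stage_locally(network_dir, local_dir, chunk_size=16 * 1024 * 1024):
    """Stream data.mdb from network storage into a local directory.

    The file is read as an ordinary byte stream, so LMDB never maps the
    network filesystem. The lock file is deliberately not copied.
    Returns local_dir.
    """
    os.makedirs(local_dir, exist_ok=True)
    src = os.path.join(network_dir, "data.mdb")
    dst = os.path.join(local_dir, "data.mdb")
    tmp = dst + ".partial"
    # Copy to a temporary name first so an interrupted transfer never
    # leaves a truncated data.mdb behind.
    with open(src, "rb") as fin, open(tmp, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
    os.replace(tmp, dst)
    return local_dir


def open_local_copy(network_dir, local_dir):
    """Stage the database locally, then open the local copy read-only."""
    import lmdb  # py-lmdb (third-party); imported here so staging works without it

    path = stage_locally(network_dir, local_dir)
    return lmdb.open(path, readonly=True)
```

Each worker process would call open_local_copy() once at startup with its own source directory; only the plain file copy touches the network mount.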