[gameprogrammer] Re: sqlite example program

  • From: Bob Pendleton <bob@xxxxxxxxxxxxx>
  • To: gameprogrammer@xxxxxxxxxxxxx
  • Date: Thu, 27 May 2010 21:33:55 -0500

On Sat, May 22, 2010 at 6:31 AM, Ethan Grammatikidis
<eekee57@xxxxxxxxxxx> wrote:
>
> On 22 May 2010, at 06:56, Chris Nystrom wrote:
>>
>> SQLite seems to be far more trouble to use than simply creating
>> arrays. I assume that as you scale up though, it is important for
>> performance to have an actual database instead of internal arrays of
>> data. Is there a rule of thumb about when you should use a database?
>> Or is there some other reason to use SQL besides performance with
>> large systems?
>
> I've just begin a very long project to gather data on just this sort of
> question, just when where and how is the right way to use each available
> technology. An inevitable side effect will be finding out if some
> technologies are worth anything at all.
>
> As to SQL itself I haven't anything definite yet, just an impression that
> for a straightforward key:value mapping it may not be worth using at all.
> Filesystems provide key:value as name:file, and am I right in thinking that
> file access is generally faster than database access? Last I heard (about a
> year ago) MySQL was much higher performance than SQLite, and I don't think
> MySQL can't match speed with a filesystem lookup. (How can it, when it uses
> the filesystem as a backend anyway?)

This is a good point. For simple key values pairs the file system
looks like a nice alternative to using a DBMS. In fact, a DBMS may use
separate files to store what are called blobs (Binary Large OBjects).
There are a number of problems with using the file system this way.
One is that different files systems have different limits. NTFS lets
you have roughly 4 billion files in a directory. That seems like
enough for small data bases. But, you only get a 255 character tag.
That might  be enough, or it might not. On Linux, the limits depend on
which file system you use. It is hard to find exact values. The
maximum number of subdirectories seems to range form 32k to 64k. NTFS
uses a B+ tree to store the index. Not bad. Some Linux files systems
use versions of B trees and also very high performance H trees. But,
by default, a lot of them only use those fancy indexes if you have run
the correct configuration command. Because there are rarely more than
a few hundred files or subdirectories in a directory the Unix/Linux
files systems save a lot of space by just using a sequential look up.

Do you need duplicate keys? If yes, then you can not use the file
system because you can't have two files with the same name.

Then there is the problem of block size. That is the minimum amount of
information allocated for a file. You can count on it being at least
1K. Newer files systems are going to 4k. So, if you are storing 47
bytes of data you use a minimum of 1K, plus a fairly heavy file name
record. In a world with Terabyte drives you might not worry about
that.

So, yeah. Some times this is a good idea. Even a very good idea. If
you applications fits these requirements.

A DBMS can easily use a better indexing algorithm. It can use one
tuned to the information it has about each table that is being
indexed. Each data field in a record can be stored in a very efficient
manor so that you can fit a lot more information on each drive.

A DBMS uses the file system to open files and to read an write files.
But, it does not make use of the file systems indexing system to look
up records. The indexing systems in file systems are designed to be
fast enough so that files can be found and opened quickly.  But, a a
program normally doesn't have that many files open at once. It doesn't
open that many files during the run time of a program. If you use the
file system as a database then the program will be open an closing a
lot of files.

I would write a test program and see how long it takes to open and
close a file on your system. I would see how many files I can create
in a single subdirectory. And, I would make sure that I knew what was
on my customers machines. This is a much better idea on  NTFS than on
FAT32. It is a really bad idea on FAT32. It might be a good idea on
Linux depending on the file system and how it is tuned.

Like I said, this is not a bad idea. It could be a very good idea.
But, it could really kind of suck too. So, use it if it works for your
application.

Bob Pendleton




>
> --
> Simplicity does not precede complexity, but follows it. -- Alan Perlis
>
>
> ---------------------
> To unsubscribe go to http://gameprogrammer.com/mailinglist.html
>
>
>



-- 
+-----------------------------------------------------------
+ Bob Pendleton: writer and programmer
+ email: Bob@xxxxxxxxxxxxx
+ web: www.TheGrumpyProgrammer.com

---------------------
To unsubscribe go to http://gameprogrammer.com/mailinglist.html


Other related posts: