[gameprogrammer] Re: sqlite example program

  • From: Bob Pendleton <bob@xxxxxxxxxxxxx>
  • To: gameprogrammer@xxxxxxxxxxxxx
  • Date: Wed, 26 May 2010 16:45:12 -0500

Since no one has mentioned this, I guess I will. The main reason to
use a data base system is ACID.

ACID
ACID short for Atomicity – Consistency – Isolation – Durability and
describes the four properties of an enterprise-level transaction:

    * ATOMICITY: a transaction should be done or undone completely. In
the event of an error or failure, all data manipulations should be
undone, and all data should rollback to its previous state.
    * CONSISTENCY: a transaction should transform a system from one
consistent state to another consistent state.
    * ISOLATION: each transaction should happen independently of other
transactions occurring at the same time.
    * DURABILITY: Completed transactions should remain
stable/permanent, even during system failure.

Insuring that every transaction is fully ACID compliant adds so
complexity and overhead to the code that implements the database. But,
it is absolutely wonderful when your program dies, or your system
dies, and your data base integrity has not been compromised. Nothing
quite as much fun as walking through a database looking for broken
links and orphaned entries added by a transaction that was in play
when the admin tripped over the power cord. :-)

You can not get ACID compliance with any kind of purely in memory data
management system.

Another thing I think I noticed is a bit of confusion about SQL and
databases. SQL is the not nearly Standard enough Query Language and it
is used as an interface to most relation database management systems.
But, if you look around, you'll find that a lot of DBMSes have a C/C++
api that doesn't use a lot of SQL. In fact, some of them don't use any
SQL at all. Then there are the DBMSes that don't use SQL at all. Take
a look at DBM, AKA GDBM. Go to Google and type in "man unix dbm" to
find the man page. You'll probably have to install the package to use
it. GDBM is a replacement for DBM which is a linked in, lives on disk,
completely non SQL DBMS.

The point of all that is that you can have a database with out SQL.
I'm so old that when I took the class on databases in college we spent
half the class on Codasyl databases and the other half on those new
fangled relational databases. We looked at the relational databases
just because my professor thought they might be important some day. We
used Ingres which used its own QUEL query language. Ingres was the
second relational database and did not support SQL. If you want to
keep stuff on disk and can live without guaranteed ACID compliance
look at something like DBM. It is really easy to use and works great.

Now days, we are starting to see the cracks in the relational database
model. RDBMSes are really good up to a certain size. After that they
start to fall apart. The new big deal in DBMSes are the things like
Googles BigTable and hadoop. These are systems that allow simple
tag,value databases to span across thousands of machines. But, I don't
think you are worried about petabyte databases right now. :-)

Getting back to your question. Database management systems are
designed to provide ACID compliance. (At least most of them are.) If
you need to have your data survive a hardware failure or a system
crash you need to use a database. Database servers, even when they are
just an old PC, can be much faster than just keeping the data in
memory if the data is big enough. That is because the data is indexed
and the indexing systems have been tuned and worked over for decades.

The indexing systems are part of the real value of a DBMS. Let's say I
start out needing to store a few under key,value pairs in my array. I
can just do a sequential search to find values, I can add new values
at the end or in empty slots. And I can delete pairs by just setting
their slot to NULL. Works great. Its simple and reliable you can do it
up in a few minutes.

What happens when you need to start storing several thousand items?
Then your simple sequential search starts taking a meaningful amount
of time. When you hit millions (not hard to do on hardware that can
support 8 to 16 gigabytes) that simple search gets to be a real
problem.

So what do you do? Well, you move to a hash table and it gets fast
again. Unless you have multiple copies of the same tag. The simplest
fastest kinds of hash tables don't like multiple identical tags. You
discover then that multmap is your friend, or you wind up writing a
lot of code.

The same thing happens if you have multiple tables with relationships
between the tables. The look ups have to be done for each table and
deletions and insertions have to work across all the different tables.
You wind up implementing most of a database to get this all to work.
Or, you learn a lot about more about the STL than you ever thought you
wanted to know.

I've done it both ways several times. I have a data base that I use to
generate about 8 thousand web pages. I read every table into the
appropriate STL data type and do the work in RAM. It took less than a
minute to run last time I ran it. It took a fair amount of time to
write that code. I did it because when I put the data on a server and
used SQL to do the queries I had to let the job run over night. It was
walking over every entry in the entire database doing all sorts of
weird things including having to resort the data several different
ways. And, at the time I was using MySQL running on a 300 Mhz PII that
I picked up as surplus from some junk store. :-) So, it made sense do
it all in memory.

It took almost no time to code it up using SQL and ODBC to connect to
my MySQL database. OTOH, coding it up in C++ using STL took a while to
do and a while longer to get right. :-)

Basically, if your database is simple, at most a few tables with
simple relationships, the data all fits in memory, and you can live
without ACID, then do it in RAM. You might want to make your life
easier by learning the STL or by finding a nice implementation of a
hash table. If that does not describe your application, use a DBMS.
You don't have to use SQL with your DBMS. Like I said, a lot of DBMSes
have a C/C++ api that you can use instead. If you are somewhere in
between then look at things like good old DBM or Berkeley DB. I do
believe that Berkeley DB may be just what you are looking for, it is
like sqlite but has not SQL.

Bob Pendleton



On Sat, May 22, 2010 at 12:56 AM, Chris Nystrom <cnystrom@xxxxxxxxx> wrote:
> Below is an example c program that I wrote to use the SQLite library
> (http://www.sqlite.org/). It compiles without error under Linux.
>
> SQLite seems to be far more trouble to use than simply creating
> arrays. I assume that as you scale up though, it is important for
> performance to have an actual database instead of internal arrays of
> data. Is there a rule of thumb about when you should use a database?
> Or is there some other reason to use SQL besides performance with
> large systems?
>
> Chris
>
> --
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <sqlite3.h>
>
> sqlite3 *db;
>
> void open_db(char *db_name)
> {
>    if (sqlite3_open(db_name, &db)) {
>        fprintf(stderr, "Can't open database: %s\n", sqlite3_errmsg(db));
>        sqlite3_close(db);
>    }
> }
>
> char **result;
> int nrow = 0;
> int ncol = 0;
>
> int db_exe(char *s_exe)
> {
>    int rc;
>    char *zErrMsg = 0;
>
>    rc = sqlite3_get_table(db,  /* An open database */
>                           s_exe,       /* SQL to be executed */
>                           &result,     /* Result written to a char *[] */
>                           &nrow,       /* Number of result rows */
>                           &ncol,       /* Number of result columns */
>                           &zErrMsg     /* Error msg written here */
>        );
>
>    return rc;
> }
>
> void db_close(void)
> {
>    sqlite3_close(db);
> }
>
> void get_data(void)
> {
>    int i, j;
>
>    if (nrow > 0) {
>
>        printf("nrow = %d\n", nrow);
>        printf("ncol = %d\n\n", ncol);
>
>        /* header */
>        for (i = 0; i < ncol; ++i)
>            printf("%s\t", result[i]);
>        printf("\n");
>
>        /* data rows */
>        for (i = 0; i < nrow; i++) {
>            for (j = 0; j < ncol; j++) {
>                printf("%s\t", result[(ncol * (i + 1)) + j]);
>            }
>            printf("\n");
>        }
>    }
>
>    sqlite3_free_table(result);
> }
>
> int main(int argc, char **argv)
> {
>    char *db_name = "test.db";
>
>    /* remove old db if it exists */
>    remove(db_name);
>
>    /* create db */
>    open_db(db_name);
>
>    /* create emp table */
>    db_exe("create table emp (name varchar(15),age int,weight double)");
>
>    /* insert data into tables */
>    db_exe("insert into emp (name,age,weight) values ('Bob',47,172)");
>    db_exe("insert into emp (name,age,weight) values ('Sue',38,134)");
>
>    /* show data in emp table */
>    db_exe("select * from emp");
>
>    get_data();
>
>    return 0;
> }
>
> --
> E-Mail: Chris Nystrom <cnystrom@xxxxxxxxx>
> Saving the world from web programming.
> http://www.newio.org - G-Talk: cnystrom
>
> ---------------------
> To unsubscribe go to http://gameprogrammer.com/mailinglist.html
>
>
>



-- 
+-----------------------------------------------------------
+ Bob Pendleton: writer and programmer
+ email: Bob@xxxxxxxxxxxxx
+ web: www.TheGrumpyProgrammer.com

---------------------
To unsubscribe go to http://gameprogrammer.com/mailinglist.html


Other related posts: