[yunqa.de] Re: SQLite3 - EOutOfMemory when concatenating strings

  • From: Delphi Inspiration <delphi@xxxxxxxx>
  • To: yunqa@xxxxxxxxxxxxx
  • Date: Fri, 06 Feb 2009 11:16:00 +0100

Clyde England wrote:

>Basically, the problem occurs when building a string in memory which is just 
>the concatenation of a variable from several records of separate SQL Queries. 
>However, it only happens when you apply the Pragma "PRAGMA cache_size = 20000" 
>
>I had originally put in this statment to help speed up the database, 

Generally speaking, your are right in saying that PRAGMA cache_size can speed 
up database operation. But performance improvements do vary depending on the 
database schema (indexed vs. non-indexed) and operation (reads vs. writes).

Your example code performs a sequential table scan. Each record has to be read 
into memory. In this context, it does not help much if pages of already used 
records are still kept around - they just occupy memory if not used again later.

So unless you will reuse those old record pages in later DB operations, a 
larger cache_size does not necessarily speed up your DB access.

>and my understanding is that this would add about 30mb (20000 x 1.5k) memory 
>overhead to the application. I thought this to be quite small considering the 
>amount of memory in computers these days.

True again, but you forgot to take into account memory fragmentation. 20000 
database pages require 20000 small memory allocations. Unless the memory 
manager can find an empty slot of the required size, it will use a larger slot, 
taking up more memory than just the 20000 x 1.5k.

Additionally, there is a small memory overhead for page maintainance, so your 
application will end up using more than just the 20000 x 1.5k memory. But this 
still is not the whole story ...

>However, it appears this Pragma has a very strange effect on string 
>concatenation in Delphi. Removing this Pragma (which I believe then uses the 
>default value of 2000) no longer causes a problem.
>
>The following is the small snipped of code that will produce the out of memory 
>exception. Some extra notes that might help:
>
>1. The sample code consistently throws the out of memory error after about 
>3000 iterations
>2. The "CacheMemo" Table being queried is nothing special, with the field 
>(column_str16(1)) being text entries ranging from about 100 to 5000 bytes (5k)
>3. The line of code that causes the "out of memory error" is: work := work + 
>stmt.column_str16(1);

This line is evil code because it involves a lot of behind-the-scenes memory 
allocation and reallocation. As a rule of thumb, never concat large strings 
3000 times or more using the ordinary string + string syntax. Delphi's string 
handler simply does not work well with huge strings.

On top of this, you are assigning a WideString (column_str16) to an AnsiString. 
This again involves expensive behind-the-scenes processing, namely character 
conversion this time.

All this drastically slows down your appliation to a point where the increased 
cache_size becomes completely useless. Just run your code without the string + 
string concatenation and you will see that it performs in just a second or two 
instead of a minute or more (my timings, your timings may vary).

>4. I have debugged the code and just before the crash have determined the size 
>of the field "work" is only a  mere 3mb in size. Delphi string variables 
>should be allowed to grow to 2gb
>5. Removing the "PRAGMA cache_size = 20000" allows this code to run without 
>crashing. In fact, I have run the code on a larger database with over 50,000 
>entries and it runs OK. The moment I add ""PRAGMA cache_size = 20000", it 
>crashes after about 3000 iterations again.
>
>Why should this Pragma effect the way strings are concatenated and throw this 
>error?
>
>Is this a known bug/problem?

Yes and no. It is a known problem of Delphi's string handling, but not of 
DISQLite3 and your cache_size setting.

Possible solutions:

1. Avoid extensive string + string concatenations. Instead, use smart string 
concatenation functions as available in my DIUtils.pas (see source code 
comments of attached project file for details). This avoids the EOutOfMemory 
error and improves performance significantly.

2. Change your works variable from AnsiString to WideString. My tests show that 
this helped against the out of memory error, but not again slow performance.

3. Use a better memory manager like FastMM. In my testing, this helps to avoid 
the EOutOfMemory exception but does not improve performance.

I attached my console test project so you may experiment with the different 
scenarios discussed and suggested above.

Ralf 
{ Copyright (c) 2009 Ralf Junker, The Delphi Inspiration <delphi@xxxxxxxx> }

program DISQLite3_String_Concat;

{$APPTYPE CONSOLE}
{$I DICompilers.inc}

uses
  { Fast Memory Manager is an improved memory manager. }
  //FastMM4,
  { DIUtils.pas contains fast ConCatStr functions. It is MPL licensed and is
    part of various DI packagses (DIContainers, DIRegEx, DIHtmlParser,
    DIZipWriter). DIUtils.pas comes with full code code included. }
  DIUtils,
  SysUtils, DISQLite3Api, DISQLite3Database;

const
  REC_COUNT = 10000;

procedure FillDb(const DB: TDISQLite3Database);
var
  a: array[1..5 * 1024] of AnsiChar;
  i: Integer;
  Stmt: TDISQLite3Statement;
begin
  RandSeed := 0; // Ensure that random string length are reproducable.
  FillChar(a, SizeOf(a), 32);
  DB.StartTransaction;
  try
    DB.Execute('create table CacheMemo (ID integer primary key, t text)');
    Stmt := DB.Prepare('insert into CacheMemo (t) values (?)');
    try
      for i := 1 to REC_COUNT do
        begin
          Stmt.Bind_Text(1, @a[1], 100 + Random(4900), SQLITE_STATIC);
          Stmt.StepAndReset;
        end;
    finally
      Stmt.Free;
    end;
  finally
    DB.Commit;
  end;
end;

//------------------------------------------------------------------------------

{ Original. Runs out of memory because of memory fragmentation. }
procedure test(const DB: TDISQLite3Database);
var
  x, i: Integer;
  { Was just string, but change to explicit AnsiString for Delphi 2009. }
  work: AnsiString;
  Stmt: TDISQLite3Statement;
begin
  DB.Execute('PRAGMA cache_size = 20000;');
  work := '';
  x := 0;
  try
    while x < 10000 do
      begin
        Inc(x);
        Stmt := DB.Prepare16('select * from cachememo where rowid = ' + 
IntToStr(x));
        try
          i := Stmt.Step;
          if i = SQLITE_ROW then
            begin
              work := work + Stmt.Column_Str16(1);
            end;
        finally
          Stmt.Free;
        end;
      end;
  except
    on e: Exception do
      begin
        WriteLn(e.Message);
        WriteLn('Iteration: ', x);
        WriteLn('Length of String: ', Length(work));
        ReadLn;
      end;
  end;
end;

//------------------------------------------------------------------------------

{ Uses WideString instaead of AnsiString to avoid one cycle of memory
  allocation and character conversion. }
procedure test_WideString(const DB: TDISQLite3Database);
var
  x, i: Integer;
  { Changed to WideString instead of string (which is AnsiString in D5). }
  work: WideString;
  Stmt: TDISQLite3Statement;
begin
  DB.Execute('PRAGMA cache_size = 20000;');
  work := '';
  x := 0;
  try
    while x < 10000 do
      begin
        Inc(x);
        Stmt := DB.Prepare16('select * from cachememo where rowid = ' + 
IntToStr(x));
        try
          i := Stmt.Step;
          if i = SQLITE_ROW then
            begin
              work := work + Stmt.Column_Str16(1);
            end;
        finally
          Stmt.Free;
        end;
      end;
  except
    on e: Exception do
      begin
        WriteLn(e.Message);
        WriteLn('Iteration: ', x);
        WriteLn('Length of String: ', Length(work));
        ReadLn;
      end;
  end;
end;

//------------------------------------------------------------------------------

{ Us ConCatStr functions: Very fast and no out of memory exceptions. }
procedure test_ConCatStr(const DB: TDISQLite3Database);
var
  x, i: Integer;
  work: WideString;
  lWork: Cardinal;
  Stmt: TDISQLite3Statement;
begin
  DB.Execute('PRAGMA cache_size = 20000;');
  work := '';
  lWork := Length(work);
  x := 0;
  try
    while x < 10000 do
      begin
        Inc(x);
        Stmt := DB.Prepare16('select * from cachememo where rowid = ' + 
IntToStr(x));
        try
          i := Stmt.Step;
          if i = SQLITE_ROW then
            begin
              ConCatStrW(Stmt.Column_Str16(1), work, lWork);
            end;
        finally
          Stmt.Free;
        end;
      end;
    SetLength(work, lWork);
  except
    on e: Exception do
      begin
        WriteLn(e.Message);
        WriteLn('Iteration: ', x);
        WriteLn('Length of String: ', Length(work));
        ReadLn;
      end;
  end;
end;

//------------------------------------------------------------------------------

var
  DB: TDISQLite3Database;
begin
  try
    DB := TDISQLite3Database.Create(nil);
    try
      DB.DatabaseName := 'test.db3';

      try
        DB.Open;
      except
        DB.CreateDatabase;
        FillDb(DB);
      end;

      // test(DB);
      // test_WideString(DB);
      test_ConCatStr(DB);

    finally
      DB.Free;
    end;

  except
    on e: Exception do
      WriteLn(e.Message);
  end;

  WriteLn;
  WriteLn('Done - Press ENTER to Exit');
  ReadLn;
end.

Other related posts: