[delphizip] Re: Windows 95 Compatibility [continuing]

  • From: R Peters <rpeters@xxxxxxxxxxxxx>
  • To: delphizip@xxxxxxxxxxxxx
  • Date: Thu, 30 Sep 2010 09:28:35 +1000

Unfortunately, when searching file names within a zip, most differences are
late in the name (after the path), which makes the hash worthwhile.
I also use hash tables to quickly locate duplicates etc. when adding files -
I hit real speed problems when I started handling large zips.
To extract given files from a zip, at worst each given name must be compared
to every entry in the zip (the worst case is when the files are specified in
reverse natural order). I do mark files that have already been extracted, so
once located they are not compared again, but it can still mean comparing
thousands of names thousands of times, so I had to do something to improve
it.
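
A minimal sketch of the kind of name index described above, not the actual
DelphiZip code. It assumes a fixed-size chained hash table keyed on the
upper-cased name; TNameIndex, HashKey, SlotOf, BucketCount and the FNV-1a
hash are all illustrative choices, and SysUtils.UpperCase only folds ASCII
letters, so real code would need a Unicode-aware conversion.

```pascal
program NameIndexSketch;
{$APPTYPE CONSOLE}
{$Q-}{$R-} // the hash relies on wrap-around Cardinal arithmetic

uses
  SysUtils;

const
  BucketCount = 1024;            // hypothetical size, must be a power of two
  FnvPrime: Cardinal = 16777619; // FNV-1a multiplier

type
  TNameIndex = class
  private
    FKeys: array of string;  // upper-cased names, in insertion order
    FNext: array of Integer; // next entry in the same bucket, -1 ends a chain
    FBuckets: array[0..BucketCount - 1] of Integer; // head of each chain
  public
    constructor Create;
    function Find(const Name: string): Integer; // entry index, or -1
    function Add(const Name: string): Boolean;  // False if already present
  end;

// FNV-1a hash over the characters of an already upper-cased key.
function HashKey(const Key: string): Cardinal;
var
  I: Integer;
begin
  Result := 2166136261;
  for I := 1 to Length(Key) do
    Result := (Result xor Cardinal(Ord(Key[I]))) * FnvPrime;
end;

// Reduce the hash to a bucket number.
function SlotOf(const Key: string): Integer;
begin
  Result := Integer(HashKey(Key) and Cardinal(BucketCount - 1));
end;

constructor TNameIndex.Create;
var
  I: Integer;
begin
  inherited Create;
  for I := 0 to BucketCount - 1 do
    FBuckets[I] := -1;
end;

function TNameIndex.Find(const Name: string): Integer;
var
  Key: string;
begin
  Key := UpperCase(Name);
  Result := FBuckets[SlotOf(Key)];
  // Only names that fall in the same bucket are compared in full.
  while (Result >= 0) and (FKeys[Result] <> Key) do
    Result := FNext[Result];
end;

function TNameIndex.Add(const Name: string): Boolean;
var
  Key: string;
  Slot, Idx: Integer;
begin
  Result := Find(Name) < 0;
  if not Result then
    Exit; // duplicate name, nothing added
  Key := UpperCase(Name);
  Slot := SlotOf(Key);
  Idx := Length(FKeys);
  SetLength(FKeys, Idx + 1);
  SetLength(FNext, Idx + 1);
  FKeys[Idx] := Key;
  FNext[Idx] := FBuckets[Slot]; // push onto the front of the chain
  FBuckets[Slot] := Idx;
end;

var
  Names: TNameIndex;
begin
  Names := TNameIndex.Create;
  try
    Names.Add('docs/readme.txt');
    Names.Add('src/Main.pas');
    WriteLn(Names.Find('SRC/MAIN.PAS'));   // index of the matching entry
    WriteLn(Names.Add('DOCS/README.TXT')); // duplicate, so not added
  finally
    Names.Free;
  end;
end.
```

With an index like this, each requested name costs one hash plus a handful
of full compares, instead of a scan over every entry in the zip.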

Whether I use a hash or not, I still need to convert to uppercase to do
wildcard compares of names (I optimized that function to work backwards when
it can). I had considered dropping non-NT support, but unfortunately there
still seems to be much demand for it.
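
One way to read "work backwards" is sketched below, as an assumption rather
than the actual DelphiZip routine. It handles only patterns of the form
'*suffix' with no other wildcards, compares from the last character towards
the front, and folds case one character at a time so that nothing is
converted after the first mismatch. EndsWithNoCase and MatchLeadingStar are
hypothetical names, and System.UpCase only folds ASCII letters.

```pascal
program SuffixMatchSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: True when Name ends with Suffix, ignoring ASCII case.
// Compares from the last character backwards and stops at the first mismatch.
function EndsWithNoCase(const Name, Suffix: string): Boolean;
var
  N, S: Integer;
begin
  N := Length(Name);
  S := Length(Suffix);
  Result := S <= N;
  while Result and (S > 0) do
  begin
    Result := UpCase(Name[N]) = UpCase(Suffix[S]);
    Dec(N);
    Dec(S);
  end;
end;

// Hypothetical helper: handles only patterns of the form '*suffix'.
function MatchLeadingStar(const Name, Pattern: string): Boolean;
begin
  Result := (Pattern <> '') and (Pattern[1] = '*') and
    EndsWithNoCase(Name, Copy(Pattern, 2, MaxInt));
end;

begin
  WriteLn(MatchLeadingStar('docs/ReadMe.TXT', '*.txt')); // a match
  WriteLn(MatchLeadingStar('docs/ReadMe.TXT', '*.pas')); // fails on the last character
end.
```

Since names in a zip tend to share long path prefixes, a mismatch usually
shows up within the first few characters examined from the end.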

At the moment I am looking at trimming the Mike Lischke Unicode20 unit down
to only what I need and converting it to C/C++ (or maybe using the Delphi
unit in the dll). The component will have similar needs, so this will help
there too, and it will also give better UTF-8 support on early Windows
versions.

Russell Peters

On Thu, Sep 30, 2010 at 8:57 AM, James Turner <james.d.h.turner@xxxxxxxxxxxx> wrote:

> Using the full uppercase file name to create a hash might seem quicker
> at first, but unless I'm missing something, it could actually work out
> slower since the entire file name has to be converted before performing
> a character by character comparison. On the other hand, a mismatch may
> occur in the first few characters meaning that fewer character
> conversions are required by not creating a hash. Of course, I am
> assuming that the search is only performed once. If you are attempting
> to add or extract many files there may be a speed advantage by
> pre-calculating a hash but I would still opt for code simplicity since
> the compression or decompression operation is likely to be much slower
> than the search operation anyway.
>
> To give you some idea of the speed of Windows string comparison
> operations, I can sort all the files in the Windows\System32 directory
> almost instantaneously - even with the quicksort algo, that's a heck of
> a lot of string comparisons. (I'm using a 3.5 year old Dell Inspiron,
> albeit with a dual-core AMD Turion CPU.)
>
> I use the following functions. I don't think they are suitable as they
> stand for your purposes but they might be helpful.
>
> -- James Turner
>
>
> const
>   LANG_INVARIANT             = $7F;
>   LOCALE_INVARIANT           = (SORT_DEFAULT shl 16) or
>                                (SUBLANG_NEUTRAL shl 10) or LANG_INVARIANT;
>   LOCALE_INVARIANT_BEFORE_XP = (SORT_DEFAULT shl 16) or
>                                (SUBLANG_ENGLISH_US shl 10) or LANG_ENGLISH;
>   { http://msdn2.microsoft.com/en-us/library/aa468951.aspx }
>   { The following code sample demonstrates the preferred way of
>     performing a locale-independent test on Windows 2000. }
>   { CompareString(MAKELCID(MAKELANGID(LANG_ENGLISH,SUBLANG_ENGLISH_US),SORT_DEFAULT),
>     ... }
>
> function _CompareStr(LOCALE:LCID; s1,s2:pChar; Max:integer):integer;
> { Set Max = -1 to scan to end of shorter string }
> { nil and #0 are treated equally                }
> begin
>  result := 0; if s1 = s2 then exit;
>
>  if s1 = nil then begin
>    if s2^ <> #0 then result := -1;   { nil sorts before a non-empty s2 }
>  end
>  else if s2 = nil then begin
>    if s1^ <> #0 then result := +1;   { a non-empty s1 sorts after nil }
>  end
>  else begin
>    if LOCALE = 0 then begin
>      if PlatformXP then LOCALE := LOCALE_INVARIANT
>                    else LOCALE := LOCALE_INVARIANT_BEFORE_XP;
>    end;
>
>    { CompareString returns 1, 2 or 3 (less, equal, greater); map to -1, 0, +1 }
>    result := CompareString(LOCALE,NORM_IGNORECASE,s1,Max,s2,Max) - 2;
>  end;
> end;
>
> function FileNameEqual(const s1,s2:string):Boolean;
> begin
>  result := _CompareStr(0,pointer(s1),pointer(s2),-1) = 0;
> end;
>
>

