[yunqa.de] Re: Access violation calling DIPerlRegEx.Match

  • From: Delphi Inspiration <delphi@xxxxxxxx>
  • To: yunqa@xxxxxxxxxxxxx
  • Date: Thu, 27 May 2010 22:09:03 +0200

At 02:22 27.05.2010, Jim Bretti wrote:

>I'm getting an access violation from DIPerlRegEx.Match when using the 
>following subject / match pattern:

I was able to reduce this to the following:

Subject: für
Pattern: \p{Zs}*\R

The problem appears with the umlaut letter 'ü' (but not 'ä') and requires both 
'\p{Zs}*' and '\R'. At a quick glance this looks like a PCRE problem. I will 
investigate further tomorrow.

Not just as a workaround but as the recommended way of handling umlauts with 
DIRegEx, you should set the coUtf8 compile-time option and pass both subject 
and pattern as UTF-8 encoded strings.

Doing so gets rid of the problem. More importantly, it guarantees that your 
umlaut characters are handled correctly, in particular that the Unicode 
character properties of non-ASCII characters can be determined properly.
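
A minimal sketch of that setup, adapted from the snippet quoted below. It is a 
fragment meant to sit inside a form method, like the original. The names 
coUtf8, SetSubjectStr, CompileMatchPattern and Match are taken from this 
thread; that the option is set through a set-typed CompileOptions property, and 
that both methods accept the result of the RTL function UTF8Encode, are 
assumptions to check against the DIRegEx documentation.

```pascal
// Sketch only, not verified against DIRegEx. Assumed: the CompileOptions
// property and the parameter types of SetSubjectStr / CompileMatchPattern.
RE := TDIPerlRegEx.Create(Self);
try
  // Switch on UTF-8 mode before the pattern is compiled.
  RE.CompileOptions := RE.CompileOptions + [coUtf8];

  // Pass subject and pattern as UTF-8 encoded strings.
  RE.SetSubjectStr(UTF8Encode(Memo1.Text));
  RE.CompileMatchPattern(UTF8Encode(Trim(Memo2.Text)));

  if RE.Match > 0 then
    ; // handle the match here
finally
  RE.Free;
end;
```

The option is set first so that it is already in effect when 
CompileMatchPattern runs.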

Thanks for the well-done demo project. I will keep you posted on any progress 
concerning this bug.

Ralf

>Subject:
>Das malerische Bild der Stadt ist durch sanfte Hügel geprägt. Sehr bekannt für 
>Oxford sind auch seine Rudermannschaften, die 
>jedes Jahr gegen die von Cambridge auf der Themse ins Rennen gehen.
>
>Match Pattern:
>(?<=[\pL\pN\pP])\p{Zs}*\R\p{Zs}*(?=[\pL\pN\pP])
>
>So the following code fails when memo1 and memo2 are loaded with the above 
>subject and match pattern:
>
>      RE :=  TDIPerlRegEx.Create(Self);
>      RE.SetSubjectStr(Memo1.Text);
>      RE.CompileMatchPattern(Trim(Memo2.Text));
>      If RE.Match > 0 then
>
>Seems like it might have something to do with umlauts in the subject (ü, ä) 
>together with the Zs property code in the match pattern.  Do I need to set any 
>compile / match options for these characters in the subject string?

_______________________________________________
Delphi Inspiration mailing list
yunqa@xxxxxxxxxxxxx
//www.freelists.org/list/yunqa


