[yunqa.de] Re: Access violation calling DIPerlRegEx.Match

From: Jim Bretti <jim@xxxxxxxxxx>
To: yunqa@xxxxxxxxxxxxx
Date: Fri, 28 May 2010 12:24:09 -0400

Hi Ralf, thanks for the update.

On the UTF-8 recommendation: are you saying I need to use coUtf8 whenever my subject/match string contains ordinal values > 127? I'm using the Unicode / UTF-8 options only when necessary, since I seem to get better performance when I don't go through the UTF-8 encoding and character counting.

Jim

Delphi Inspiration wrote:

At 22:09 27.05.2010, Delphi Inspiration wrote:

At 02:22 27.05.2010, Jim Bretti wrote:

I'm getting an access violation from DIPerlRegEx.Match when using the following subject / match pattern:

I could reduce this to the following:

Subject: für
Pattern: \p{Zs}*\R

The problem shows for the Umlaut letter 'ü' (but not 'ä') and requires both '\p{Zs}*' and '\R'. At a quick glance this looks like a PCRE problem. I will investigate further tomorrow.

Code analysis revealed that this is indeed a problem in PCRE. By mistake, it always tests Unicode Character Properties (the '\p{Zs}' part in your pattern) against UTF-8, even if coUtf8 is not set. The 'ü' Umlaut is therefore interpreted as the beginning of a UTF-8 sequence and results in an invalid character, which finally leads to the access violation.
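The byte-level effect described above can be illustrated outside of PCRE. The following is a minimal sketch in plain Python, not DIRegEx or PCRE code. It assumes the subject is held in a single-byte encoding such as Latin-1, and the helper name utf8_sequence_length is hypothetical. It shows which sequence length a decoder would infer from a lead byte under the original (up to 6-byte) UTF-8 scheme. That 'ü' implies a longer sequence than 'ä' may be related to why only 'ü' triggers the crash, but the thread does not confirm this.

```python
# Illustration only: plain Python, not DIRegEx or PCRE code.

def utf8_sequence_length(lead_byte: int) -> int:
    """Sequence length implied by a lead byte under the original
    (up to 6-byte) UTF-8 scheme. Returns 0 if the byte cannot start
    a sequence (continuation byte, 0xFE or 0xFF)."""
    if lead_byte < 0x80:
        return 1          # plain ASCII
    if lead_byte < 0xC0:
        return 0          # continuation byte, not a lead byte
    if lead_byte < 0xE0:
        return 2
    if lead_byte < 0xF0:
        return 3
    if lead_byte < 0xF8:
        return 4
    if lead_byte < 0xFC:
        return 5
    if lead_byte < 0xFE:
        return 6
    return 0

# Assumption: the subject is stored as Latin-1, one byte per character.
subject = "für".encode("latin-1")        # 3 bytes: 'f', 0xFC, 'r'
u_umlaut = "ü".encode("latin-1")[0]      # 0xFC
a_umlaut = "ä".encode("latin-1")[0]      # 0xE4

implied_for_u = utf8_sequence_length(u_umlaut)   # longer than the 3-byte subject
implied_for_a = utf8_sequence_length(a_umlaut)

# The same bytes are not valid UTF-8, so a strict decoder rejects them.
try:
    subject.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False
```

Here a reader that trusts the lead byte 0xFC would expect more bytes than the subject contains, while a strict decoder simply refuses the input.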

The workaround I suggested yesterday is safe and the recommended way to handle Umlaut characters with DIRegEx: Set the coUtf8 compile time option and encode both pattern and subject as UTF-8 before passing them to TDIRegEx methods.
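The encoding step of this workaround can be sketched as follows. This is plain Python for illustration only, not the DIRegEx API; the helper name to_utf8 is hypothetical, and setting coUtf8 and calling the TDIRegEx methods are not shown. It only demonstrates what "encode both pattern and subject as UTF-8" does to the bytes from the reduced test case.

```python
# Illustration only: plain Python, not the DIRegEx API.

def to_utf8(text: str) -> bytes:
    """Encode text as UTF-8 (hypothetical helper for this sketch)."""
    return text.encode("utf-8")

pattern_text = r"\p{Zs}*\R"   # pattern from the reduced test case
subject_text = "für"          # subject from the reduced test case

pattern_utf8 = to_utf8(pattern_text)
subject_utf8 = to_utf8(subject_text)

# The pattern is pure ASCII, so its UTF-8 bytes equal its ASCII bytes.
pattern_unchanged = pattern_utf8 == pattern_text.encode("ascii")

# 'ü' becomes the two-byte sequence 0xC3 0xBC, so the byte count (4)
# no longer equals the character count (3).
char_count = len(subject_text)
byte_count = len(subject_utf8)
```

The difference between char_count and byte_count is the reason match offsets in UTF-8 mode need character counting once the subject contains non-ASCII characters.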

I have forwarded the problem to the PCRE mailing list and will update DIRegEx when a fix becomes available.

Ralf 

_______________________________________________
Delphi Inspiration mailing list
yunqa@xxxxxxxxxxxxx
//www.freelists.org/list/yunqa
