Hi Ralf, thanks for the update.

On the UTF-8 recommendation: are you saying I need to use coUtf8 whenever my subject/match string contains ordinal values > 127? I use the Unicode / UTF-8 options only when necessary, since I seem to get better performance when I don't go through the UTF-8 encoding and character counting.

Jim

Delphi Inspiration wrote:
> At 22:09 27.05.2010, Delphi Inspiration wrote:
>> At 02:22 27.05.2010, Jim Bretti wrote:
>>> I'm getting an access violation from DIPerlRegEx.Match when using the following subject / match pattern:
>>
>> I could reduce this to the following:
>>
>>   Subject: für
>>   Pattern: \p{Zs}*\R
>>
>> The problem shows for the umlaut letter 'ü' (but not 'ä') and requires both '\p{Zs}*' and '\R'. At a quick glance this looks like a PCRE problem. I will investigate further tomorrow.
>
> Code analysis revealed that this is indeed a problem in PCRE. By mistake, it always tests Unicode character properties (the '\p{Zs}' part in your pattern) against UTF-8, even if coUtf8 is not set. The 'ü' umlaut is therefore interpreted as the beginning of a UTF-8 sequence and results in an invalid character, which finally leads to the access violation.
>
> The workaround I suggested yesterday is safe and the recommended way to handle umlaut characters with DIRegEx: set the coUtf8 compile-time option and encode both pattern and subject as UTF-8 before passing them to TDIRegEx methods.
>
> I have forwarded the problem to the PCRE mailing list and will update DIRegEx when a fix becomes available.
>
> Ralf

_______________________________________________
Delphi Inspiration mailing list
yunqa@xxxxxxxxxxxxx
//www.freelists.org/list/yunqa
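
For anyone following the thread, here is a rough, language-neutral sketch (in Python, not Delphi, and not using DIRegEx or PCRE at all) of why a single-byte 'ü' is a problem when it gets read as UTF-8. It assumes the non-UTF-8 strings are in a single-byte ANSI code page such as Latin-1 / Windows-1252, where 'ü' is byte 0xFC and 'ä' is byte 0xE4. The helper names `utf8_lead_length` and `is_valid_utf8` are made up for this illustration. That the differing lead-byte lengths explain why 'ü' crashes and 'ä' does not is only a guess; the thread itself does not confirm it.

```python
def utf8_lead_length(byte: int) -> int:
    """Sequence length announced by a lead byte under the original
    (up to 6-byte) UTF-8 scheme; 0 if the byte cannot start a sequence."""
    if byte < 0x80:
        return 1          # plain ASCII
    if byte < 0xC0:
        return 0          # continuation byte, not a lead byte
    if byte < 0xE0:
        return 2
    if byte < 0xF0:
        return 3
    if byte < 0xF8:
        return 4
    if byte < 0xFC:
        return 5
    if byte < 0xFE:
        return 6
    return 0              # 0xFE and 0xFF never occur in UTF-8


def is_valid_utf8(data: bytes) -> bool:
    """True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


subject = "für"
latin1 = subject.encode("latin-1")  # assumed ANSI encoding: b'f\xfcr'
utf8 = subject.encode("utf-8")      # what the workaround passes: b'f\xc3\xbcr'

# Read as UTF-8, the lone 0xFC announces a 6-byte sequence in a 3-byte string.
print(utf8_lead_length(latin1[1]), is_valid_utf8(latin1))
# Properly encoded, 'ü' is the well-formed 2-byte sequence C3 BC.
print(utf8_lead_length(utf8[1]), is_valid_utf8(utf8))
```

Under these assumptions, a decoder that trusts the lead byte would look for five continuation bytes after 0xFC, which is more than the subject "für" contains, whereas 0xE4 ('ä') only announces two more bytes. Encoding pattern and subject as UTF-8 first, as Ralf recommends, avoids the misreading altogether.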