Try this for the search:
\n\n+([A-Z].*?[a-zA-Z0-9])\n+
and this for the replacement:
\n----------\n\f\n$1\n\n

Jamal

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx [mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Homme, James
Sent: Monday, August 09, 2010 7:15 AM
To: programmingblind@xxxxxxxxxxxxx
Subject: RE: Regular Expression Question: How To Search For Section Titles

Hi,
I've narrowed down what I want to match a little better now. I'd love to not have a block for this stuff. Sorry people. Anyway, here goes.
The first thing I want to match is two new lines. Next is a line of text that starts with a capital letter. The line can have anything else on it. The line should end with a letter or a number.

Then I want to replace it with what I have found, but I want to put the EdSharp section break string before it and an extra new line after it.
A section break in EdSharp looks like this.
\n----------\n\f\n

Thanks.

Jim

Jim Homme,
Usability Services,
Phone: 412-544-1810.
Skype: jim.homme
Internal recipients, Read my accessibility blog. Discuss accessibility here. Accessibility Wiki: Breaking news and accessibility advice
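The search and replacement described in this message can also be tried outside EdSharp. What follows is only a minimal Python sketch, assuming Python's `re` module behaves closely enough to EdSharp's engine for this pattern. The name `mark_titles` and the sample text are made up for illustration. Python does not use `$1` in replacement strings, so the sketch builds the replacement in a function instead.

```python
import re

# Section break string as quoted in the thread: newline, a row of hyphens,
# newline, form feed, newline.
SECTION_BREAK = "\n----------\n\f\n"

# The suggested search pattern: at least two newlines, then a line that
# starts with a capital letter and ends with a letter or digit, then
# one or more newlines.
TITLE = re.compile(r"\n\n+([A-Z].*?[a-zA-Z0-9])\n+")


def mark_titles(text: str) -> str:
    """Put a section break before each candidate title line (hypothetical helper)."""
    # m.group(1) plays the role of $1 in the thread's replacement string.
    return TITLE.sub(lambda m: SECTION_BREAK + m.group(1) + "\n\n", text)


sample = "end of a paragraph.\n\nChapter 2\nBody text goes here.\n"  # made-up text
print(repr(mark_titles(sample)))
```

Each candidate title then sits right after a form feed, so the candidates can be spot checked one at a time.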
-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx [mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Dave
Sent: Saturday, August 07, 2010 3:17 PM
To: programmingblind@xxxxxxxxxxxxx
Subject: Re: Regular Expression Question: How To Search For Section Titles

Btw, your reg ex, depending on which engine you're using, could just be ^[A-Z].*$
(the period in your reg ex could be interpreted to match any character besides a new line). Thus, this reg ex just matches a line that starts with an upper case character followed by any number of characters.
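As a rough check of that reading, here is a minimal Python sketch. It assumes Python's `re` module, which may differ in detail from the engine EdSharp uses, and the sample text is made up. The pattern is written as `^[A-Z].*$`, meaning an upper-case letter followed by any number of characters.

```python
import re

# A line that starts with an upper-case letter, followed by any number of
# characters. re.MULTILINE makes ^ and $ anchor at every line, and "."
# does not cross a newline by default.
LINE = re.compile(r"^[A-Z].*$", re.MULTILINE)

sample = "Introduction\nlower-case start\nA\nEnds with a period.\n"  # made-up text
matches = LINE.findall(sample)
print(matches)  # ['Introduction', 'A', 'Ends with a period.']
```

Note that the last sample line matches even though it ends with a period, because this pattern puts no condition on the final character.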
If you instead wanted only alphanumeric characters, then you would probably try the pattern:
^[A-Z][A-Za-z0-9]*$
(the asterisk follows the character set and is called a quantifier. The asterisk simply says that the pattern should have 0 or more of this set). I'm not sure what the ".*" you had in your pattern would have resulted in.
It would then match:
A1234
Ca1a
or
Z

Hth.

On 8/7/10, Kerneels Roos <kerneels@xxxxxxxxx> wrote:
Going out on a limb here regarding exact command line options, but you could use the `sed' command, the stream editor, to do this at the command prompt:
$ sed 's/(^[A-Z].*[A-Za-z0-9]$)/\f\1/g' in.txt > out.txt

or:

$ sed 's/\(^[A-Z].*[A-Za-z0-9]$\)/\f\1/g' in.txt > out.txt

if `(' needs to be escaped. Not sure if `\f' would insert page breaks either -- might have to access the direct ASCII value, but anyways.

The `s' in the sed regular expression pattern instructs sed that you want to do a substitution.

On Fri, Aug 6, 2010 at 7:48 PM, Homme, James <james.homme@xxxxxxxxxxxx> wrote:

Hi,
Maybe EdSharp uses .Net regular expressions, and maybe they are different from Perl regular expressions. I was trying to use $1 to capture and replace, but it was literally inserting $1. I was trying to put \f before $1 in the replacement expression. I'm attempting to find what it thinks might be titles and put a page break before them so that I can simply look through the document and spot check to see if the lines are really titles rather than read the whole thousand pages and find them all by hand.

Thanks.

Jim

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx [mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Jim Bauer
Sent: Friday, August 06, 2010 1:25 PM
To: programmingblind@xxxxxxxxxxxxx
Subject: Re: Regular Expression Question: How To Search For Section Titles

It does, just not inside a character class. If you wanted to match something from one of several character classes using `|', you would do something like:

----------
[a-z]|[A-Z]|[...]
----------

But you can just spell out everything you want to match in a single character class, so I don't see that as particularly useful.

On Fri, 6 Aug 2010 12:48:12 -0400, Homme, James wrote:
> Hi,
> I'm misusing the vertical bar. I thought it created an or condition.
>
> Jim
>
> -----Original Message-----
> From: programmingblind-bounce@xxxxxxxxxxxxx [mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Jim Bauer
> Sent: Friday, August 06, 2010 10:36 AM
> To: programmingblind@xxxxxxxxxxxxx
> Subject: Re: Regular Expression Question: How To Search For Section Titles
>
> You're including `|' in your last character class, not matching
> uppercase letters or lowercase letters or digits. This means something
> like `This is a test|' will match, which, of course, is fine if that's
> what you're intending. :)
>
> ----------
> ^[A-Z].+[A-Za-z0-9]$
> ----------
>
> On Fri, 6 Aug 2010 09:42:55 -0400, Homme, James wrote:
> > Hi,
> > How would you construct a regular expression that looks for the
> > first letter of any line in upper case followed by the rest of the
> > line as long as it ends with a letter or number? Would it be
> > something like this?
> > ^[A-Z].*[A-Z|a-z|1-9]$
> >
> > Thanks.
> >
> > Jim
>
> __________
> View the list's information and change your settings at
> //www.freelists.org/list/programmingblind

--
Kerneels Roos
Cell/SMS: +27 (0)82 309 1998
Skype: cornelis.roos

The early bird may get the worm, but the second mouse gets the cheese!
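The point about `|` inside a character class, made in the quoted messages above, can be checked with a short sketch. This is only an illustration in Python's `re` module, assumed here to treat character classes the same way as the engines discussed in the thread. The sample lines are made up, apart from `This is a test|`, which comes from the thread. The sketch also shows a second difference between the two patterns: the range `1-9` leaves out the digit 0.

```python
import re

# The pattern from the original question: inside [...] the "|" is an
# ordinary character, and 1-9 does not include the digit 0.
with_bar = re.compile(r"^[A-Z].*[A-Z|a-z|1-9]$")
# The corrected pattern given in the reply.
corrected = re.compile(r"^[A-Z].+[A-Za-z0-9]$")

for line in ["This is a test|", "This is a test", "Section 10", "Ends with a period."]:
    # Columns: the line, match with the original pattern, match with the corrected one.
    print(repr(line), bool(with_bar.search(line)), bool(corrected.search(line)))
```

The first pattern accepts the line ending in `|` and rejects `Section 10`, while the corrected pattern does the opposite in both cases.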