RE: Regular Expression Question: How To Search For Section Titles

  • From: "Homme, James" <james.homme@xxxxxxxxxxxx>
  • To: "programmingblind@xxxxxxxxxxxxx" <programmingblind@xxxxxxxxxxxxx>
  • Date: Mon, 9 Aug 2010 14:40:04 -0400

Hi Jamal,
Thanks.

Jim

Jim Homme,
Usability Services,
Phone: 412-544-1810. Skype: jim.homme
Internal recipients,  Read my accessibility blog. Discuss accessibility here. 
Accessibility Wiki: Breaking news and accessibility advice


-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx 
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Jamal Mazrui
Sent: Monday, August 09, 2010 1:46 PM
To: programmingblind@xxxxxxxxxxxxx
Subject: RE: Regular Expression Question: How To Search For Section Titles

Try this for the search:

\n\n+([A-Z].*?[a-zA-Z0-9])\n+

and this for the replacement:

\n----------\n\f\n$1\n\n

Jamal
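
For anyone who wants to try the search and replacement above outside 
EdSharp, here is a minimal Python sketch. It assumes Python's re module 
rather than the .NET engine, so the $1 backreference is spelled \g<1>. 
The function name mark_titles and the sample text are made up for 
illustration only.

```python
import re

# The search pattern from the message above: a blank line (two or more
# newlines), then a line that starts with a capital letter and ends with
# a letter or digit, then one or more newlines.
TITLE = re.compile(r"\n\n+([A-Z].*?[a-zA-Z0-9])\n+")

# The EdSharp section break quoted in the thread, then the captured
# title and an extra newline. Python writes .NET's "$1" as "\g<1>".
REPLACEMENT = "\n----------\n\f\n\\g<1>\n\n"


def mark_titles(text: str) -> str:
    """Put a section break before each candidate title line."""
    return TITLE.sub(REPLACEMENT, text)


# Hypothetical sample: one candidate title after a blank line.
sample = "intro text.\n\nChapter 1\nbody text here.\n"
print(repr(mark_titles(sample)))
```

Because the period does not match a newline by default, the captured 
title always stays on a single line. A line that ends with punctuation, 
such as a normal sentence, is left alone.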

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx 
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Homme, James
Sent: Monday, August 09, 2010 7:15 AM
To: programmingblind@xxxxxxxxxxxxx
Subject: RE: Regular Expression Question: How To Search For Section Titles

Hi,
I've narrowed down what I want to match a little better now. I'd love to 
not have a block for this stuff. Sorry people. Anyway, here goes.

The first thing I want to match is two new lines.
Next is a line of text that starts with a capital letter.
The line can have anything else on it.
The line should end with a letter or a number.

Then I want to replace it with what I have found, but I want to put the 
EdSharp section break string before it and an extra new line after it.

A section break in EdSharp looks like this. \n----------\n\f\n

Thanks.

Jim





-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx 
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Dave
Sent: Saturday, August 07, 2010 3:17 PM
To: programmingblind@xxxxxxxxxxxxx
Subject: Re: Regular Expression Question: How To Search For Section Titles

Btw, depending on which engine you're using, your reg ex could just be 
^[A-Z].*$

(the period in a reg ex is usually interpreted to match any character 
besides a new line).  Thus, this reg ex just matches a line that starts 
with an upper case character followed by any number of characters. Note 
that it no longer requires the line to end with a letter or number.

If you instead wanted only alphanumeric characters, then you would 
probably try the pattern:
^[A-Z][A-Za-z0-9]*$

(the asterisk follows the character class and is called a quantifier.  The 
asterisk simply says that the pattern should have 0 or more characters 
from this set).  The ".*" you had in your pattern means 0 or more of any 
character.

It would then match:
A1234

Ca1a

or

Z
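
A quick way to check the effect of the quantifier is to run a few lines 
through Python's re module. This is only a sketch: the helper name 
matches and the extra test lines are made up, and other engines may 
differ in small details.

```python
import re

# A capital letter followed by zero or more letters or digits.
ALNUM_LINE = re.compile(r"^[A-Z][A-Za-z0-9]*$")
# A period with no quantifier: exactly one more character of any kind.
ONE_MORE = re.compile(r"^[A-Z].$")
# A period with an asterisk: any number of further characters.
ANY_REST = re.compile(r"^[A-Z].*$")


def matches(pattern: re.Pattern, line: str) -> bool:
    """Return True if the whole line fits the pattern."""
    return pattern.match(line) is not None


for line in ["A1234", "Ca1a", "Z", "Section One", "a1"]:
    print(line, matches(ALNUM_LINE, line))
```

The line "Section One" fails the alphanumeric pattern because of the 
space, which is why a section title with several words needs the 
period form instead.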

Hth.

On 8/7/10, Kerneels Roos <kerneels@xxxxxxxxx> wrote:
> Going out on a limb here regarding exact command line options, but you
> could use the `sed' command, the stream editor, to do this at the
> command prompt:
>
> $ sed -E 's/^([A-Z].*[A-Za-z0-9])$/\f\1/' in.txt > out.txt
>
> or:
>
> $ sed 's/^\([A-Z].*[A-Za-z0-9]\)$/\f\1/' in.txt > out.txt
>
> The first form uses extended syntax (-E); the second is for sed's
> default basic syntax, where `(' needs to be escaped. The single quotes
> keep the shell from touching the pattern. Not sure if `\f' would insert
> page breaks either -- might have to access the direct ASCII value, but
> anyways.
>
> The `s' in the sed regular expression pattern instructs sed that you
> want to do a substitution.
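
If sed is not handy, the same line-by-line substitution can be sketched 
in Python. This assumes Python's re module; the function name 
add_page_breaks is made up for illustration. The form feed that "\f" 
stands for is ASCII value 12.

```python
import re

# With re.MULTILINE, ^ and $ anchor at the start and end of every line,
# which is how sed sees its input: one line at a time.
TITLE_LINE = re.compile(r"^([A-Z].*[A-Za-z0-9])$", re.MULTILINE)


def add_page_breaks(text: str) -> str:
    """Insert a form feed before each line that looks like a title."""
    # "\f" is the form feed character; "\\1" re-inserts the captured line.
    return TITLE_LINE.sub("\f\\1", text)


# Hypothetical sample input.
sample = "Chapter 1\nsome text.\nlower case\n"
print(repr(add_page_breaks(sample)))
```

Unlike the search that requires a blank line first, this version marks 
every line that starts with a capital and ends with a letter or digit, 
so it will flag more false candidates.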
>
> On Fri, Aug 6, 2010 at 7:48 PM, Homme, James
> <james.homme@xxxxxxxxxxxx>wrote:
>
>> Hi,
>> Maybe EdSharp uses .Net regular expressions, and maybe they are
>> different from Perl regular expressions. I was trying to use $1 to
>> capture and replace, but it was literally inserting $1. I was trying
>> to put \f before $1 in the replacement expression. I'm attempting to
>> find what it thinks might be titles and put a page break before them
>> so that I can simply look through the document and spot check to see
>> if the lines are really titles rather than read the whole thousand
>> pages and find them all by hand.
>>
>> Thanks.
>>
>> Jim
>>
>>
>>
>> -----Original Message-----
>> From: programmingblind-bounce@xxxxxxxxxxxxx [mailto:
>> programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Jim Bauer
>> Sent: Friday, August 06, 2010 1:25 PM
>> To: programmingblind@xxxxxxxxxxxxx
>> Subject: Re: Regular Expression Question: How To Search For Section
>> Titles
>>
>> It does, just not inside a character class. If you wanted to match
>> something from one of several character classes using `|', you would
>> do something like:
>> ----------
>> [a-z]|[A-Z]|[...]
>> ----------
>> But you can just spell out everything you want to match in a single
>> character class, so I don't see that as particularly useful.
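
The difference can be checked with a small Python sketch. It assumes 
Python's re module, and the helper name accepts is made up for 
illustration. The first pattern is the bracket expression from the 
original question.

```python
import re

# Inside brackets, | is an ordinary character, not "or".
WITH_BARS = re.compile(r"[A-Z|a-z|1-9]")
# Alternation between separate character classes...
ALTERNATION = re.compile(r"[a-z]|[A-Z]|[0-9]")
# ...accepts the same characters as one combined class.
COMBINED = re.compile(r"[A-Za-z0-9]")


def accepts(pattern: re.Pattern, ch: str) -> bool:
    """Return True if the single character ch fits the pattern."""
    return pattern.fullmatch(ch) is not None


for ch in ["|", "0", "7", "q"]:
    print(repr(ch), accepts(WITH_BARS, ch), accepts(COMBINED, ch))
```

The class with the bars accepts a literal `|' and, because it says 1-9, 
rejects the digit 0.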
>>
>> On Fri, 6 Aug 2010 12:48:12 -0400, Homme, James wrote:
>> > Hi,
>> > I'm misusing the vertical bar. I thought it created an or condition.
>> >
>> > Jim
>> >
>> >
>> >
>> > -----Original Message-----
>> > From: programmingblind-bounce@xxxxxxxxxxxxx
>> > [mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Jim Bauer
>> > Sent: Friday, August 06, 2010 10:36 AM
>> > To: programmingblind@xxxxxxxxxxxxx
>> > Subject: Re: Regular Expression Question: How To Search For Section
>> > Titles
>> >
>> > You're including `|' as a literal character in your last character
>> > class, rather than expressing "uppercase letters or lowercase
>> > letters or digits". This means something like `This is a test|'
>> > will match, which, of course, is fine if that's what you're
>> > intending. :) Also note that 1-9 leaves out 0.
>> >
>> > ----------
>> > ^[A-Z].+[A-Za-z0-9]$
>> > ----------
>> >
>> > On Fri, 6 Aug 2010 09:42:55 -0400, Homme, James wrote:
>> > > Hi,
>> > > How would you construct a regular expression that looks for the
>> > > first letter of any line in upper case followed by the rest of
>> > > the line as long as it ends with a letter or number?  Would it be
>> > > something like this?
>> > > ^[A-Z].*[A-Z|a-z|1-9]$
>> > >
>> > > Thanks.
>> > >
>> > > Jim
>> > >
>> > >
>> > >
>> >
>> > __________
>> > View the list's information and change your settings at
>> > //www.freelists.org/list/programmingblind
>> >
>>
>>
>
>
> --
> Kerneels Roos
> Cell/SMS: +27 (0)82 309 1998
> Skype: cornelis.roos
>
> The early bird may get the worm, but the second mouse gets the cheese!
>