[tssg-tech] Re: <link> "Working with XML on Android"

  • From: "Rob" <robertdionne@xxxxxxxxxxx>
  • To: <tssg-tech@xxxxxxxxxxxxx>
  • Date: Tue, 28 Sep 2010 17:49:56 -0400

Regular Expression Library

 

Regarding regular expressions for extracting time and date information and
for handling markup code:

 

 

There are single regular expression statements that can identify numerous
time and date formats at the following site:
http://regexlib.com/DisplayPatterns.aspx

 

There is a separate section on this site dedicated to markup code.

 

I was thinking that regular expressions may be helpful in identifying dates
and times that occur within the description portion of the events.
RegEx-identified times could be confirmed by the user as being a start or
end time, etc., and then extracted programmatically to be entered into the
local BEL calendar.  The goal of the regEx would be to identify any
reasonably formatted time and date and to convert it to match the "calendar"
interface methods.  I see it working much like a spell checker, scanning
text and prompting the user when it finds matches.
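
A minimal sketch of the scanning step, assuming US-style numeric dates
(M/D/YY or M/D/YYYY) and 12-hour times with an am/pm suffix. The class and
method names are hypothetical, and the two patterns are far simpler than the
ones collected on regexlib.com:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds date and time candidates in an event description.
final class DateTimeScanner {
    // Assumed date shape: 9/28/2010 or 9/28/10 (month/day/year).
    private static final String DATE =
            "\\b(?:1[0-2]|0?[1-9])/(?:3[01]|[12]\\d|0?[1-9])/(?:\\d{4}|\\d{2})\\b";
    // Assumed time shape: 7pm, 7:30 PM, 10:00 am (12-hour clock only).
    private static final String TIME =
            "\\b(?:1[0-2]|0?[1-9])(?::[0-5]\\d)?\\s?[AaPp][Mm]\\b";
    private static final Pattern CANDIDATE = Pattern.compile(DATE + "|" + TIME);

    // Returns every candidate in the order it appears in the text.
    static List<String> findCandidates(String description) {
        List<String> found = new ArrayList<String>();
        Matcher m = CANDIDATE.matcher(description);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }
}
```

Each returned string would then be shown to the user for confirmation as a
start or end time before anything is written to the calendar.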

 

I don't see this as a clean way to obtain event details, but also don't see
another approach at the moment.

 

Rob

 

 

  _____  

From: tssg-tech-bounce@xxxxxxxxxxxxx [mailto:tssg-tech-bounce@xxxxxxxxxxxxx]
On Behalf Of Beatrice W. Chaney
Sent: Tuesday, September 28, 2010 5:10 PM
To: tssg-tech@xxxxxxxxxxxxx
Subject: [tssg-tech] Re: <link> "Working with XML on Android"

 

A few explanations:
1. The http://validator.w3.org/ site is run by the W3C (World Wide Web
Consortium), the standards organization that manages the HTML and XML
industry standards, as a service to web developers.
To validate a site, just click the above link, enter the URL of the site
you would like to validate (in this case http://www.bostoneventslist.com/),
and click on 'Check'. It comes up with 70 errors.
To see what those errors mean, go to http://www.bostoneventslist.com/ and
choose the menu View -> Page Source. This displays the actual HTML generated
by the site, and you can see the errors (the validator gives line numbers).
Most of the validation complaints are about non-conformant XHTML syntax (its
header specifies 'strict'), but some are mismatched end tags, such as
</script> end tags found without a preceding <script> tag.

2. Unfortunately, the fact that the RSS validates (it does) does not mean
that the content validates. The RSS format just wraps the content with
all the < and >, etc. converted to &lt; and &gt; (the control characters
are 'escaped'), precisely so that the feed is not thrown off if the content
is invalid. RSS feeds must validate.
To get the XML format of the content fragments, we first have to run them
through a 'regular expression'
(http://en.wikipedia.org/wiki/Regular_expression) that replaces the &lt;,
&gt;, etc. with < and > again, and then try to parse these fragments
as XML.

3. To view the RSS XML, just enter the URL:
http://www.bostoneventslist.com/rss.xml in your browser.

4. The fact that the http://www.bostoneventslist.com/  does not validate is
not a direct cause of a potential issue with the content items format, as it
appears content items are generated dynamically (do not show up in the
source). So, we still need to determine whether the unescaped content items
are well-formed, and if not 'tweak' them to be well-formed. While RSS is
guaranteed to validate, I don't believe we can rely on content (that is, the
<description></description> elements) being well-formed.
TODO: write or find an 'unescape' regular expression.
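
One possible shape for that 'unescape' step, as a sketch only: a single
regular-expression pass that decodes the five predefined XML entities plus
numeric character references. The class and method names are hypothetical.
Note that a conforming XML parser already returns the text of a
<description> element unescaped, so this may only be needed when working on
the raw feed text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns &lt;p&gt; back into <p> in one pass.
final class EntityUnescaper {
    private static final Pattern ENTITY =
            Pattern.compile("&(lt|gt|amp|quot|apos|#[0-9]{1,7}|#x[0-9a-fA-F]{1,6});");

    static String unescape(String escaped) {
        Matcher m = ENTITY.matcher(escaped);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' in the decoded text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(decode(m.group(1))));
        }
        m.appendTail(out);
        return out.toString();
    }

    private static String decode(String name) {
        if (name.equals("lt")) return "<";
        if (name.equals("gt")) return ">";
        if (name.equals("amp")) return "&";
        if (name.equals("quot")) return "\"";
        if (name.equals("apos")) return "'";
        // Numeric reference: "#65" (decimal) or "#x41" (hexadecimal).
        int codePoint = name.startsWith("#x")
                ? Integer.parseInt(name.substring(2), 16)
                : Integer.parseInt(name.substring(1));
        if (!Character.isValidCodePoint(codePoint)) {
            return "&" + name + ";"; // leave invalid references untouched
        }
        return new String(Character.toChars(codePoint));
    }
}
```

Because the text is scanned once, "&amp;lt;" becomes "&lt;" rather than "<",
which avoids unescaping twice. HTML-only entities such as &nbsp; are left as
they are.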

Bea

Jim Cant wrote:



Hey, good news!

How did you validate it?

jim

  _____  

Date: Tue, 28 Sep 2010 11:11:08 -0700
From: jcarwellos@xxxxxxxxx
Subject: [tssg-tech] Re: <link> "Working with XML on Android"
To: tssg-tech@xxxxxxxxxxxxx


BTW, the RSS validated with no errors.

  _____  

Julie (Dingee) Carwellos
Web and IT Project Analyst, User Experience and Interaction Designer
LinkedIn <http://www.linkedin.com/in/jdingeecarwellos>  -
http://www.linkedin.com/in/jdingeecarwellos

--- On Tue, 9/28/10, Julie Carwellos <jcarwellos@xxxxxxxxx> wrote:


From: Julie Carwellos <jcarwellos@xxxxxxxxx>
Subject: [tssg-tech] Re: <link> "Working with XML on Android"
To: tssg-tech@xxxxxxxxxxxxx
Date: Tuesday, September 28, 2010, 6:06 PM


Bea,

It isn't; I get a consistent 11 errors for each single-event web page (using
FireBug to validate HTML).

Additionally, each event page is styled with TABLEs, rather than floating
DIVs, so we can't use a handheld.css style sheet to load the URL into a
WebView and have only the event information display (using display:none; for
the outer columns). 

-julie

  _____  


--- On Tue, 9/28/10, Beatrice W. Chaney <bwchaney@xxxxxxxx> wrote:


From: Beatrice W. Chaney <bwchaney@xxxxxxxx>
Subject: [tssg-tech] Re: <link> "Working with XML on Android"
To: tssg-tech@xxxxxxxxxxxxx
Date: Tuesday, September 28, 2010, 4:18 PM

Hi,
I suspect (but haven't verified) that the BostonEventsList data might
not be well-formed.
I ran the site through the W3C validator http://validator.w3.org/ some time
ago (and again now), and it comes up with a number of errors. Having a site
be valid XHTML is a critical prerequisite to getting on top of Google's list.

If this is the case (first, need to verify that well-formedness is really
the problem) there are tidy-up utilities available, but we'd have to see
whether they are suitable for Android. 
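
A rough way to verify well-formedness before reaching for a tidy-up utility,
sketched with the standard javax.xml.parsers API. The class name and the
<root> wrapper element are assumptions for illustration; the wrapper is there
because a content fragment may have several top-level elements:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether an unescaped fragment parses as XML.
final class WellFormedCheck {
    static boolean isWellFormed(String fragment) {
        // Wrap the fragment so that several sibling elements still form one document.
        String wrapped = "<root>" + fragment + "</root>";
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(wrapped)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed (mismatched tags, undefined entities, ...)
        } catch (IOException e) {
            return false; // not expected when reading from a String
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Fragments that fail this check are the ones a tidy-up utility would need to
repair. HTML-only entities such as &nbsp; also count as failures here,
because a plain XML parser does not define them.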

Thanks,
Bea

Harry Henriques wrote:

Hello,

I think Bea referenced the IBM website regarding RSS parser alternatives.  I
downloaded the application from the website and massaged the files.  I was
able to get the application to build an apk and load it into the Android
Emulator.  The application is partially working, but I could use some help
debugging it.  The application doesn't parse the BostonEventsList.  For some
reason, it stops before displaying a ListView.

I delivered the work I have finished to the SVN Repository in an Android
project called MessageList.

I will continue to work on it as time permits.  I've only just begun to
fight.

Regards,
Harry Henriques
Java Developer

