[tssg-tech] Re: <link> "Working with XML on Android"

  • From: "Beatrice W. Chaney" <bwchaney@xxxxxxxx>
  • To: tssg-tech@xxxxxxxxxxxxx
  • Date: Tue, 28 Sep 2010 17:09:59 -0400

A few explanations:
1. The http://validator.w3.org/ site is run by the W3C (World Wide Web Consortium), the standards organization that manages the HTML and XML standards, as a service to web developers. To validate a site, go to the above link, enter the URL of the site you would like to validate (in this case http://www.bostoneventslist.com/), and click 'Check'. It comes up with 70 errors. To see what those errors mean, go to http://www.bostoneventslist.com/ and choose the menu View -> Page Source. This displays the actual HTML generated by the site, and you can find the errors there (the validator gives line numbers). Most of the validation complaints are about non-conformant XHTML syntax (the page's header specifies 'strict'), but some are mismatched end tags, such as a </script> end tag found without a preceding <script> tag.

2. Unfortunately, the fact that the RSS validates (it does) does not mean that the content validates. The RSS format just wraps the content with all the <, >, etc. converted to &lt;, &gt;, etc. (the markup characters are 'escaped'), precisely so that the feed is not thrown off if the content is invalid; the feed itself must validate regardless. To get the XML form of the content fragments, we first have to run them through a 'regular expression' (http://en.wikipedia.org/wiki/Regular_expression) that replaces the &lt;, &gt;, etc. with < and > again, and then try to parse these fragments as XML.

3. To view the RSS XML, just enter the URL: http://www.bostoneventslist.com/rss.xml in your browser.

4. The fact that http://www.bostoneventslist.com/ does not validate does not directly imply a problem with the format of the content items, as it appears the content items are generated dynamically (they do not show up in the page source). So we still need to determine whether the unescaped content items are well-formed and, if not, 'tweak' them to be well-formed. While the RSS feed itself validates, I don't believe we can rely on the content (that is, the <description></description> elements) being well-formed.
TODO: write or find an 'unescape' regular expression.
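
As a starting point for that TODO, here is a rough Java sketch of the two steps described in point 2: unescape a fragment with a regular expression, then try to parse it as XML. The class name FragmentCheck and both method names are made up for illustration, and it only handles the five predefined XML entities. HTML-only entities such as &nbsp; would still make the parse fail, and a fragment that carries its own XML declaration or DOCTYPE would not survive being wrapped in a dummy root element. It uses only java.util.regex and javax.xml.parsers, which I would expect to be available on Android, but that has not been checked against our project.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper; the class and method names are illustrative only.
class FragmentCheck {
    // Matches the five predefined XML entities in a single pass, so that
    // "&amp;lt;" becomes "&lt;" rather than being unescaped twice to "<".
    private static final Pattern ENTITY =
            Pattern.compile("&(lt|gt|amp|quot|apos);");

    private static final Map<String, String> REPLACEMENTS =
            new HashMap<String, String>();
    static {
        REPLACEMENTS.put("lt", "<");
        REPLACEMENTS.put("gt", ">");
        REPLACEMENTS.put("amp", "&");
        REPLACEMENTS.put("quot", "\"");
        REPLACEMENTS.put("apos", "'");
    }

    // Replaces &lt;, &gt;, &amp;, &quot; and &apos; with the characters they stand for.
    static String unescape(String escaped) {
        Matcher m = ENTITY.matcher(escaped);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = REPLACEMENTS.get(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Wraps the fragment in a dummy root element and reports whether an
    // XML parser accepts it. Any parse failure counts as "not well-formed".
    static boolean isWellFormed(String fragment) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(
                    new StringReader("<root>" + fragment + "</root>")));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

For each <description> we would call FragmentCheck.isWellFormed(FragmentCheck.unescape(text)) and only hand the fragment to an XML parser when that returns true; anything else would need a tidy-up pass first. Note that if the feed is read with an XML parser, the text it returns for <description> may already be unescaped, in which case the first step should be skipped.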

Bea

Jim Cant wrote:

Hey, good news!

How did you validate it?

jim

------------------------------------------------------------------------
Date: Tue, 28 Sep 2010 11:11:08 -0700
From: jcarwellos@xxxxxxxxx
Subject: [tssg-tech] Re: <link> "Working with XML on Android"
To: tssg-tech@xxxxxxxxxxxxx

BTW, the RSS validated with no errors.

------------------------------------------------------------------------
Julie (Dingee) Carwellos
Web and IT Project Analyst, User Experience and Interaction Designer
LinkedIn <http://www.linkedin.com/in/jdingeecarwellos> - http://www.linkedin.com/in/jdingeecarwellos

--- On Tue, 9/28/10, Julie Carwellos <jcarwellos@xxxxxxxxx> wrote:


    From: Julie Carwellos <jcarwellos@xxxxxxxxx>
    Subject: [tssg-tech] Re: <link> "Working with XML on Android"
    To: tssg-tech@xxxxxxxxxxxxx
    Date: Tuesday, September 28, 2010, 6:06 PM

    Bea,

    It isn't; I get a consistent 11 errors for each single-event web
    page (using FireBug to validate HTML).

    Additionally, each event page is styled with TABLEs, rather than
    floating DIVs, so we can't use a handheld.css style sheet to load
    the URL into a WebView and have only the event information display
    (using display:none; for the outer columns).

    -julie

    ------------------------------------------------------------------------
    Julie (Dingee) Carwellos
    Web and IT Project Analyst, User Experience and Interaction Designer
    LinkedIn <http://www.linkedin.com/in/jdingeecarwellos> -
    http://www.linkedin.com/in/jdingeecarwellos

    --- On Tue, 9/28/10, Beatrice W. Chaney <bwchaney@xxxxxxxx> wrote:


        From: Beatrice W. Chaney <bwchaney@xxxxxxxx>
        Subject: [tssg-tech] Re: <link> "Working with XML on Android"
        To: tssg-tech@xxxxxxxxxxxxx
        Date: Tuesday, September 28, 2010, 4:18 PM

        Hi,
        I suspect (but haven't verified) that the BostonEventsList
        data might not be well-formed.
        I ran the site through the W3 validator
        http://validator.w3.org/ some time ago (and again now), and it
        comes up with a number of errors. Having a site be valid XHTML
        is a critical prerequisite to getting on top of Google's list.

        If this is the case (first, need to verify that
        well-formedness is really the problem) there are tidy-up
        utilities available, but we'd have to see whether they are
        suitable for Android.

        Thanks,
        Bea

        Harry Henriques wrote:

            Hello,

            I think Bea referenced the IBM website regarding RSS
            parser alternatives.  I downloaded the application from
            the website, and massaged the files.  I was able to get
            the application to successfully create an apk and load
            successfully into the Android Emulator.  The application
            is partially working, but I could use some help debugging
            it.  The application doesn't parse the BostonEventsList.
            For some reason, it stops before displaying a ListView.

            I delivered the work I have finished to the SVN Repository
            in an Android project called MessageList.

            I will continue to work on it as time permits.  I've only
            just begun to fight.

            Regards,
            Harry Henriques
            Java Developer


