[yunqa.de] Re: DITidy to process www.163.com

  • From: Delphi Inspiration <delphi@xxxxxxxx>
  • To: yunqa@xxxxxxxxxxxxx,yunqa@xxxxxxxxxxxxx
  • Date: Mon, 31 Aug 2009 18:39:32 +0200

At 13:27 30.08.2009, Bear Xu wrote:

>Thank you for your message.
>Check your Delphi email box; I sent you the whole project several days ago. 
>I run it on D2009 and Vista.

My apologies, I missed the attachment. As I look at it now, I see that I hit 
the right spot in my previous message: Tidy reports an error when parsing your 
HTML. And errors, unlike warnings, will suppress all output unless forced.

Here is how you can change your project to force Tidy to generate output even 
if there are parsing errors:

  { Parse the HTML. }
  if tidyParseBuffer(TidyHandle, @InBuf) = 2 then
    begin
      ShowMessage('HTML contains errors');
      tidyOptSetBool(TidyHandle, TidyForceOutput, 1);
    end;
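
For reference, here is a minimal sketch of how the parse result could be interpreted. It assumes that DITidy follows the usual Tidy convention for the value returned by tidyParse...(): 0 for no problems, 1 for warnings only, 2 for errors, and a negative value for a severe failure. Only the value 2 is confirmed in this thread. DescribeParseResult is a hypothetical helper written for illustration and is not part of DITidy.

```pascal
program TidyResultDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

{ Hypothetical helper, not part of DITidy. Maps the integer returned by
  a tidyParse...() call to a short description. The meanings of 0, 1 and
  negative values are assumed from the usual Tidy convention. }
function DescribeParseResult(const Code: Integer): string;
begin
  if Code < 0 then
    Result := 'severe error, parsing failed'
  else
    case Code of
      0: Result := 'no problems found';
      1: Result := 'warnings only, output is generated';
      2: Result := 'errors, output is suppressed unless TidyForceOutput is set';
    else
      Result := 'unexpected return value';
    end;
end;

begin
  { In a real project, pass the result of tidyParseBuffer here. }
  WriteLn(DescribeParseResult(2));
end.
```

In the project, the same check could replace the hard-coded comparison with 2, so that warnings and severe failures are reported separately from ordinary parsing errors.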

Ralf

>On Sun, Aug 30, 2009 at 6:00 PM, Delphi Inspiration <delphi@xxxxxxxx> wrote:
>At 15:24 29.08.2009, Bear Xu wrote:
>
>>How about my question in my last email?
>>How to fix that?
>
>As written in my last message, I could not reproduce your findings with <form> 
>and <table>.
>
>However, I experimented further and was finally able to come up with a 
>similar behaviour triggered by this HTML:
>
>---- HTML begin ----
>text before
>
><table>
><form>
><tr>
>
></tr>
></form>
></table>
>
>text after
>---- HTML end ----
>
>DITidy issues the following parser errors and warnings:
>
> line 3 column 1 - Warning: missing <!DOCTYPE> declaration
> line 3 column 1 - Warning: plain text isn't allowed in <head> elements
> line 3 column 1 - Info: <head> previously mentioned
> line 3 column 1 - Warning: inserting implicit <body>
> line 4 column 1 - Warning: <form> isn't allowed in <table> elements
> line 3 column 1 - Info: <table> previously mentioned
> line 4 column 1 - Warning: missing </form> before <tr>
> line 7 column 1 - Warning: missing <td>
> line 8 column 1 - Error: discarding unexpected </form>
> line 3 column 1 - Warning: inserting missing 'title' element
> line 4 column 1 - Warning: <form> lacks "action" attribute
> line 3 column 1 - Warning: <table> lacks "summary" attribute
> line 4 column 1 - Warning: trimming empty <form>
>
>In particular, this message
>
> line 4 column 1 - Warning: <form> isn't allowed in <table> elements
>
>tells us that the HTML is invalid, and Tidy's tidyParse...() returns an error 
>(2). Applications can still force Tidy to produce output by setting this 
>option:
>
> tidyOptSetBool(TidyHandle, TidyForceOutput, 1);
>
>This will then lead to the following HTML:
>
>---- HTML begin ----
><html>
><head>
><title></title>
></head>
><body>
>text before
><table>
><tr>
><td></td>
></tr>
></table>
>text after
></body>
></html>
>---- HTML end ----
>
>Please know that DITidy behaves just like other Tidy implementations in this 
>regard. You can test this online at http://www.htmltrim.com/, for example.
>
>If I am still missing your point, please send the precise HTML (preferably alongside 
>a Delphi project for Tidy options) so I can reproduce exactly.
>
>Ralf

_______________________________________________
Delphi Inspiration mailing list
yunqa@xxxxxxxxxxxxx
//www.freelists.org/list/yunqa


