[yunqa.de] Re: DITidy to process www.163.com

  • From: Delphi Inspiration <delphi@xxxxxxxx>
  • To: yunqa@xxxxxxxxxxxxx,yunqa@xxxxxxxxxxxxx
  • Date: Sun, 30 Aug 2009 12:00:06 +0200

At 15:24 29.08.2009, Bear Xu wrote:

>How about my question in last email?
>How to fix that?

As written in my last message, I could not reproduce your findings with <form> 
and <table>.

However, I experimented further and was finally able to trigger similar 
behaviour with this HTML:

---- HTML begin ----
text before

<table>
<form>
<tr>

</tr>
</form>
</table>

text after
---- HTML end ----

DITidy issues the following parser errors and warnings:

  line 3 column 1 - Warning: missing <!DOCTYPE> declaration
  line 3 column 1 - Warning: plain text isn't allowed in <head> elements
  line 3 column 1 - Info: <head> previously mentioned
  line 3 column 1 - Warning: inserting implicit <body>
  line 4 column 1 - Warning: <form> isn't allowed in <table> elements
  line 3 column 1 - Info: <table> previously mentioned
  line 4 column 1 - Warning: missing </form> before <tr>
  line 7 column 1 - Warning: missing <td>
  line 8 column 1 - Error: discarding unexpected </form>
  line 3 column 1 - Warning: inserting missing 'title' element
  line 4 column 1 - Warning: <form> lacks "action" attribute
  line 3 column 1 - Warning: <table> lacks "summary" attribute
  line 4 column 1 - Warning: trimming empty <form>

In particular, this message

  line 4 column 1 - Warning: <form> isn't allowed in <table> elements

tells us that the HTML is invalid, and Tidy's tidyParse...() functions return 
an error status (2). Applications can still force Tidy to produce output by 
setting this option:

  tidyOptSetBool(TidyHandle, TidyForceOutput, 1);

This will then lead to the following HTML:

---- HTML begin ----
<html>
<head>
<title></title>
</head>
<body>
text before
<table>
<tr>
<td></td>
</tr>
</table>
text after
</body>
</html>
---- HTML end ----
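
The decision described above (check the status returned by tidyParse...(), 
then produce output only if there were no errors or if output is forced) can 
be sketched roughly as follows. This is only an illustration in plain C: the 
helper names status_label() and should_save_output() are hypothetical and are 
not part of Tidy or DITidy, and the status meanings (0 = clean, 1 = warnings, 
2 = errors, negative = severe failure) are assumed from the usual Tidy 
convention.

```c
#include <stdbool.h>

/* Assumed Tidy status convention:
   0 = no problems, 1 = warnings only, 2 = errors, negative = severe failure. */
enum { STATUS_OK = 0, STATUS_WARNINGS = 1, STATUS_ERRORS = 2 };

/* Hypothetical helper: readable label for a status returned by a parse call. */
const char *status_label(int status)
{
    if (status < 0)
        return "severe failure";
    switch (status) {
    case STATUS_OK:
        return "no problems";
    case STATUS_WARNINGS:
        return "warnings";
    default:
        return "errors";
    }
}

/* Hypothetical helper: should the application go on to save Tidy's output?
   force_output stands for the value given to the TidyForceOutput option. */
bool should_save_output(int status, bool force_output)
{
    if (status < 0)
        return false;            /* nothing usable was produced */
    if (status >= STATUS_ERRORS)
        return force_output;     /* errors: only when output is forced */
    return true;                 /* clean, or warnings only */
}
```

For the HTML in this message the parse status is 2, so 
should_save_output(2, false) is false and should_save_output(2, true) is true, 
which corresponds to setting TidyForceOutput as shown earlier.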

Please note that DITidy behaves just like other Tidy implementations in this 
regard. You can test this online at http://www.htmltrim.com/, for example.

If I am still missing your point, please send the precise HTML (preferably 
along with a Delphi project showing your Tidy options) so I can reproduce the 
problem exactly.

Ralf 

_______________________________________________
Delphi Inspiration mailing list
yunqa@xxxxxxxxxxxxx
//www.freelists.org/list/yunqa


