[yunqa.de] Re: How to get the sourcecode of a TABLE?

  • From: Delphi Inspiration <delphi@xxxxxxxx>
  • To: yunqa@xxxxxxxxxxxxx
  • Date: Thu, 27 Mar 2008 08:22:56 +0100

Hello Bear Xu,

>TDIHtmlParser.HtmlTag.Code can only get the source code when parsing.

Correct. The parser only stores the current piece of HTML to keep memory 
requirements to an absolute minimum.

>I mean after the parsing processed the last tag : </HTML>, 
>does it store some information in memory like DOM?

For performance, DIHtmlParser does not generate and store DOM trees. DOM can be 
useful, but is not necessarily needed for most HTML parsing tasks.

>So I can get the code or text of any tag I want.

You can do so with DIHtmlParser even without a DOM: while parsing, append the 
code of each tag (and text, etc.) you are interested in to a string variable. 
You can speed up Delphi's default string concatenation with the ConCat... 
family of functions in DIUtils.pas.
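
As a rough illustration of this collect-while-parsing approach, here is a 
minimal sketch. It is not DIHtmlParser code: it uses Python's standard 
html.parser module, which is likewise a streaming parser that keeps no DOM. 
The class name TableCollector and its fields are made up for this example.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect the source of each outermost <table> while streaming the HTML."""

    def __init__(self):
        # Keep entities such as &amp; unconverted so they can be copied back out.
        super().__init__(convert_charrefs=False)
        self.depth = 0    # nesting level of currently open <table> tags
        self.parts = []   # source pieces of the table being collected
        self.tables = []  # finished table sources, one string per table

    def _keep(self, source):
        # Only store pieces while we are inside a table.
        if self.depth > 0:
            self.parts.append(source)

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.depth += 1
        self._keep(self.get_starttag_text())  # start tag as written in the input

    def handle_startendtag(self, tag, attrs):
        self._keep(self.get_starttag_text())  # e.g. <br/>

    def handle_endtag(self, tag):
        # The parser reports only the lowercased name, so the end tag is rebuilt.
        self._keep(f"</{tag}>")
        if tag == "table" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Join once per table instead of concatenating piece by piece.
                self.tables.append("".join(self.parts))
                self.parts = []

    def handle_data(self, data):
        self._keep(data)

    def handle_entityref(self, name):
        self._keep(f"&{name};")

    def handle_charref(self, name):
        self._keep(f"&#{name};")

    def handle_comment(self, data):
        self._keep(f"<!--{data}-->")


if __name__ == "__main__":
    collector = TableCollector()
    collector.feed("<html><body><table><tr><td>cell</td></tr></table></body></html>")
    collector.close()
    for source in collector.tables:
        print(source)
```

The pieces are gathered in a list and joined once per table, which plays the 
same role as the faster concatenation mentioned above. End tags and comments 
are rebuilt rather than copied, so the result may differ from the input in 
letter case or whitespace.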

>Do you provide a product that parses the HTML into a DOM so that we can fetch 
>the node tree? 

DIXml creates DOM trees from both XML and HTML documents. Look at the 
DIXml_Node_Tree demo for an example -- and make sure to check "Load as HTML" if 
it is not automatically detected.

Ralf 
