[yunqa.de] How to TIDY HTML with unicode via DITidy?

  • From: Bear Xu <bear.xy@xxxxxxxxx>
  • To: yunqa@xxxxxxxxxxxxx
  • Date: Sun, 8 Feb 2009 09:40:50 +0800

Hi,

1.
I do not know the encoding of the HTML file in advance (it may be Windows-1251,
GB2312, UTF-8, or something else), e.g. when it is fetched by a crawler.
How can I use DITidy to parse such an HTML file with Unicode support? Is it possible?

2.
How can I parse HTML source code held in a WideString and return the cleaned
and repaired HTML source code? For example:

function TidyHtml(HTML_Source:WideString) : WideString;
begin
  ???
end;


Thank you very much.

Bear
