[yunqa.de] Re: DiHtmlParser problems and questions

  • From: Delphi Inspiration <delphi@xxxxxxxx>
  • To: yunqa@xxxxxxxxxxxxx
  • Date: Fri, 09 Mar 2012 09:51:23 +0100

On 08.03.2012 23:48, Max Terentiev wrote:

> But I found another page that produces the same problem.
> 
> See attachment.
> 
> In the AllInOne demo -> Tag Lister, tags are skipped between
> lines 79 and 293.
> 
> The problem is again in parsing JavaScript code, but there is
> no regexp this time!

The problem is the unterminated string in line 82:

  alert(\'Необходимо ввести запрос для поиска!\');

[The Russian text means "You need to enter a search query!"]

It contains two backslash-apostrophe sequences "\'":

The 1st is outside a string. Hence the backslash does not escape the
apostrophe and the apostrophe starts a string.

The 2nd is inside a string, where the backslash does escape the
apostrophe. Hence the string is not terminated until the next unescaped
apostrophe, which is in line 285.

[Btw, Firefox does not recognize the string either. Pressing the button
does not show the alert.]
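To make the two cases above concrete, here is a minimal sketch of a quote-aware scanner. It is not DIHtmlParser's actual code: the function name findStringSpans is hypothetical, the Russian text is replaced by a short placeholder, and comments and regexp literals are ignored for brevity. It only illustrates the rule that a backslash escapes a quote inside a string but not outside one.

```javascript
// Hypothetical helper, NOT part of DIHtmlParser: lists the string
// literals that a simple quote-aware scanner sees in script text.
function findStringSpans(src) {
  const spans = [];
  let quote = null; // quote character of the open string, or null
  let start = -1;
  for (let i = 0; i < src.length; i++) {
    const ch = src[i];
    if (quote === null) {
      // Outside a string a backslash escapes nothing,
      // so any quote character opens a string.
      if (ch === "'" || ch === '"') {
        quote = ch;
        start = i;
      }
    } else if (ch === "\\") {
      i++; // inside a string: skip the escaped character
    } else if (ch === quote) {
      spans.push({ start: start, end: i, closed: true });
      quote = null;
    }
  }
  if (quote !== null) {
    // The input ended while a string was still open.
    spans.push({ start: start, end: src.length - 1, closed: false });
  }
  return spans;
}

// Placeholder input: the stray \' pair, then a later, valid string.
const line = "alert(\\'search text\\');\nvar s = 'later';";
const spans = findStringSpans(line);
console.log(spans);
```

In this sketch the first reported string runs from the first apostrophe up to the apostrophe that was meant to open 'later', and the rest of the input is left inside an unclosed string. This is the same effect as the tags skipped up to line 285.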

Why does DIHtmlParser analyze JavaScript? Because JavaScript may contain
strings like '</script>' which do NOT mark the end of the script.
Unfortunately, this analysis may fail on invalid JavaScript syntax.
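The trade-off can be sketched as follows. This is an illustration under assumptions, not DIHtmlParser's implementation: findScriptEnd and its stringAware flag are hypothetical names, and the scanner ignores comments and regexp literals. With stringAware off, the first </script> wins. With it on, markers inside quoted strings are skipped.

```javascript
// Hypothetical helper, NOT part of DIHtmlParser: returns the index of
// the </script> end marker at or after `from`, or -1 if none is found.
function findScriptEnd(html, from, stringAware) {
  const marker = "</script>";
  let quote = null; // quote character of the open string, or null
  for (let i = from; i < html.length; i++) {
    const ch = html[i];
    if (stringAware) {
      if (quote !== null) {
        if (ch === "\\") i++;                  // skip the escaped character
        else if (ch === quote) quote = null;   // string ends here
        continue;                              // never match inside a string
      }
      if (ch === "'" || ch === '"') {
        quote = ch;
        continue;
      }
    }
    if (html.slice(i, i + marker.length).toLowerCase() === marker) return i;
  }
  return -1;
}

// Valid script whose string contains the end marker.
const valid = "<script>document.write('</script>');</script><p>after</p>";
const naiveOnValid = findScriptEnd(valid, 8, false); // stops inside the string
const awareOnValid = findScriptEnd(valid, 8, true);  // finds the real end

// Invalid script with the stray \' pair from the post (placeholder text).
const broken = "<script>alert(\\'hi\\');</script><p>after</p>";
const naiveOnBroken = findScriptEnd(broken, 8, false); // finds the end
const awareOnBroken = findScriptEnd(broken, 8, true);  // never leaves the string

console.log(naiveOnValid, awareOnValid, naiveOnBroken, awareOnBroken);
```

Neither mode is right for both inputs: the string-aware scan handles the valid script but loses the end marker in the broken one, while the naive scan does the opposite.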

If you prefer DIHtmlParser not to analyze JavaScript, you can set a
default dummy script type for your TDIHtmlParser instance:

  DIHtmlParser1.DefaultContentScriptType := 'dummy'; // <-- DUMMY!!!

This tricks DIHtmlParser into not analyzing JavaScript, so it will parse
your HTML correctly. It will, however, fail on HTML that contains an
unescaped </script> end marker inside a JavaScript string.

Ralf
_______________________________________________
Delphi Inspiration mailing list
yunqa@xxxxxxxxxxxxx
//www.freelists.org/list/yunqa


