[yunqa.de] Re: DIHTMLParser: Đ entity is not recognized

  • From: Delphi Inspiration <delphi@xxxxxxxx>
  • To: yunqa@xxxxxxxxxxxxx
  • Date: Thu, 10 May 2012 09:36:35 +0200

On 10.05.2012 06:18, Кудрявцев Дмитрий wrote:

> I use TDIHtmlParser and found the following thing: &Dstrok; entities
> are present in some files but they are not recognized by
> TDIHtmlParser.
> &Dstrok; corresponds to &#208; code
> http://migo.sixbit.org/more/html-entities.html

DIHtmlParser includes all HTML 4.01 entities.

&Dstrok; was added in HTML5, which calls entities "named character
references". The full list is here:

  http://www.w3.org/TR/html5/named-character-references.html
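
As an independent check of the HTML 4.01 versus HTML5 distinction, here is a
minimal sketch using Python's standard library rather than DIHtmlParser. It
assumes only the html.entities module, whose name2codepoint table holds the
HTML 4.01 entities and whose html5 table holds the HTML5 named character
references. It also shows that Dstrok maps to U+0110 (decimal 272), whereas
the &#208; quoted above is the code of a different character, Ð (&ETH;).

```python
from html.entities import html5, name2codepoint

# name2codepoint: HTML 4.01 entity names (no trailing semicolon) -> code point.
# html5: HTML5 named character references (keys usually end in ';') -> string.
in_html4 = "Dstrok" in name2codepoint
in_html5 = "Dstrok;" in html5

dstrok = html5["Dstrok;"]            # the character that &Dstrok; stands for
eth = chr(name2codepoint["ETH"])     # the character behind &#208;

print(in_html4, in_html5, f"U+{ord(dstrok):04X}", f"U+{ord(eth):04X}")
# prints: False True U+0110 U+00D0
```

The variable names are illustrative only. This says nothing about
DIHtmlParser's internal tables beyond what is stated in this message.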

You can add any standard or custom entity / named character reference to
DIHtmlParser by calling

  RegisterDecodingEntity('Dstrok', #$0110);

before you start parsing.

Likewise, call

  RegisterEncodingEntity(#$0110, 'Dstrok');

to have TDIHtmlWriterPlugin write the new entity whenever the output
character encoding cannot represent #$0110, as is the case with
ISO-8859-1, for example.
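
To illustrate why an entity is needed for such encodings, here is a small
sketch in Python (not DIHtmlParser code, and not a description of how
TDIHtmlWriterPlugin works internally). It assumes only Python's built-in
codecs and html.unescape. It shows that U+0110 cannot be encoded as
ISO-8859-1 and that a character reference survives the round trip.

```python
import html

ch = "\u0110"  # LATIN CAPITAL LETTER D WITH STROKE

# ISO-8859-1 covers only U+0000..U+00FF, so a direct encode fails.
try:
    ch.encode("iso-8859-1")
    representable = True
except UnicodeEncodeError:
    representable = False

# Python's built-in fallback writes a numeric reference instead of a name.
numeric = ch.encode("iso-8859-1", errors="xmlcharrefreplace")

# A parser that knows the HTML5 names turns the named form back into U+0110.
decoded = html.unescape("&Dstrok;")

print(representable, numeric, decoded == ch)
# prints: False b'&#272;' True
```

Registering the name for encoding, as described above, yields the named form
&Dstrok; in the output instead of a numeric reference like the one Python
produces here.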

Ralf