[yunqa.de] Re: Change charset in HTML-file using DIHtmlCharSetPlugin

  • From: Delphi Inspiration <delphi@xxxxxxxx>
  • To: yunqa@xxxxxxxxxxxxx
  • Date: Sat, 08 Dec 2012 13:44:04 +0100

On 07.12.2012 11:17, Oleg wrote:

> The destination file will not change encoding of first line (xml
> tag). The first line will remain:
> 
> <?xml version="1.0" encoding="Windows-1252"?>
> 
> but must be:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> 
> What is wrong?

The first line is not an HTML tag (ptHtmlTag) but an XML Processing
Instruction (ptXmlPI). DIHtmlParser does not parse XML PIs into a tag
structure, so there is no CONTENT attribute to modify for this piece.

Instead, the original data must be replaced so it can be written by the
TDIHtmlWriterPlugin. A simple StringReplace does the job. Here is the
modified OnCharSetChange procedure:

procedure TForm1.DIHtmlCharSetPlugin1CharSetChange(
  const Sender: TDIHtmlCharSetPlugin;
  const CharSet: string;
  var ReadMethods: TDIUnicodeReadMethods;
  var AllowChange: Boolean);
var
  e, s: UnicodeString;
begin
  { MIME name of the target encoding selected in cbEncoding. }
  e := UnicodeEncodings[
    Integer(cbEncoding.Items.Objects[cbEncoding.ItemIndex])].MimeName;

  case Sender.HtmlParser.PieceType of
    ptHtmlTag:
      { HTML tag: rewrite the CONTENT attribute. }
      Sender.HtmlParser.HtmlTag.ValueOfNumber[ATTRIB_CONTENT_ID] :=
        'text/html;charset=' + e;
    ptXmlPI:
      begin
        { XML PI: not parsed into attributes, so replace the first
          occurrence of the old charset name in the raw data and
          hand the result back to the parser for the writer. }
        s := Sender.HtmlParser.DataAsStrW;
        s := StringReplace(s, CharSet, e, []);
        Sender.HtmlParser.ClearData;
        Sender.HtmlParser.AddStrW(s);
      end;
  end;
end;
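
To see the replacement step from the ptXmlPI branch in isolation, here is
a minimal console sketch. It uses only SysUtils; the literal strings stand
in for the parser data and for the CharSet and e values, and are
assumptions for illustration only:

```pascal
program ReplaceXmlEncoding;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  s: string;
begin
  { Stand-in for Sender.HtmlParser.DataAsStrW (assumed sample data). }
  s := '<?xml version="1.0" encoding="Windows-1252"?>';

  { Same call as in the handler: empty flags mean a case-sensitive
    replacement of the first occurrence only. }
  s := StringReplace(s, 'Windows-1252', 'UTF-8', []);

  WriteLn(s); { prints <?xml version="1.0" encoding="UTF-8"?> }
end.
```

If the charset name in the file might differ in case from the value being
searched for, passing [rfIgnoreCase] instead of [] would be one option.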

If you feel that this solution is a workaround, I have to agree. I
designed the OnCharSetChange event when HTML had no XML PIs and it was
sufficient to modify the HTML tag attribute. Unfortunately, it is not
suited to modifying XML PIs.

A better solution would be to turn OnCharSetChange's CharSet parameter
into a var so the plugin can modify the original source as desired.
Unfortunately, this is not backwards compatible.

Hence I propose to introduce a new TDIHtmlCharSetPlugin2. Can you think
of anything else I should take care of?

Ralf