[yunqa.de] Re: DIHtmlParser and Entities

  • From: "Mike Dixon" <mike@xxxxxxxxxxx>
  • To: <yunqa@xxxxxxxxxxxxx>
  • Date: Fri, 10 Jul 2009 09:39:42 -0500

I really appreciate your help.

Yes, I added:
// DIHWriter is a TDIHtmlWriterPlugin
DIHWriter.Writer.WriteMethods := Write_US_ASCII; 

I have it partially working now.

Part of my problem is that I'm using an IE DHTML Editing control for WYSIWYG
editing, and a code control. When I switch to "code mode", I take the source
from the WYSIWYG control and run it through my routine that converts the
HTML tags and attributes back to lowercase (using a TDIHtmlCasePlugin).

The real problem now is that the WYSIWYG control converts &copy; to a single
copyright character, so when I pass in the following to the parser:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<meta http-equiv="content-type" content="text/html;charset=UTF-8" />
</head>
<body>
© <!-- This is a single character typed with alt-0169 -->
</body>
</html>

I get this back:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<meta http-equiv="content-type" content="text/html;charset=UTF-8"/>
</head>
<body>
? <!-- This is a single character typed with alt-0169 -->
</body>
</html>

The character on the third-to-last line comes out as a question mark
instead of &copy;.

If I DON'T include the content-type meta tag, the character comes out of the
parser/writer just fine. I'm guessing the current behavior is probably
"correct"; however, there are a lot of HTML authors who want their code
preserved just like they typed it.

Can I force the parser to ignore the charset=UTF-8 line?
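
One workaround that might sidestep the charset question entirely is to
replace every non-ASCII character with a numeric character reference before
the source reaches the parser, so the declared charset no longer affects how
the copyright sign is read. The sketch below is only an illustration and is
untested against DIHtmlParser: EscapeNonAscii is a hypothetical helper (not
part of the library), and it assumes the WYSIWYG control hands back its
source as a WideString.

```pascal
program EscapeDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper, not part of DIHtmlParser: replaces every character
// above US-ASCII with a numeric character reference such as &#169;.
function EscapeNonAscii(const S: WideString): AnsiString;
var
  I, Code: Integer;
begin
  Result := '';
  I := 1;
  while I <= Length(S) do
  begin
    Code := Ord(S[I]);
    // Combine a UTF-16 surrogate pair into a single code point.
    if (Code >= $D800) and (Code <= $DBFF) and (I < Length(S)) and
       (Ord(S[I + 1]) >= $DC00) and (Ord(S[I + 1]) <= $DFFF) then
    begin
      Code := $10000 + ((Code - $D800) shl 10) + (Ord(S[I + 1]) - $DC00);
      Inc(I);
    end;
    if Code < 128 then
      Result := Result + AnsiChar(Byte(Code))
    else
      Result := Result + AnsiString('&#' + IntToStr(Code) + ';');
    Inc(I);
  end;
end;

begin
  // WideChar($00A9) stands in for the copyright sign typed with alt-0169.
  WriteLn(EscapeNonAscii(WideString('<p>') + WideChar($00A9) +
    WideString(' 2009</p>')));
end.
```

This turns the copyright sign into &#169; rather than &copy;, so it keeps
the character intact but not the author's original spelling of the entity.
Whether the writer then passes the reference through unchanged is an
assumption that would need checking.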

> -----Original Message-----
> From: yunqa-bounce@xxxxxxxxxxxxx 
> [mailto:yunqa-bounce@xxxxxxxxxxxxx] On Behalf Of Delphi Inspiration
> Sent: Friday, July 10, 2009 8:11 AM
> To: yunqa@xxxxxxxxxxxxx; yunqa@xxxxxxxxxxxxx
> Subject: [yunqa.de] Re: DIHtmlParser and Entities
> 
> At 15:58 10.07.2009, Mike Dixon wrote:
> 
> >I Added the line below, and now I get a question instead of any valid
> >character or entity.
> 
> Is this the "line below" you added?
> 
>   DIHtmlWriterPlugin1.Writer.WriteMethods := Write_US_ASCII;
> 
> You are saying that you now get a question. Who is asking the
> question? What is the question? In other words: You are
> giving too few details for me to help.
> 
> Please send your code as a compilable project and I will be 
> glad to review it for you!
> 
> Ralf 
> 
> _______________________________________________
> Delphi Inspiration mailing list
> yunqa@xxxxxxxxxxxxx
> //www.freelists.org/list/yunqa