[riscosweb] Browsers not responding to charset=ISO-8859-2

  • From: Russell Hafter - Lists <rh_lists@xxxxxxxx>
  • To: riscosweb@xxxxxxxxxxxxx
  • Date: Tue, 14 Nov 2006 10:13:57 +0000 (GMT)

In article <45590C0B.6010607@xxxxxxxxxxxx>, Matthew
Somerville <riscos@xxxxxxxxxxxx> wrote:

> > Orpheus internet tells me that the server is supposed to
> > work in exactly the way I suggested - the ISO-8859-1
> > header is meant to be a default, but allowing itself to
> > be overridden by my charset=ISO-8859-2 declaration.

> No, that is quite definitely wrong. As you can read at
> http://www.w3.org/TR/html401/charset.html#h-5.2.2 :

> --8<--------------------
> To sum up, conforming user agents must observe the
> following priorities when determining a document's
> character encoding (from highest priority to lowest):

> * An HTTP "charset" parameter in a "Content-Type" field.

> * A META declaration with "http-equiv" set to
> "Content-Type" and a value set for "charset".

> * The charset attribute set on an element that
> designates an external resource.

> --8<--------------------

Yes, this is certainly what the document says, and so it
appears that we are stuck with it.
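
As I read the quoted list, the decision a browser has to make
can be sketched roughly as below. This is only an illustration
of the priority order, not how any real browser is written; the
function names (charset_param, effective_charset) and their
arguments are made up for the example.

```python
import re

# Matches a charset parameter such as 'charset=ISO-8859-2' inside a
# Content-Type value. Deliberately simple; real header parsing is stricter.
_CHARSET_RE = re.compile(r'charset\s*=\s*"?([^\s;"]+)', re.IGNORECASE)


def charset_param(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    match = _CHARSET_RE.search(content_type)
    return match.group(1) if match else None


def effective_charset(http_content_type=None, meta_content=None,
                      link_charset=None):
    """Pick a charset using the HTML 4.01 priorities quoted above.

    All three arguments are hypothetical inputs for this sketch:
    the HTTP Content-Type header value, the content of a META
    http-equiv="Content-Type" declaration, and the charset attribute
    of an element that points at the document.
    """
    candidates = (
        charset_param(http_content_type),  # 1. HTTP header (highest)
        charset_param(meta_content),       # 2. META declaration
        link_charset,                      # 3. charset attribute (lowest)
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return None  # nothing declared; the browser must fall back or guess
```

With a header of "text/html; charset=ISO-8859-1" and a META of
"text/html; charset=ISO-8859-2" this picks ISO-8859-1, which is
the behaviour I am complaining about. If the header is plain
"text/html" with no charset parameter, the META declaration is
used instead.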

But it seems totally the wrong way round to me.

If the encoding that the server sends cannot be overridden
by a different encoding in a particular page of a website,
how on earth is one meant to construct a website that may
require any number of different charsets?

Suppose I were to want to construct pages in Japanese for
the Japanese market? Is the W3C really saying that I cannot
do that because the charset sent out by the server that
hosts the website, and over which I have no control, will
automatically override any attempt on my part to set a
Japanese charset?

Or do I have to find a separate hosting company just for
Japanese pages, who will promise me that their servers will
never send out any charset statements?

And for the future, if I need to move the hosting of this
website, how am I supposed to find out which hosting firms
do have servers that send out a charset statement and which
do not?

Having a look at another part of the page, there is
something about transcoding and the "Accept-Charset" HTTP
request header, but I cannot really follow this.

It all seems utterly crazy to me[1]. What is the logic
anyway in a server sending out charset definitions? Surely
that ought to be down to a) the document writer and b) the
person actually reading the document.

This may explain, though, the number of documents I receive
where the '£' is replaced by '?' and I have to reset the
document charset in Firefox manually by selecting ISO-8859-1
from the far eastern charset in which it is displayed, even
though the document header declares the correct charset. I
am talking about large UK companies who presumably have
their own servers here!
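
To show why the wrong charset mangles the pound sign, here is a
small sketch. It assumes the page was saved as ISO-8859-1, where
'£' is the single byte 0xA3, and then decodes that same byte
under other charsets. The variable names are only for
illustration.

```python
# '£' stored as ISO-8859-1 is one byte, 0xA3.
pound_bytes = "£".encode("iso-8859-1")

# Decoded with the charset the author intended, it round-trips.
as_latin1 = pound_bytes.decode("iso-8859-1")

# Decoded as ISO-8859-2, the same byte is a different letter
# (capital L with stroke).
as_latin2 = pound_bytes.decode("iso-8859-2")

# Decoded as plain ASCII, the byte is invalid, so a replacement
# character is substituted (often displayed as '?').
as_ascii = pound_bytes.decode("ascii", errors="replace")

print(repr(as_latin1), repr(as_latin2), repr(as_ascii))
```

The bytes in the document never change; only the label the
browser trusts changes, which is why resetting the charset by
hand in Firefox puts things right.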

[1] The more I find out about W3C standards the more
depressed I get. I have always got my webpages verified and
I have no complaint at all with the concept of proper
standards, just that to me they seem to enjoy implementing
them the wrong way round, as here. Equally, the insistence
in the XHTML standards that all tags be lower case I find
completely incomprehensible. To me, since the body text is
normally lower case, the tags ought all to be upper case.
But that is another matter.

-- 
Russell Hafter
Mailing Lists
rh.lists@xxxxxxxxxxx or rh_lists@xxxxxxxx
(Literally) on the edge of the Lake District National Park
