RE: UTF character set application problem

From: "Marc Perkowitz" <mperkowitz@xxxxxxxxxxx>
To: "'Justin Cave'" <justin@xxxxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
Date: Tue, 28 Sep 2004 17:12:09 -0500

This is a follow-up on what we found.  There was some confusion and
misunderstanding about what was actually happening.  It turns out the data
IS being translated and stored correctly in the Oracle UTF8 database, using
additional bytes for the characters that need them.  We were not viewing
the data correctly, and we had not increased the column sizes to allow for
the additional bytes.  Once we increased the sizes and set NLS_LANG
correctly, everything was fine.
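
For anyone who hits the same thing, here is a rough sketch of the byte
growth involved.  It is not our actual test program; the class name and the
sample name are made up for illustration, and ISO-8859-1 is only assumed as
a stand-in for the Western European character set.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration: compares how many bytes a name needs in a
// single-byte Western European encoding versus UTF-8.
class Utf8Width {
    // Bytes needed in ISO-8859-1 (assumed stand-in for the WE character set).
    static int latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Bytes needed in UTF-8; accented Latin letters take two bytes each.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // "José Muñoz", written with escapes so the source file encoding
        // does not matter.
        String name = "Jos\u00e9 Mu\u00f1oz";
        System.out.println("chars:        " + name.length());      // 10
        System.out.println("latin-1 bytes: " + latin1Bytes(name)); // 10
        System.out.println("utf-8 bytes:   " + utf8Bytes(name));   // 12
    }
}
```

A column sized in bytes for exactly 10 single-byte characters would be too
small for the 12 bytes this value needs in UTF-8, which matches the errors
we saw on insert.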

Thank you, Justin and Mike, for the hints that led us to find this out.

Marc Perkowitz.

-----Original Message-----
From: Justin Cave [mailto:justin@xxxxxxxxxxx]
Sent: Monday, September 20, 2004 3:17 AM
To: mperkowitz@xxxxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: UTF character set application problem

First off, this is decidedly not the way things should work.  Oracle should
be converting the data automatically.

If you run the Oracle character set scanner on the Western European
database, does it complain?  As Mike Vergara points out, it is possible to
get improperly encoded data into a database if you set the client NLS_LANG
the same as the database character set.  If that has happened here, it
would explain why Oracle is unable to do the conversion automatically.
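
To make "improperly encoded" concrete, here is a minimal sketch in Java.
It is not the Oracle character set scanner and makes no Oracle calls; the
class and method names are hypothetical, and ISO-8859-1 is assumed as the
Western European encoding.  It only shows that single-byte bytes passed
through unconverted are not valid UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of bytes that were stored without conversion.
class Utf8Check {
    // Strictly decodes the bytes as UTF-8; returns false on malformed input.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "Jos\u00e9"; // "José"
        // Bytes as a single-byte client would send them with no conversion.
        byte[] passedThrough = name.getBytes(StandardCharsets.ISO_8859_1);
        // Bytes as they should look after conversion to UTF-8.
        byte[] converted = name.getBytes(StandardCharsets.UTF_8);

        System.out.println(isValidUtf8(passedThrough)); // false
        System.out.println(isValidUtf8(converted));     // true
    }
}
```

The 0xE9 byte for the accented "e" looks like the start of a multi-byte
UTF-8 sequence with nothing after it, so a strict decoder rejects it.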


Justin Cave
Distributed Database Consulting, Inc.
http://www.ddbcinc.com/askDDBC

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of mperkowitz@xxxxxxxxxxx
Sent: Friday, September 17, 2004 7:16 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: UTF character set application problem

We're having a problem with character sets.  Recently we switched our
database to UTF and now we have problems with names containing accented
characters, etc. generating errors when we are trying to insert them into
the database.

The data originates from a database that uses Western European character
set.  We expected that UTF being a superset, there would be no problems
with switching.  However, after a lot of testing, we found that UTF is not
compatible with WE characters.  If the data originates as WE, you must
either store it in a UTF database or do an explicit translation to UTF.

This is counter-intuitive to me, but it is my first experience with using
different character sets.

The application is in Java using thin JDBC drivers and no Oracle-specific
functions.  We created a very simple test program to prove out this
finding.  We've tested this on 9iR1, 9iR2, and 8i and it works the same.

Anyone else encounter this?  Is it just my misconceptions on this in the
first place?  Or have I overlooked something?

Thanks,
Marc Perkowitz.



--
//www.freelists.org/webpage/oracle-l







Follow-Ups:
- Re: UTF character set application problem
  - From: Matjaz Jordan
