RE: Replacement of US7ASCII character set in 11g?

  • From: "Goulet, Richard" <Richard.Goulet@xxxxxxxxxxx>
  • To: <janine@xxxxxxxxxx>, "oracle-l L" <oracle-l@xxxxxxxxxxxxx>
  • Date: Mon, 11 Jan 2010 14:51:10 -0500

Janine,
 
    I regret to say that I have seen that before.  Your database is
US7ASCII but the client that inserted the data was WE8ISO8859P1 or some
such.  The ? is Oracle doing the best it can with the data.  Regrettably
without knowing what was suppose to have been entered you'll never
figure it out without dumping the numeric code that it is.  I believe
you might be able to do that with the ASCIISTR function.
 

Dick Goulet 
Senior Oracle DBA/NA Team Lead 
PAREXEL International 

 


________________________________

From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Janine Sisk
Sent: Monday, January 11, 2010 2:37 PM
To: oracle-l L
Subject: Re: Replacement of US7ASCII character set in 11g?


Thanks to everyone who replied.... 


I ran CSSCAN on the original 8.1.7 database and, of course, ran into
trouble right away.  The conversion from US7ASCII to WE8MSWIN1252 is
lossy in a number of places.  This does not surprise me terribly;  Jared
mentioned that you can put "invalid" data into a database of type
US7ASCII and I'm pretty sure that all of the programmers who have worked
on this site over the years have just assumed that if the database
didn't choke on it, then it was ok.


What concerns me is that CSSCAN reports that converting to UTF8 will
have the exact same lossy conversions.  The two error files are
literally identical except for the value of TOCHAR.  I thought that UTF8
was the mother of all character sets, so where do I go from here?


As an example, one of the first errors reported looks like this when I
do a SELECT in sqlplus:


Lic. en medios de comunicaci&Atilde;?3n


I will be digging further into this, with The Google and all, but if
anyone has any light to shed, please do!


janine


On Jan 7, 2010, at 3:07 PM, David Mann wrote:


        When I had clients worried about character set conversions I
usually ran Character Set Scanner utility CSSCAN on a copy of the
database to check for differences. Here is the reference in the 10g
documentation, assuming it is still available in 11g but don't have a
link handy. 
        
        
http://download.oracle.com/docs/cd/B19306_01/server.102/b14225/ch12scann
er.htm
        
        -- 
        Dave Mann
        www.brainio.us
        www.ba6.us - Database Stuff - http://www.ba6.us/rss.xml
        


Other related posts: