Re: Wrong data in 8i mview refreshed from 10g db with different characterset

  • From: Martin Berger <martin.a.berger@xxxxxxxxx>
  • To: sxmte@xxxxxxxxxxxxxxxx
  • Date: Wed, 17 Sep 2008 06:46:00 +0200


This is the correct behaviour for the database. (Well, in fact it was not 100% correct during the time, you used US7ASCII on both DBs)
Let me explain, what's going on:
First of all, there is an important component involved you did not mention yet: your client (sqlplus or whatever you use) and its NLS_* settings. (especially the character set!) 1) A database converts characters only if the client and the server does NOT have the same character set and the codepoint in both caractersets does not match.
2) US7ASCII is only defined for the least 7 bit.

This brings you to some interresting situations:
A) If all components (client, master, replication) hast the same character set configured, you can insert any character you can enter in the client, it will not be interpreted or converted, not even checked and stored in the data block. At a select also nothing happens, it just got read from the block and sent to the client. Even if the character does not match into the character set (e.g. has 8 bits instead of 'allowed' 7) it's handled this way. B) If the client and the master has a different character set, either of them (prefered the client, but if it cannot, the DB) converts the chartacters to the destinations code point. the same vice versa. Oracle seems to use the 'nearest' good character, if a correct conversion is not possible. (I have not checked this fully)

In your situation, the client and master has either the same character set, or there is a conversion. So the characters get in the master corect. As the Replication is nothing more than a client-server communication, there is a character conversion. And as US7ASCII can not gather all differeent characters, i has to search for the 'nearest'.

So my summary on this is (with knowing that I had to guess a lot and could not proove my theory on your test-system) your previous setup (all US7ASCII) was well working but wrong, and you will have to switch your replication to a characterset which can hold all characters from WE8MSWIN1252.

I hope, this helps.

best regards,

Martin Berger martin.a.berger@xxxxxxxxx

Am 17.09.2008 um 02:44 schrieb Maureen English:

We have an 8i database (US7ASCII characterset) with materialized views
being refreshed from a 10g database (was US7ASCII, just changed to WE8MSWIN1252).

Prior to converting the 10g database to WE8MSWIN1252, the data was displayed
the same in both databases.

For example:

Langue et littérature français

resulted from a select statement in both databases when they were both US7ASCII. Now, the result is the same in the WE8MSWIN1252 database, but after I refreshed the materialized view, the result from the 8i database, which is still US7ASCII,
is different:

Langue et litterature francais

The codepoints should have been the same in both databases, but for some reason,
the refresh changed things.

The table in the 10g master (WE8MSWIN1252) database shows:

TEST> select dump(STVCIPC_DESC) from stvcipc where stvcipc_code='550100';

Typ=1 Len=30: 76,97,110,103,117,101,32,101,116,32,108,105,116,116,233,114,97,116,117,114,101,32,102,114,97,110,231,97,105,115

but the table in the 8i (US7ASCII) database shows different codepoints:

RPTT> select dump(STVCIPC_DESC) from stvcipc where stvcipc_code='550100';

Typ=1 Len=30: 76,97,110,103,117,101,32,101,116,32,108,105,116,116,101,114,97,116,117,114,101,32,102,114,97,110,99,97,105,115 ^ ^

If I select the codepoints from the 8i database, the characters show up correctly....

RPTT> select chr(101),chr(99),chr(233),chr(231) from dual;

- - - -
e c é ç

Any ideas why the refresh of the materialized view wouldn't still contain the correct codepoints?

- Maureen


Other related posts: