Re: Of Character Sets, Performance and Storage...

  • From: Tanel Põder <tanel.poder.003@xxxxxxx>
  • To: "oracle-l" <oracle-l@xxxxxxxxxxxxx>
  • Date: Wed, 19 Jan 2005 15:54:04 +0200


> So... The obvious question is, How much more efficient and better
> performing are they in actual practice? Also, I'm thinking US7ASCII and
> WE8ISO8859P1 are both single-byte and possess equivalent storage
> requirements and performance characteristics even though one is 7 bit
> and the other is 8 bit. Is this true?

I think you should test this out for your environment to get a correct
answer.

With the AL16UTF16 encoding scheme, the problem is that every character takes 2
bytes, regardless of whether the character could actually be represented
in one byte. This means larger strings, more storage and more CPU to
compare them.
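
To make the storage difference concrete, here is a minimal sketch in Python. It is only an approximation: Python's "latin-1", "utf-8" and "utf-16-be" codecs stand in for WE8ISO8859P1, a UTF-8 database character set and AL16UTF16, and the sample strings are made up. It says nothing about how Oracle stores the data internally.

```python
# Encoded size of the same text under a single-byte, a variable-width
# and a fixed-width 2-byte encoding (Python codecs used as stand-ins).
def byte_lengths(text):
    """Return the number of bytes the text needs in each encoding."""
    return {
        "latin-1": len(text.encode("latin-1")),      # 1 byte per character
        "utf-8": len(text.encode("utf-8")),          # 1 byte for ASCII, more otherwise
        "utf-16-be": len(text.encode("utf-16-be")),  # 2 bytes per character here
    }

for sample in ["oracle", "Põder"]:
    print(sample, byte_lengths(sample))
```

For plain ASCII text the 2-byte encoding doubles the size, while UTF-8 only grows for the characters above 127.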

However, with variable-width encoding schemes such as UTF-8, some characters
take 1 byte and others (those with values > 127) take 2 or more bytes.
Storage-wise it's fine, but string comparison takes more CPU, since for pretty
much every byte in the raw representation of a string we have to check whether
it belongs to the current character or is actually the start of a new
character. INSTR and SUBSTR operations on variable-width encoded strings are
thus much slower.
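
The extra work for SUBSTR-like operations can be sketched as follows. This is an illustration in Python, not Oracle's actual implementation; the function names are hypothetical. In UTF-8, continuation bytes have the bit pattern 10xxxxxx, so locating the Nth character means scanning bytes from the start, whereas a fixed-width encoding needs only a multiplication.

```python
def utf8_char_offset(data: bytes, char_index: int) -> int:
    """Return the byte offset at which character number char_index starts.

    Every byte that is not a continuation byte (10xxxxxx) starts a character,
    so the bytes have to be scanned one by one.
    """
    seen = 0
    for offset, byte in enumerate(data):
        if (byte & 0xC0) != 0x80:        # not a continuation byte: new character
            if seen == char_index:
                return offset
            seen += 1
    if seen == char_index:
        return len(data)                 # position just past the last character
    raise IndexError("character index out of range")

def fixed_width_char_offset(char_index: int, width: int = 2) -> int:
    """With a fixed-width encoding the offset is a single multiplication."""
    return char_index * width

data = "Põder".encode("utf-8")
print(utf8_char_offset(data, 2))         # byte offset of 'd' in the UTF-8 bytes
print(fixed_width_char_offset(2))        # byte offset of 'd' with 2-byte characters
```

The first lookup is linear in the length of the string, the second is constant time.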

(Remember, UTF isn't a character set, but an encoding scheme for the Unicode
character set.)

So, fixed-width multibyte might be better when the majority of your characters
need multibyte storage, since with fixed-width multibyte 2 bytes can represent
65k characters, but with variable-width UTF-8 2 bytes can only represent about
2k (5 of the 16 bits are reserved to store meta-information).

Back to your case: my guess is that you're OK with your current
character sets, but you might want to convert your database to use the same
character set that your client will use for inserting the records. That way
you'd avoid character set conversion at the two-task level and save some CPU.


