Re: bytes vs chars

  • From: Hans Forbrich <fuzzy.graybeard@xxxxxxxxx>
  • To: "Zelli, Brian" <Brian.Zelli@xxxxxxxxxxxxxxx>, "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
  • Date: Fri, 11 Mar 2016 10:14:27 -0700

No.  You are interpreting that parameter incorrectly.

http://docs.oracle.com/database/121/REFRN/GUID-221B0A5E-A17A-4CBC-8309-3A79508466F9.htm#REFRN10124

http://docs.oracle.com/database/121/NLSPG/ch2charset.htm#NLSPG170

Length semantics = BYTE says: when I define a column as "col-name VARCHAR2(x)", the x is taken in bytes, regardless of character set. So "MyCOL VARCHAR2(20)" in byte semantics has 20 bytes of storage allocated, which holds anywhere from 5 to 20 characters depending on how many bytes each character needs. If every character were Unicode and used 4 bytes, all you could store is 5 characters.
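
To make the arithmetic concrete, here is a minimal Python sketch. It is not Oracle code: Python's "utf-8" codec stands in for a multibyte database character set such as AL32UTF8, and the helper name chars_that_fit is made up for this illustration. It counts how many characters fit into a fixed byte budget, which is roughly the limit a VARCHAR2(20) column imposes under byte semantics.

```python
# Illustration only: Python's UTF-8 codec stands in for a multibyte
# database character set; chars_that_fit is a hypothetical helper.

def chars_that_fit(text: str, byte_limit: int, encoding: str = "utf-8") -> int:
    """Count how many leading characters of text fit within byte_limit bytes."""
    used = 0
    count = 0
    for ch in text:
        size = len(ch.encode(encoding))  # bytes needed for this one character
        if used + size > byte_limit:
            break
        used += size
        count += 1
    return count


ascii_text = "A" * 30            # 1 byte per character in UTF-8
accent_text = "\u00e9" * 30      # 2 bytes per character in UTF-8
emoji_text = "\U0001F600" * 30   # 4 bytes per character in UTF-8

print(chars_that_fit(ascii_text, 20))   # 20
print(chars_that_fit(accent_text, 20))  # 10
print(chars_that_fit(emoji_text, 20))   # 5
```

The same 20-byte budget holds 20, 10 or 5 characters depending on the data, which is why byte semantics alone says nothing about how many characters a column can hold.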

You still want to find out the actual character set: http://docs.oracle.com/database/121/REFRN/GUID-3BCC0324-8FEC-409F-8472-74A72FDE310F.htm#REFRN30159

/Hans

On 11/03/2016 9:55 AM, Zelli, Brian wrote:

So if nls_length_semantics is set to byte, can I assume the 1 char = 1 byte 
rule?


Brian



-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On
Behalf Of Hans Forbrich
Sent: Friday, March 11, 2016 11:48 AM
To: oracle-l@xxxxxxxxxxxxx
Subject: Re: bytes vs chars

I think the sentiment is correct, but there is a minor correction to the
wording:

Unicode is an attempt to get all different character sets into one
superset, and is multi-byte in nature.  The AL32UTF8 encoding for
Unicode represents each character in the fewest bytes it requires
(1, 2, 3 or 4).  For the characters themselves, see the Quick Link
'Code Charts' at http://unicode.org/
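
A quick way to see the 1-to-4-byte behaviour is the Python sketch below. It assumes Python's "utf-8" codec as a stand-in for AL32UTF8; the function name utf8_width and the sample characters are chosen only for this illustration.

```python
# Illustration only: shows how many bytes UTF-8 needs for a few
# sample characters; utf8_width is a hypothetical helper name.

def utf8_width(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))


samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, emoji
for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_width(ch)} byte(s)")
# U+0041: 1 byte(s)
# U+00E9: 2 byte(s)
# U+20AC: 3 byte(s)
# U+1F600: 4 byte(s)
```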

Character sets in the 1 character = 1 byte group are often known as
'single byte character sets' or 'single byte encodings'.  These include
ASCII and the various ISO 8859 sets.  A handy reference is at
http://docs.oracle.com/database/121/NLSPG/ch2charset.htm#NLSPG166
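
As a rough illustration of the single-byte case, the Python sketch below compares encoded sizes across codecs. Python's "ascii" and "iso8859_1" codecs are used here as stand-ins for single-byte database character sets, and encoded_len is a hypothetical helper; none of this is Oracle code.

```python
# Illustration only: a single-byte set always uses 1 byte per character,
# but it can only represent the characters it contains.

def encoded_len(ch: str, encoding: str):
    """Return the encoded size in bytes, or None if the set lacks the character."""
    try:
        return len(ch.encode(encoding))
    except UnicodeEncodeError:
        return None


for ch in ["A", "\u00e9", "\u20ac"]:  # A, e-acute, euro sign
    print(
        f"U+{ord(ch):04X}",
        encoded_len(ch, "ascii"),
        encoded_len(ch, "iso8859_1"),
        encoded_len(ch, "utf-8"),
    )
# U+0041 1 1 1
# U+00E9 None 1 2
# U+20AC None None 3
```

Every character a single-byte set can store takes exactly 1 byte; the trade-off is that characters outside the set cannot be stored at all, whereas UTF-8 stores them in more bytes.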

Therefore, I think the statement should be corrected to

    "If you're using a single-byte character set then 1 character = 1 byte.
But if you're using a multibyte Unicode character set then a character can be
coded on several bytes."

/Hans

On 11/03/2016 8:50 AM, Ahmed Aangour wrote:
Hi,

If you're using a unicode character set then 1 character = 1 byte. But
if you're using a multibyte character set then a character can be coded
on several bytes.
You can check the character set of the database by querying
nls_database_parameters.


--
//www.freelists.org/webpage/oracle-l




