Exploring NLS: How to decipher output from NLSSORT
- From: "Charles Schultz" <sacrophyte@xxxxxxxxx>
- To: "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
- Date: Thu, 1 Jan 2009 21:04:11 -0600
Good day, List,
I am attempting to understand the linguistic nuances of how Oracle
implements various character sets (territory and language). I have found a
number of useful white papers, metalink notes (and of course the
Globalization online Docs are good), but I have stumbled upon an obstacle of
sorts (tongue in cheek).
Metalink note 13882.1 (*Linguistic Sorting of Data in Oracle7 and Oracle8*)
talks about major and minor values, and also mentions the WB# (used
internally). I am assuming this information is still valid in 9i and 10g, as
there are no other newer metalink documents that cover these specific
details (that I have found). So, for instance, capital A grave (À) is listed
with a major value of 20, minor 5 (WB# = 300). I also noticed that the
online documentation has a small sample glyph
table<http://download.oracle.com/docs/cd/B19306_01/server.102/b14225/ch5lingsort.htm#g1010269>with
slightly different values (and despite the little NOTE below the
table,
I get the same results in a non-unicode character set as a unicode one). How
do I expose that in sql? I assume NLSSORT gives me the proper numbers, but I
am not sure how to reverse engineer the output. The following example is
from 10gR2 AMERICAN_AMERICA.WE8ISO8859P1:
SQL > alter session set nls_comp = LINGUISTIC;
Session altered.
SQL > alter session set NLS_SORT=generic_m;
Session altered.
SQL > select nlssort('A') from dual;
NLSSORT('A')
--------------
01EA0000020006
SQL > select nlssort('À') from dual;
NLSSORT('À')
----------------
01EA0000020D0006
I set the nls_comp and nls_sort parameters because the default (binary
sorting) only gave me basic ascii values for the sort numbers, which is not
what I want. I am not overly familiar with either parameter yet, so it is
possible I could be using a "better" or easier nls_sort than generic_m.
I am assuming the baseletter is wrapped up in 01EA000002, since that is the
same for other A diacritics as well.
If anyone knows of a better way to get the major/minor sort values, please
let me know. The reason why I am pursuing this little nuance is for the
purpose of working up a lab that I demo to others, and I personally learn
much better when I have concrete examples I can put my hands on.
TIA
--
Charles Schultz
Other related posts:
- » Exploring NLS: How to decipher output from NLSSORT - Charles Schultz