I missed the original conversation, I think, but:
There is a pl/sql procedure in my book (CBO Fundamentals) that calculates the end-point value for a character string. The algorithm starts by taking the first 15 characters (padding with nulls if necessary), reading it as a hex number of 30 digits, converting to decimal, then rounding that decimal to 15 significant figures. (which is approximately the same as the value you would get by using just the first six bytes of the string and zero padding them up to 15 bytes - hence your 6-character observation).
The use of endpoint_actual_value has changed with versions, at present I think you only get the endpoint_actual_value (or rather, the first 32 bytes thereof) if there are two entries in endpoint_value where the truncated value are the same even though the actual values are different.
The entire set of scripts for the book, including the relevant function is in the ZIP file. The script is called char_value.sql in the subdirectory for chapter six
Because of the rounding to 15 s.f., you cannot reverse the stored endpoint_value back to the original text value - but you could probably get the first 5 characters.
The Co-operative Oracle Users' FAQ http://www.jlcomp.demon.co.uk/faq/ind_faq.html
Cost Based Oracle: Fundamentals http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html
I would like to revisit this thread briefly. I believe Jerry was referring back to Lex's response: http://www.freelists.org/archives/oracle-l/09-2004/msg00243.html
He also mentioned hexstr from AskTom: http://asktom.oracle.com/pls/ask/f?p=4950:8:5671668926260613952::NO::F4950_P8_DISPLAYID,F4950_P8_CRITERIA:707586567563
Woflgang speculated that Oracle only encodes the first 6 characters, and will populate actual_value for character fields "when it needs to".
Given all this, it seems that there is no way to reverse-engineer a bucket on a character column consistently. Is that still true in 10gR2? In my own testing, this seems to be the case, especially if the column is larger than 6 characters. Even when it is not (when the column is less than 6 characters), the "encoding" process still makes decoding rather difficult. I used the hexstr function but it is too easy to get control characters in the stream (even using substr).
Is there any other way to analyze the bucket information for histograms on character fields? Thanks,
-- Charles Schultz