Re: Indexing a char column

From: Lothar Flatz <l.flatz@xxxxxxxxxx>
To: oracle-l@xxxxxxxxxxxxx
Date: Wed, 11 Mar 2015 08:54:54 +0100

Hi,

I think the index is inefficient because you do by no means need 80character to uniquely identify a million rows. We can assume to need 12digits for reasonable key playing it very safe and the 80-90 columnstill has an overhead of factor 7.This is also the reason why I am less concerned about coalitions using aFBI based on ora_hash, checkum, etc.The size of the index key does matter in many respects. Mainly therewill be less entries per index block as compared to a key that isshorter. As a consequence the index is larger, will trigger more blocksplits, more maintenance overhead. It will have a likely lesser hit rateper block in buffer cache on average, will therefore not cache as well etc.I was once involved in a project where they where mimicking ISAM bypadding key values to the max length. It was quite a nightmare how slowthat stuff worked.

My gut feeling would still be that you might want to just go ahead andtry it, but it is possible by the rate of data growth that will build amaintenance nightmare.I think a viable plan B might be to index onORA_HASH(xa,any_prime_number_with_enough_digits) and use VPD to changeyour query to:

/SELECT DISTINCT x FROM xa WHERE xa IN (SELECT xa FROM xa WHERE x IN('123456789101112') *and*/*/ORA_HASH(xa,any_prime_number_with_enough_digits) =/**/ORA_HASH(/*/*/'123456789101112'/**,any_prime_number_with_enough_digits)*) /;


Thanks

Lothar

On 10.03.2015 22:29, Cee Pee wrote:

To answer Seth and Lothar, the column is almost unique though nounique constraint is declared. There is only one duplicate value amongthe million rows, could be due to testing. I think that would make theindex highly selective. Can you please explain what you mean by theindex will not be efficient.

I do not know yet how big the table can grow or what will be thedelete or insert frequency, we are testing.

On Tue, Mar 10, 2015 at 12:16 PM, Lothar Flatz <l.flatz@xxxxxxxxxx<mailto:l.flatz@xxxxxxxxxx>> wrote:


    Hi,

    I think you could eventually use an Index here, but i am certainly
    not very happy with it considering the update frequency you are
    talking about. I have seen Indexes based on columns that long.
    There are not very efficient.
    I would consider clever ways to decrease the length of the column,
    e.g. use the ORA_HASH function. Certainly the consequence is
    rewriting the query though.
     If the data is not only growing but also gets deleted eventually
    you might even consider hash cluster. I normally do not recommend
    it, but that could be an exceptional case.
    If teh table is constantly growing forget about hash cluster.

    On other more conservative way would be using hash partitioning
    and compressing the table. Few columns is supporting compression.
    but your long column looks bad in that respect.
    Is the column unique?
    I have to admit the choice is not easy and I am not very happy
    with either suggestion.
    I am afraid you have to try.

    Thanks

    Lothar



    On 10.03.2015 17:52, Cee Pee wrote:


    It is a varchar2 column with max length of 150. I checked and the
    lengths of strings stored currently vary from 85 to 90 characters
    distributed evenly across the values in about 1M rows. The sql in
    modified form with table name and col name changed:

    /SELECT DISTINCT x FROM xa WHERE xa IN (SELECT xa FROM  xa WHERE
    x IN ('123456789101112') ) /

    table name and col name in the predicates are same; the subquery
    selects from the same user.table as the main query.
    The same SQL is getting executed several hundreds of times during
    the peak hours.



    On Tue, Mar 10, 2015 at 7:44 AM, Lothar Flatz <l.flatz@xxxxxxxxxx
    <mailto:l.flatz@xxxxxxxxxx>> wrote:

        Hi CP,

        I would rather not index a 150 char long  column. Is the
        actual length really 150?
        what is the comparison operator? I think there is a chance it
        is not equal, but substr, like or similar.

        Thanks

        Lothar


        On 10.03.2015 12:22, Cee Pee wrote:

            List

            I see a table being hit and queried a few thousand times
            over peak hours. The table has more than million rows and
            grows while operational. I did not measure the growth
            fully, but based on monitoring for several minutes, it is
            adding about 20 to 40 rows in few seconds, when i was
            checking via sqlplus. I dont see this growth most of the
            times though. It is a small table with 6 columns and the
            app in a web application. There is also another sql that
            is run constantly that accesses the rows based on one of
            the char columns which is 150 characters, but there is no
            index on the column which is causing a tablescan. Is this
            column a bad candidate for indexing. Any rule of thumb
            length for char columns above which adding index is
            considered moot? v11.2.

            CP.

--





        ---
        Diese E-Mail wurde von Avast Antivirus-Software auf Viren
        geprüft.
        http://www.avast.com

        --
        //www.freelists.org/webpage/oracle-l

--






    ------------------------------------------------------------------------
    Avast logo <http://www.avast.com/>    

    Diese E-Mail wurde von Avast Antivirus-Software auf Viren geprüft.
    www.avast.com <http://www.avast.com/>



--






---
Diese E-Mail wurde von Avast Antivirus-Software auf Viren geprüft.
http://www.avast.com

Follow-Ups:
- RE: Indexing a char column
  - From: Jonathan Lewis

References:
- Indexing a char column
  - From: Cee Pee
- Re: Indexing a char column
  - From: Lothar Flatz
- Re: Indexing a char column
  - From: Cee Pee
- Re: Indexing a char column
  - From: Lothar Flatz
- Re: Indexing a char column
  - From: Cee Pee

Re: Indexing a char column

Other related posts: