Thanks to everyone who answered. I have been too busy before to get back quickly. Jonathan was spot on when he said it was a web application. Sorry about not being open with table names; restrictions with the project. Here is the plan with autotrace: Execution Plan ---------------------------------------------------------- Plan hash value: 2021814760 ------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 206 | 4723 (2)| 00:00:57 | | 1 | HASH UNIQUE | | 1 | 206 | 4723 (2)| 00:00:57 | |* 2 | HASH JOIN | | 1 | 206 | 4722 (2)| 00:00:57 | | 3 | TABLE ACCESS BY INDEX ROWID| XA | 1 | 103 | 4 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | I1394550123266 | 1 | | 3 (0)| 00:00:01 | | 5 | TABLE ACCESS FULL | XA | 888K| 87M| 4707 (1)| 00:00:57 | ------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("XA"="XA") 4 - access("X"='1234567891011') The stats are bit old, the table has close to million rows now. The testing is paused temporarily due to a different reason. I am unable to find the answer for how big the table will grow yet. There are two indexes currently on the table, a single column index on a number column (not in this query) and a two column index on column 'X' and a third column. Jonathan's idea of building index on both columns in the query to skip the table looks like a brilliant idea to me especially given that the query is executed so many times. The columns all have the same value upto first 10 characters. Here is an example of a couple of rows: siteAlias:G5TdJiXR4fRpJM2yvcdSy9JDZzsRcrjBsp8hBTfxThLYvT04KnkW!1121631113!1415190289822 siteAlias:RRcQJgXSblWT0MJt9BSC13RMTPy7JrhGpRGr7ZbrlWlFnqBgfcn0!1121631113!1415190290627 siteAlias:flBxJhXSJmGtcZk68T3ZvnZfMJqBWW970vvQ6qMXRWKnGQlB9v3R!1121631113!1415190290628 Can you please explain why the first 7 characters being same would mess up the cardinality estimates. Is the column not indexed fully to the whole length. On Wed, Mar 11, 2015 at 4:01 AM, Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx > wrote: > > Given the nature of your query, the first thing to do is check the > execution plan. If the query can be made efficient by indexing then you > will need two indexes on the table, one on the X column to drive the > subquery and one on the XA column to be the target of the values identified > by the subquery. > > As it stands it's possible (if the XA index already exists) that the > tablescan is the outer select and the optimizer is using the subquery as a > filter subquery (or maybe semi-join) through an index on XA. > > If the table is only 1M rows, and you're adding 20 to 40 rows in a few > seconds, then you seem to be growing the table at the rate of around 1M > rows per week (ballpark) - so you must also have a process that is deleting > those rows; I'd guess that this table is probably keeping track of > web-connections in some way and that rapid access to this table is > important to the web-users; that being the case, and given the "near > uniqueness" of the column I think I would index it; if you're modified SQL > is close to true I'd index (x, xa) so that the subquery could drive through > an index range scan and avoid the table. The one special thing I would > consider is whether or not to create a reverse key index: when I say this, > I'm guessing that the table is a fairly fixed size with balanced inserts > and deletes, and I'm guessing that the X column might be a constructed > value that starts with a time or sequence-based component; if the latter > (particularly) is not true and the data values arrive in a completely > random order then reverse key won't be of any benefit; if mu guesses happen > to be right you'll be minimising the impact of bug in the handling of > leaf-block splits. ( > https://jonathanlewis.wordpress.com/2009/09/19/index-itls/ ) > > The fact that the column averages 85 to 90 characters is a little > undesirable - but not a disaster. On a related note, though, if many of > the values look similar across the first 6 or 7 characters the optimizer > could get really messed up with its cardinality estimates and produce > massive over-estimates of the expected row counts. > > > Regards > Jonathan Lewis > http://jonathanlewis.wordpress.com > @jloracle > > >> >> On 10.03.2015 17:52, Cee Pee wrote: >> >> >> It is a varchar2 column with max length of 150. I checked and the >> lengths of strings stored currently vary from 85 to 90 characters >> distributed evenly across the values in about 1M rows. The sql in modified >> form with table name and col name changed: >> >> *SELECT DISTINCT x FROM xa WHERE xa IN (SELECT xa FROM xa WHERE >> x IN ('123456789101112') ) * >> >> table name and col name in the predicates are same; the subquery selects >> from the same user.table as the main query. >> The same SQL is getting executed several hundreds of times during the >> peak hours. >> >> >>> >>> On 10.03.2015 12:22, Cee Pee wrote: >>> >>>> List >>>> >>>> I see a table being hit and queried a few thousand times over peak >>>> hours. The table has more than million rows and grows while operational. I >>>> did not measure the growth fully, but based on monitoring for several >>>> minutes, it is adding about 20 to 40 rows in few seconds, when i was >>>> checking via sqlplus. I dont see this growth most of the times though. It >>>> is a small table with 6 columns and the app in a web application. There is >>>> also another sql that is run constantly that accesses the rows based on one >>>> of the char columns which is 150 characters, but there is no index on the >>>> column which is causing a tablescan. Is this column a bad candidate for >>>> indexing. Any rule of thumb length for char columns above which adding >>>> index is considered moot? v11.2. >>>> >>>> CP. >>>> >>> >>> >>> -- >>> >>> >>> >>> >>> >>> --- >>> Diese E-Mail wurde von Avast Antivirus-Software auf Viren geprüft. >>> http://www.avast.com >>> >>> -- >>> //www.freelists.org/webpage/oracle-l >>> >>> >>> >> >> >> -- >> >> >> >> >> >> >> >> ------------------------------ >> [image: Avast logo] <http://www.avast.com/> >> >> Diese E-Mail wurde von Avast Antivirus-Software auf Viren geprüft. >> www.avast.com >> >> > > > -- > > > > > > > > ------------------------------ > [image: Avast logo] <http://www.avast.com/> > > Diese E-Mail wurde von Avast Antivirus-Software auf Viren geprüft. > www.avast.com > >