ok, that looks like Peoplesoft. I am suprised that you do not mention effdt as search criteria. It would be the usual procedure to use id and effdt combined in the query. ("Find the most resent row of an id.")
Well, in any way if that would be the case we would have duplicates of the id.
You will understand that this makes a major difference, thus we have to find out first.
If effdt is used as I know it from peoplesoft, effdt would be the column to range partition on.
If I get your thinking right, you want to seperate more recent from older rows. That would help your buffer cache.
The id (higher means newer) is an indirect way to seperate new from old. Effdt is straight forward.
However, if effdt is never used in queries you must not partition on it.
Run this query to find out (you must be sysdba):
select column_name, equality_preds, range_preds, like_preds, equijoin_preds
select oo.name owner,
from sys.col_usage$ u,
where o.obj# = u.obj#
and oo.user# = o.owner#
and c.obj# = u.obj#
and c.col# = u.intcol#) dba_col_usage
ORDER BY equality_preds+ range_preds+ like_preds desc;
Am 31.07.2020 um 00:16 schrieb Ram Raman:
We have a 1.4B row table that is hash partitioned by a key (ID). The key is increasing monotonically and seems to be coming from a sequence. The loads happen nightly and those are the only time the table is inserted to; during the day time only SELECTs run against the table.
The table has 4 indexes on it, including one on the ID column. None of the indexes are partitioned (!)
Unique Index on ID column. Two other single column indexes on 2 number columns and one single column index on a date column.
Queries that run in the day time typically access past several months of data from the table by the ID key mostly. It is accessed via other columns as well, but that is less than 20% of the time. There are about 10M rows per month and we have 32 hash partitions. No INSERTS in the day time.
Since the load is happening nightly and there are only queries in the day time, I am considering testing out range partitioning the table rather than hash partitioning. It seems hash partitioning is recommended for keys based on sequences to avoid index contention during inserts. However, with the hash based approach the queries are scanning all partitions having to go through all 1.4B rows for a few months of records. If I go with range partitioning (one per month) on the ID key and if the queries access past few months of data only, I feel we can see a substantial performance improvement with queries only having to visit a fraction of the partitions and rows.
I am also planning on creating a locally partitioned index on the ID column and also partitioned indexes on the other 3 columns as well. Not sure if it will increase the load time. One option for the indexes is that the locally partitioned index will have the new partition created as needed and the remaining 3 indexes dropped and rebuilt nightly. Will this be faster than the index getting updated during the daily loads?
Can the listers share their thoughts on moving to range partition and indexes during load.