RE: Hash to range partition

From: "Mark W. Farnham" <mwf@xxxxxxxx>
To: <l.flatz@xxxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
Date: Fri, 31 Jul 2020 12:16:06 -0400

... and ignore my previous advice. For packaged applications you need to do
things "their way" because you need to service their queries.

Lothar is prompting you to do a careful search to see whether a particular
approach might be useful.

Just remember as you do this that the vendors can upset your applecart at any
time. Now if this is a datawarehouse fed FROM a package and you control the
queries, you may still be in business to significantly improve performance.

IF you know the queries you need to service, that is an ideal starting point.

Good luck, and Lothar's test should still be done, I think.

mwf

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On ;
Behalf Of Lothar Flatz
Sent: Friday, July 31, 2020 3:31 AM
To: oracle-l@xxxxxxxxxxxxx
Subject: Re: Hash to range partition

Hi Ram,

ok, that looks like Peoplesoft. I am suprised that you do not mention effdt as
search criteria. It would be the usual procedure to use id and effdt combined
in the query. ("Find the most resent row of an id.") Well, in any way if that
would be the case we would have duplicates of the id.
Do we?
You will understand that this makes a major difference, thus we have to find
out first.
If effdt is used as I know it from peoplesoft, effdt would be the column to
range partition on.
If I get your thinking right, you want to seperate more recent from older rows.
That would help your buffer cache.
The id (higher means newer) is an indirect way to seperate new from old.
Effdt is straight forward.
However, if effdt is never used in queries you must not partition on it.
Run this query to find out (you must be sysdba):

select column_name, equality_preds, range_preds, like_preds, equijoin_preds
from ( select oo.name owner, o.name, c.name column_name, u.equality_preds,
u.equijoin_preds, u.nonequijoin_preds, u.range_preds, u.like_preds,
u.null_preds, u.timestamp from sys.col_usage$ u, sys.obj$ o, sys.user$ oo,
sys.col$ c where o.obj# = u.obj# and oo.user# = o.owner# and c.obj# = u.obj#
and c.col# = u.intcol#) dba_col_usage where owner='&owner'
AND NAME=upper('&table')
ORDER BY equality_preds+ range_preds+ like_preds desc;

Regards

Lothar

Am 31.07.2020 um 00:16 schrieb Ram Raman:

All,

We have a 1.4B row table that is hash partitioned by a key (ID). The
key is increasing monotonically and seems to be coming from a
sequence. The loads happen nightly and those are the only time the
table is inserted to; during the day time only SELECTs run against the
table.

The table has 4 indexes on it, including one on the ID column. None of
the indexes are partitioned (!)

table t:
--------

ID NUMBER(16)
...
EFFDT DATE
..

Unique Index on ID column. Two other single column indexes on 2 number
columns and one single column index on a date column.

Queries that run in the day time typically access past several months
of data from the table by the ID key mostly. It is accessed via other
columns as well, but that is less than 20% of the time. There are
about 10M rows per month and we have 32 hash partitions. No INSERTS in
the day time.

Since the load is happening nightly and there are only queries in the
day time, I am considering testing out range partitioning the table
rather than hash partitioning. It seems hash partitioning is
recommended for keys based on sequences to avoid index contention
during inserts. However, with the hash based approach the queries are
scanning all partitions having to go through all 1.4B rows for a few
months of records. If I go with range partitioning (one per month) on
the ID key and if the queries access past few months of data only, I
feel we can see a substantial performance improvement with queries
only having to visit a fraction of the partitions and rows.
I am also planning on creating a locally partitioned index on the ID
column and also partitioned indexes on the other 3 columns as well.
Not sure if it will increase the load time. One option for the indexes
is that the locally partitioned index will have the new partition
created as needed and the remaining 3 indexes dropped and rebuilt
nightly. Will this be faster than the index getting updated during the
daily loads?

Can the listers share their thoughts on moving to range partition and
indexes during load.
Ram.
--

--
//www.freelists.org/webpage/oracle-l

--
//www.freelists.org/webpage/oracle-l

Follow-Ups:
- Re: Hash to range partition
  - From: Ram Raman

References:
- Hash to range partition
  - From: Ram Raman
- Re: Hash to range partition
  - From: Lothar Flatz

RE: Hash to range partition

Other related posts: