Re: Partitioning best practices

  • From: tboss@xxxxxxxxxxxxxxxxxx
  • To: anuragdba@xxxxxxxxx
  • Date: Fri, 1 Sep 2006 15:08:29 -0400 (EDT) 

Great Tom Gordon PPT presentation on data warehousing from a very physical
implementation standpoint in oracle.  

Generally you're going to partition for one of two reasons (or perhaps
a combination of both):
- administration; i.e., i need to be able to easily "offline" data after 
a period of time
- performance: i.e., my database is massive and i need to quickly find
the data I need from millions of rows.

If its option 1, then partition based on a convenient time period field 
located in your fact table.  If its option 2 though (most warehouses), then
the answer is driven by your reporting requirements.  What field is most
commonly found and used as a "filter" by end users?  Most of the time the
answer is pretty obvious once you think about how the data is being accessed.
And a LOT of the time the answer involves your time dimension.

Examples: Payroll data is usually a variation of the pay period, HR data can be
the HR "action date" or perhpas just the date of periodic HR status snapshots,
Billing information typically by billing period, sales by a region, etc.

The key though is to build all reports to incorporate the partitioning
scheme, for this is what allows "partition pruning" and quicker access to
your data.  Also ensure that any OLAP tools properly join the dimension
that controls the partiting field and properly presents that dimension as
a filter option to the user.  If these two key items are not followed, you'll
end up hash scanning every partition in your table for every query
and performance will be a nightmare.

Good luck.

> Hi All,
> can any of you suggest best practices for partitioning in a Datawarehousing
> environment.
> for example, how to implement partitioning on fact tables>
> My table has 150 million rows in it.
> Any useful links are always welcome.
> Thanks in advance,
> Best Rgds,
> Anurag

Other related posts: