Re: Looking for table design input

  • From: Toon Koppelaars <toon.koppelaars@xxxxxxxxxxx>
  • To: genegurevich@xxxxxxxxxxxx
  • Date: Fri, 27 Mar 2009 22:15:11 +0100

[had a few offline clarifications with Gene]


I'd consider changing the three code columns into one column, just to save
space (by minimizing the row-length) on your huge table.
Like this:

create table CODE_COMBO_TABLE  -- Holds all valid combinations of code1/2/3.
(ID        number
,CODE1        ...
,CODE2        ...
,CODE3        ...
,primary key (ID)
,unique (CODE1,CODE2,CODE3)
)
/


YEAR/MONTH/HALF would be required to remain in the main table, as they drive
your partitioning scheme.
ID also, as that is your main filter criteria in queries on this table.

create table COUNTS_TABLE
(ID          number        not null FK references ....
,YEAR        number(4,0)    not null
,MONTH       number(2,0)    not null check(MONTH between 1 and 12)
,HALF        number(1,0)    not null check(HALF in (1,2))
,CODE_COMBO_ID number        not null references CODE_COMBO_TABLE(ID)
,COUNT#      number
,primary key (YEAR,MONTH,HALF,ID,CODE_COMBO)
)
/

Note the order in the PK...

Then partition the COUNTS_TABLE such that every partition holds only one
combination of Year, Month, Half.
You would want the PK index to be a local index, this is why I start it with
the three-columns that are driving the partioning scheme. Queries with
'WHERE ID=:bindvalue' should perform without any problems, since each local
index effectively starts with the ID-column.

The required (hash) join to get the actual CODE1/2/3 values for every query
now, is a bet I'm willing to take.
But I do not know how critical your requirements are.

Btw.
Your proposed setup

ID -  Code1 - Code2 - Code3 - Count1 - Count2 - ... Count15

Would require massive updates every half month (shifting 14 of the 15 values
to the right).
I'd not be in favor of such a design.

Toon


On Fri, Mar 27, 2009 at 4:13 PM, <genegurevich@xxxxxxxxxxxx> wrote:

> Hello everybody:
>
> I am looking for some input on a table design (oracle 10.2.0.3). Here is
> what I know:
>
> - the table will probably have several hundreds of millions of rows (I am
> looking for a better estimate from my customers)
> - it will store the counts per ID pre date per code for each of the ID.
> Something like
>
> ID - Year - Month - Day - Code1 - Code2 - Code3 - Count
>
> - We will have a primary key on all the columns except for the count
> - The data will be stored for the 15 half-month periods (the Day will be
> either 1 or 15)
> - The data will be selected based on the ID and the users will need to get
> all the data
> - I will partition the table by Year/Month/Day to make purging easier.
>
> What I wonder is whether the following design (which I do not like very
> much) may give a better performance:
>
> ID -  Code1 - Code2 - Code3 - Count1 - Count2 - ... Count15
>
> Because it may potentially return less rows and therefore require less IO.
>
> Any thoughts?
>
> thank you
>
> Gene Gurevich
>
>
> Please consider the environment before printing this email.
>
>
> --
> //www.freelists.org/webpage/oracle-l
>
>
>


-- 
Toon Koppelaars
RuleGen BV
+31-615907269
Toon.Koppelaars@xxxxxxxxxxx
www.RuleGen.com
thehelsinkideclaration.blogspot.com

(co)Author: "Applied Mathematics for Database Professionals"
www.RuleGen.com/pls/apex/f?p=14265:13

Other related posts: