Re: Building fact tables in a data warehouse
- From: Job Miller <jobmiller@xxxxxxxxx>
- To: tomdaytwo@xxxxxxxxx, oracle-l@xxxxxxxxxxxxx
- Date: Fri, 21 Dec 2007 18:24:44 -0800 (PST)
Chapter 6, p.215 of the Data Warehouse ETL Toolkit talk about the common
approach for the "Surrogate Key Pipeline" and presents a diagram basically what
you describe.
The gist of it is to ensure that all records move through successive lookups in
parallel, without going to disk, and are reading from cached lookup tables that
contain the most recent dimension values. Ideally, you don't have to use the
entire dimension, because much of the dimension contains dated data..
however, if you have late arriving data, that may invalidate your ability to do
all lookups against smaller current lookup tables vs. doing the lookup against
the full current dimensions.
it goes on to say that you can do it all in one big star join query, (with
outer joins if you are concerned some dimensional values may not match), but
that that approach isn't the most efficient..
Thomas Day <tomdaytwo@xxxxxxxxx> wrote: I'm just wondering if anyone has any
insights on how to speed this up.
Our fact tables consist of the primary keys of the involved dimension tables
plus numeric count columns.
To build our fact table, let's call it Z (as I understand the process) we
grab the primary key of the row in dimension A plus information which
we can use to find the associated row in dimension B. We go to dimension B,
grab the primary key plus information which allows us to
find the associated row in dimension C, and so on through 8 dimension tables.
This gives us one row in the fact table.
This worked OK for our ETL developers when we were dealing with 10,000 rows in
the development database.
Now we're working with a much larger source set and we're talking 100,000,000
rows. It doesn't work that well. Basically, each row in the fact table
requires full a index scan and a rowid fetch from each of the dimension tables.
Does anybody have experience or even a theoretical insight into a better way
to do this?
Thanks
---------------------------------
Never miss a thing. Make Yahoo your homepage.
- References:
- Building fact tables in a data warehouse
- From: Thomas Day
Other related posts:
- » Building fact tables in a data warehouse
- » RE: Building fact tables in a data warehouse
- » Re: Building fact tables in a data warehouse
- » Re: Building fact tables in a data warehouse
- » Re: Building fact tables in a data warehouse
- Building fact tables in a data warehouse
- From: Thomas Day