RE: Lost appendix: Space Estimations for Schema Objects

  • From: "Leslie Tierstein" <Leslie.Tierstein@xxxxxxxxxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Wed, 16 Jun 2004 16:06:54 -0400

The issue is one of scale and history:

The estimated size of my extra-large objects can run very large: the
largest fact table at a customer site that has been operational for two
years is 48 gigs. And that customer is collecting weekly data. (It is data
for a retail supplier; weekly data is summarized by week, by store, by
item; daily data still summarizes transactions, but at the day level.) Our
new customer wants to collect daily data.

We're only doing this level of detail, BTW, for the main fact and summary
tables. The remaining tables are relatively small and a less detailed
estimate works just fine.

Historically, the previous attempt at Oracle sizing was a guesstimate
adapted (badly) from a SQL Server model, and it vastly overestimated the
storage required. The client got really peeved when they overbought on
hardware, and I'm trying to avoid a repetition of that. One cause of the
bad estimate could have been that the client overestimated the weekly
download volume: it's based on some percent of items being sold in some
percentage of stores in some particular week (sketched below). However,
this being a national retailer, the distribution of items varies widely,
based on both local (community) preferences and seasons. (One of the
clients manufactures lawn care products, so you can imagine ...) If that's
the case, I may get blamed for a bad estimate, but I can point to other
sources.

However, if my computations are way off and the client spends more $$ on
hardware than required, leaving them fewer $$ to spend on software and
consulting/services (i.e., me), then that's not good. And if the estimates
are way under, not only might they run out of storage space, but their
system might be underpowered, since we base the # of processors to put in
the server on the amount of data that needs to be processed and reported
on. (Can't do it on # of transactions, since, this being a DW/DSS, there
are very few actual users.)
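
As a sketch of that volume formula (the counts, percentages, and row size
below are made-up placeholders, not the client's numbers):

  -- Weekly volume = (% of items selling) x items x (% of stores) x stores
  --                 x bytes per summary row. All figures illustrative.
  SELECT 0.20 * 100000     -- 20% of a 100,000-item catalog sells in a week
         * 0.60 * 2000     -- in 60% of 2,000 stores
         * 120             -- ~120 bytes per week/store/item summary row
         / POWER(1024, 3)  -- bytes -> gigs
           AS est_gb_per_week
  FROM dual;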

Leslie

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx]On Behalf Of Daniel Fink
Sent: Wednesday, June 16, 2004 3:00 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: Re: Lost appendix: Space Estimations for Schema Objects


Leslie,

I have one question: why is such a level of detail needed?

In the past, for each schema or database, we defined 4 sizes of objects
(along the lines of autoallocate boundaries).

Small - 128k (or the value of db_block_size *
db_file_multiblock_read_count). This is also our baseline multiple.
Medium - 1m
Large - 64m
Extra-Large - 256m

We then reviewed the tables/indexes and "estimated" their size and put them
in the appropriate tablespace (lmt uniform size).
Me - So, how big do you *think* this table will be?
Them - Well...
Me - More than 256 megs?
Them - Oh, not that large.
Me - Less than 50 megs?
Them - Hmm..that sounds about right.
Me - Okay, that's a Large.
Them - But, that means we are not using 14 megs.
Me - That's okay. This gives us pre-allocated space if our estimate is
wrong. And 14 megs is not a big deal. After all, how much memory do you
have in your PDA? Kind of puts it in perspective.
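
For reference, a minimal sketch of the four uniform-extent tablespaces
(the names, datafile paths, and initial file sizes here are placeholders
of my own, not from the original setup):

  -- Locally managed tablespaces, uniform extents at each boundary.
  CREATE TABLESPACE dw_small DATAFILE '/u01/oradata/dw/dw_small01.dbf'
    SIZE 100M EXTENT MANAGEMENT LOCAL UNIFORM SIZE 128K;
  CREATE TABLESPACE dw_medium DATAFILE '/u01/oradata/dw/dw_medium01.dbf'
    SIZE 500M EXTENT MANAGEMENT LOCAL UNIFORM SIZE 1M;
  CREATE TABLESPACE dw_large DATAFILE '/u01/oradata/dw/dw_large01.dbf'
    SIZE 2G EXTENT MANAGEMENT LOCAL UNIFORM SIZE 64M;
  CREATE TABLESPACE dw_xlarge DATAFILE '/u01/oradata/dw/dw_xlarge01.dbf'
    SIZE 8G EXTENT MANAGEMENT LOCAL UNIFORM SIZE 256M;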

I've used this approach very successfully in the past 5 years and find that
my space wastage is less than 10% (even less on larger
databases).

The first time I used this approach, it took a lot of work with the
development staff to set aside the fine-grained sizing process they were
used to. Using this approach, we created 2 data warehouses of about 750g
each (back in 8.0.4 time). The only space-related issue in the 9 months I
was there was a load process that tried...and tried...and tried to process
a bad input file and filled up the normally small error_log table. The
next release tracked the number of errors on a particular file and stopped
trying to load it after a threshold had been met. Almost all of the other
databases supported by members of my group averaged 1 space-related
failure per week. Of course, I also worked myself out of the contract...
(Mogens, are you listening?)

Regards,
Daniel Fink

Leslie Tierstein wrote:
> Applying the formulas as documented in Oracle 8, and comparing the results
> to an actual Oracle 8i database (a data warehouse; I'm interested only in 8
> fact tables ranging in size from 10 million to 500 million rows; the 9i
> database is for a new client, and is the reason for this exercise) yields
> the following:
>
> - Table estimates for non-partitioned tables are within 10% of the actual
> (the estimates are uniformly lower than the actuals)
> - But the table estimates don't take into account partitioned tables, which
> are farther off
> - B-Tree index estimates were modified to reflect the changed size of the
> ROWID; most are still within 10% deviation from the actual
> - We're still looking at the bitmap indexes (determining cardinality) and
> the local partitioned indexes
>
> More problematic is actually determining how many rows are going to be in
> each fact table -- both at startup and at 6 month intervals.
>
> Unfortunately, we're not going to 10g any time soon, so I can't use the EM
> facility Lex mentioned.
>
> Leslie
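
(For anyone reaching for those formulas: a much-simplified sketch of that
style of table-size arithmetic. The documented version adds block and row
header overheads; every number below is illustrative.)

  -- blocks = rows / rows-per-block; rows-per-block from usable block space
  SELECT CEIL(500000000 /            -- 500 million rows
              FLOOR((8192 - 100)     -- ~usable bytes in an 8k block
                    * (1 - 10/100)   -- less PCTFREE 10
                    / 120))          -- avg row length ~120 bytes
         * 8192 / POWER(1024, 3) AS est_gb
  FROM dual;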

----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request@xxxxxxxxxxxxx
put 'unsubscribe' in the subject line.
--
Archives are at //www.freelists.org/archives/oracle-l/
FAQ is at //www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------
