Not wrong, but remember what JL wrote about HOW you fill rows.
IF, for example, someone decided as a design issue to pin down a primary key
and then fill in all the attributes later (which I am NOT advocating but which
I have seen), then unless you do the Hakan factor setting dance you might take
up a lot more space.
So if, in your applications, not every column gets a value at least as long as
its final value at insert time, you might be subject to nasty row migration and
wasted space.
If you take your existing test and look at the way your applications actually
insert rows and update columns, you could rerun it as a series of inserts and
updates, filling in the columns that are not “born” with values at insert time
in a wave or waves, to simulate the reality of your application.
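As a rough, hypothetical sketch of that kind of simulation (the table and column
names here are placeholders, not anything from this thread):

-- Wave 1: rows "born" with only the key and a couple of columns populated
CREATE TABLE size_test AS SELECT * FROM your_new_table WHERE 1 = 0;

INSERT INTO size_test (pk_col, created_dt)
SELECT level, SYSDATE
FROM   dual
CONNECT BY level <= 100000;
COMMIT;

-- Wave 2: the remaining attributes arrive later as updates
UPDATE size_test
SET    status_col = 'COMPLETE',
       descr_col  = RPAD('x', 50, 'x');
COMMIT;

-- Compare space used against the "fully populated at insert" version of the test
SELECT blocks, bytes FROM user_segments WHERE segment_name = 'SIZE_TEST';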
Now that is oriented only toward a before-the-fact size projection. Usually I
use that sort of simulation to help urge folks toward having the initial insert
fully filled out, including using dummy plug values when the attribute is
logically unknowable until later. This is particularly useful for operational
stage or status columns and date-of-occurrence columns, as is using the Tim
Gorman “Scaling to Infinity” approach (the fastest update or delete is an
insert; read his or someone else’s papers on that approach).
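A hedged illustration of what a row “born” at (close to) its final length might
look like in the DDL - the column names and plug values below are made up purely
for the example:

CREATE TABLE orders_demo (
  order_id     NUMBER(15,0) NOT NULL,
  order_status VARCHAR2(20) DEFAULT 'NOT_YET_DETERMINED' NOT NULL,  -- plug for a status set later
  completed_dt DATE         DEFAULT DATE '9999-12-31'    NOT NULL   -- plug for a date unknown at insert
);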
Folks quite often underestimate the value of having a row “born” on insert at
its full length. If some columns rarely get a non-born-on value, that’s a
slippery slope and you might need histograms if a significant number of initial
“plug” values pile up.
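If one plug value does come to dominate a column, a histogram on that column is
the usual follow-up; a minimal sketch, reusing the made-up names above:

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'ORDERS_DEMO',
    method_opt => 'FOR COLUMNS ORDER_STATUS SIZE 254');  -- histogram on the skewed status column
END;
/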
If nearly all your blocks have 16 or 17 rows (for example), that will be stable
information for the CBO, and the lack of row migration means your block scanning
can take place as efficiently as possible.
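One way to check that stability on a test table (again using the placeholder
name from the earlier sketch) is to look at the distribution of rows per block:

SELECT rows_in_block, COUNT(*) AS blocks
FROM  (SELECT COUNT(*) AS rows_in_block
       FROM   size_test
       GROUP  BY DBMS_ROWID.ROWID_RELATIVE_FNO(rowid),
                 DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid))
GROUP  BY rows_in_block
ORDER  BY rows_in_block;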
I prefer to stack the deck in favor of the CBO easily getting very good plans
unless it puts you on a maintenance treadmill that exceeds the value.
Also, I didn’t see anything about compression in the thread (I may have missed
it).
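On Exadata, compression could change the arithmetic a lot; as a purely
illustrative sketch (names and compression level are placeholders, not a
recommendation):

CREATE TABLE size_test_hcc COMPRESS FOR QUERY HIGH
AS SELECT * FROM size_test;

SELECT segment_name, ROUND(bytes/1024/1024) AS mb
FROM   user_segments
WHERE  segment_name IN ('SIZE_TEST', 'SIZE_TEST_HCC');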
Good luck,
mwf
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On
Behalf Of Pap
Sent: Monday, February 22, 2021 3:12 AM
To: jack@xxxxxxxxxxxx
Cc: Mark W. Farnham; Jonathan Lewis; Oracle L
Subject: Re: Size estimation
Thank you all. I have created a sample table with all not-null values, with all
the columns filled to their maximum possible length, and inserted the same rows
~100K times, i.e. the total number of rows is 100K. I now see the size of the
full table is ~53MB, so for 100 million rows I am estimating it as ~51GB. For
the index on those two columns the size is coming out as ~4MB, so for 100
million rows it will be ~4GB. So the combined table and index size for a day's
worth of data holding ~100 million rows is ~55GB. Hope it's correct.
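For reference, segment sizes like those can be read with a query along these
lines (the segment names are placeholders):

SELECT segment_name, segment_type, ROUND(bytes/1024/1024, 1) AS mb
FROM   user_segments
WHERE  segment_name IN ('SAMPLE_TABLE', 'SAMPLE_TABLE_PK');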
And I was thinking the actual size may not be this much, so to get the near-real
size of the table/index the only way is to have the rows inserted the way they
will be when the actual business data gets inserted into the table. Please
correct me if I am wrong.
Regards
Pap
On Mon, Feb 22, 2021 at 6:19 AM Jack van Zanen <jack@xxxxxxxxxxxx> wrote:
Or
Simply create the table and indexes in a test environment and add 100K dummy
records, record the size and multiply to scale.
No need for maths, and it also fills the indexes, so you will know the sizes for
those as well :-)
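A minimal sketch of that approach, with made-up table, column, and index names:

-- Load 100K generated dummy rows at the expected column lengths
INSERT /*+ APPEND */ INTO my_new_table
SELECT level, RPAD('x', 50, 'x'), RPAD('y', 40, 'y'), SYSDATE
FROM   dual
CONNECT BY level <= 100000;
COMMIT;

-- Measure, then multiply by 1000 to project 100M rows
SELECT segment_name, ROUND(bytes/1024/1024, 1) AS mb
FROM   user_segments
WHERE  segment_name IN ('MY_NEW_TABLE', 'MY_NEW_TABLE_PK');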
Jack van Zanen
On Mon, Feb 22, 2021 at 9:35 AM Mark W. Farnham <mwf@xxxxxxxx> wrote:
What JL wrote, and you did only ask about the size for the table.
BUT, since you marked a primary key, which is almost certainly supported by an
index, and you may have additional indexes, you'll need to tack space for
indexes on to get total storage requirements.
mwf
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On
Behalf Of Jonathan Lewis
Sent: Sunday, February 21, 2021 1:29 PM
To: Oracle L
Subject: Re: Size estimation
The number(15,0) will take at most 9 bytes
The number(13,0) will take at most 8 bytes each
So your estimate should be 496 - 13 - 14 - 14 = 455
Then you need to add one byte per column to get 471.
Then you have to allow for block size, which means 8,066 bytes available from
an 8KB block size with pctfree 0, initrans 2 (default) and ASSM
Max rows = trunc(8066 / 471) = 17 rows per block,
At 100M rows that's 5,882,353 data blocks.
If you create the table using a large extent size (8MB minimum) you get 1 bitmap
block for every 128 blocks allocated so your block requirement goes up by
128/127,
so a total of 5,928,671 blocks. Round that up to the nearest 64MB (assumed
extent size) - 5,931,008 blocks = 45.25GB.
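For anyone who wants to reproduce that arithmetic, a quick query against dual
(same assumptions: 8KB blocks, pctfree 0, ASSM, 471-byte rows):

SELECT TRUNC(8066 / 471)                                   AS rows_per_block,     -- 17
       CEIL(100000000 / TRUNC(8066 / 471))                 AS data_blocks,        -- 5,882,353
       CEIL(CEIL(100000000 / TRUNC(8066 / 471)) * 128/127) AS with_bitmap_blocks  -- 5,928,671
FROM   dual;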
So even with several errors on the way you got pretty close to the "right"
answer.
Realistically, though, you're unlikely to fill all those 40 and 50 character
columns, and unless you're very careful with setting pctfree (and maybe playing
around with the Hakan factor) you're probably going to get too many rows into a
block on the initial insert and then run into problems with row migration.
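For what it's worth, the two knobs mentioned there look roughly like this - the
table name and the pctfree value are placeholders, not recommendations:

ALTER TABLE my_new_table PCTFREE 20;   -- leave room in each block for rows to grow in place

-- One documented way to influence the Hakan factor: cap rows per block at the
-- current maximum, run while the table holds representatively sized rows
ALTER TABLE my_new_table MINIMIZE RECORDS_PER_BLOCK;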
There's also the question of multi-byte character sets - are you treating your
varchar2(N) declarations as N bytes (the default assumption) or N characters
(which, depending on the character set, could mean up to 4N bytes)?
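To make the difference concrete (AL32UTF8 assumed purely for illustration):

CREATE TABLE semantics_demo (
  c_bytes VARCHAR2(50 BYTE),   -- 50 bytes: the default assumption
  c_chars VARCHAR2(50 CHAR)    -- 50 characters: up to 200 bytes in AL32UTF8
);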
Regards
Jonathan Lewis
On Sun, 21 Feb 2021 at 17:03, Pap <oracle.developer35@xxxxxxxxx> wrote:
Hi Listers, it's Oracle RDBMS version 11.2.0.4 on Exadata. We have a table with
the structure below which is going to be created as part of a new project, and
we want to predict the storage/space requirement for it. It may not be the exact
size, but at least we want to estimate the AVG and MAXIMUM space requirement for
the table, assuming all the columns are filled with not-null values at the
maximum column length.
So to estimate the maximum space requirement, is it correct to just add up the
column lengths in bytes and multiply by the projected number of rows? Something
as below.
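As a cross-check on that kind of hand calculation, Oracle's
DBMS_SPACE.CREATE_TABLE_COST will project a size from an average row size and a
row count; a minimal sketch, where the tablespace name, the 471-byte average row
size, and the pct_free value are placeholder assumptions:

SET SERVEROUTPUT ON
DECLARE
  l_used_bytes  NUMBER;
  l_alloc_bytes NUMBER;
BEGIN
  DBMS_SPACE.CREATE_TABLE_COST(
    tablespace_name => 'USERS',     -- placeholder tablespace
    avg_row_size    => 471,
    row_count       => 100000000,
    pct_free        => 0,
    used_bytes      => l_used_bytes,
    alloc_bytes     => l_alloc_bytes);
  DBMS_OUTPUT.PUT_LINE('used  GB: ' || ROUND(l_used_bytes  / POWER(1024, 3), 2));
  DBMS_OUTPUT.PUT_LINE('alloc GB: ' || ROUND(l_alloc_bytes / POWER(1024, 3), 2));
END;
/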