Mythical space savings of IOTs
- From: Bill Coulam <bcoulam@xxxxxxxxx>
- To: "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
- Date: Fri, 8 Jul 2005 14:57:03 -0500
Most documents on IOTs allude to space savings that can be realized by
eliminating the extra storage required by columns in both the table
and the PK. They hint at more savings that can be achieved by using
index compression. I have not been able to approach the space savings
I was hoping for and was wondering if anyone would care to comment on
their experiences with IOTs being 1 to 3X bigger than the object they
were meant to replace.
I've used IOTs in the past on small, PK-driven lookup tables, but
never on really big tables.
Background:
In our industry, we have some entities that contain what amounts to
order line items, and order line item details . The details table
(called components) is in billions of rows (terabtyes) at our clients.
The components table is modeled in a "vertical" fashion. Each row is
composed of about 7 fields that make up the unique key, followed by a
text, numerical or date value, followed by a handful of optional,
nullable columns. This allows our application to be flexible as to the
number and type of components allowed for each line item, which in
turn allows us to resell the app within different markets with
different business rules without modification.
On a 1Million row CTAS sample of the components table, the table
required about 59MB (from user_segments), and the composite UK
required about 50MB (the UK is almost all the rows in the table).
There's another global, single column index that takes up another
18MB. So 127MB in all.
Since our largest client used to have an in-house,
specific-to-their-company, "flat" component structure that took up
much less room (where each component is another column on a single
line item row), they are demanding a 60-80% decrease in storage space
required by our app.
We've looked at some other improvements that should net us about 27%
savings. But the big win I was hoping for was rebuilding the table as
an IOT so that 1M rows would only take up about 55MB, instead of 127MB
(the 18MB global index would be moot and dropped for the IOT).
After playing with everything I can under the ORGANIZATION INDEX
syntax, the best I've been able to do is an IOT that says it is using
88MB. So I've hit 31% savings. Not quite what I was hoping. Doing
ANALYZE INDEX...VALIDATE STRUCTURE was key for me in that it I wasn't
able to get the IOT under 128MB until I used the OPT_CMPR_COUNT value
in index_stats to lower the index key compression. That's when I
finally dropped to 88MB. Nothing else I've tried, from PCTFREE,
PCTTHRESHOLD, INCLUDING, using /*+ APPEND */ to load vs. not, and
PCTFREE and PCTUSED on the empty overflow segment have yielded any
further savings. I noticed in index_stats, that my IOT is only using
47% of the BTREE_SPACE allocated to the PK/IOT segment. Why? I'm not
that familiar with block internals, so feel free to school me. If
there were a way to make the index use most of each index block, I'd
be down to 40MB, which is exactly where I want to be.
If it's helpful, here is an altered version of the DDL I'm using in my tests:
CREATE TABLE our_component_table
(
line_id NUMBER NOT NULL,
component_id NUMBER NOT NULL,
point_id NUMBER NOT NULL,
counterpoint_id NUMBER,
start_date DATE NOT NULL,
end_date DATE NOT NULL,
seldom_used_val VARCHAR2(100),
num_val NUMBER,
txt_val VARCHAR2(30),
dt_val DATE,
optional_col_1 NUMBER, -- NULL at most clients
optional_col_2 VARCHAR2(100), -- NULL at most clients
parent_point_id NUMBER, -- NULL at most clients
optional_col_3 VARCHAR2(50), -- NULL at most clients
optional_col_4 NUMBER, -- NULL at most clients
CONSTRAINT nmsc_pk PRIMARY KEY (line_id, component_id, start_date,
point_id, counterpoint_id, seldom_used_val, optional_col_1,
optional_col_2, optional_col_3, optional_col_4)
)
ORGANIZATION INDEX
PCTFREE 5
PCTTHRESHOLD 5
COMPRESS 3
TABLESPACE our_tablespace
;
On the real table, it is partitioned by month on start_date, and for
this test, I'm using a client-specific version of the large UK to
eliminate the unused columns, which is why I can get 40MB of
USED_SPACE instead of the expected 50-55MB.
My biggest complaint is that simply switching from a regular table to
an IOT should have saved me at least the redundant space used by the
table, minus a little overhead for logical ROWID, etc. Instead, with
no key compression and no other indexes but the large PK, adding
ORGANIZATION INDEX to the same DDL, yielded a segment that was over
twice as large as the original heap table OR the original B-TREE UK
alone. That's not what I call eliminating the redundant data storage
problem.
Thoughts?
--
bill coulam
bcoulam@xxxxxxxxx
--
http://www.freelists.org/webpage/oracle-l
- Follow-Ups:
- Re: Mythical space savings of IOTs
- From: Bill Coulam
Other related posts:
- » Mythical space savings of IOTs
- » RE: Mythical space savings of IOTs
- » Re: Mythical space savings of IOTs
- » Re: Mythical space savings of IOTs
- » Re: Mythical space savings of IOTs
- » Re: Mythical space savings of IOTs
- Re: Mythical space savings of IOTs
- From: Bill Coulam