FW: concepts document part about separating indexes and tablespaces

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Mon, 19 May 2008 09:30:03 -0400

<snipped> was too big.

 

  _____  

From: Mark W. Farnham [mailto:mwf@xxxxxxxx] 
Sent: Monday, May 19, 2008 9:07 AM
To: 'patrick.roozen@xxxxxxxxx'; 'mary_mcneely@xxxxxxxxx'
Cc: 'oracle-l@xxxxxxxxxxxxx'
Subject: RE: concepts document part about separating indexes and tablespaces

 

Some flies for your ointment:

 

1)       Are all your applications equally important with regard to uptime?
If you have one application that is more important to be up, you need to ask
yourself whether you can rebuild the indexes for that application quicker
than you can restore all the tables and indexes. If you don't know the
answer to that question, then you have decided that thinking about improving
beyond your mean time to recover media and roll forward is uneconomic for
your site for that particular database. That may indeed be a valid point of
view. I have yet to be informed of a site that can justify that stance and
is also hurt in any way by separation of data from indexes. (Notice I wrote
data: that includes all variety of tables (including index oriented TABLEs)
and clusters, and user defined objects that are the source of data.

2)       I find it amusing that the single threaded analysis of an index
range scan is conflated with systems being multi-user in a clear
demonstration that an individual index range scan requiring table access can
only be faster in a carefully constructed case such as a table's data being
physically ordered in the table by the most frequently accessed index path
with the index being too big to cache. Let's grant Richard's contention that
it is unlikely you'll win on performance of an individual index range scan
query requiring table access. Now consider dozens of users with queries
completely satisfied by indexes not competing with i/o for a full table scan
in another tablespace. Unless you're going to argue that a full table scan
is never the right thing to do and that none of your queries are satisfied
from an index alone, I believe that means you sometimes get a win from
separation of data and indexes, because statisitcally you will have less
seek. If your underlying disk farm is SAME, you'll minimize that win and
protect yourself from the possibility of unbalanced i/o generating
sufficient queueing to degrade performance. If you have multiple independent
units of i/o to which you can assign separation of batch oriented operation
(BORING), then your i/o may indeed be unbalanced, but you also have the
opportunity to minimize seek and maximize the throughput of parallel
operations processing (not to be confused with parallel degree sql
statements). It also becomes your job to avoid overallocation of i/o demand
to a particular independent unit of i/o. I've never found that very hard
unless the underlying storage farm construction was HAPHAZARD.

3)       Dirty blocks are batched for write. Think about what happens in
bursts of inserts.

4)       In many systems the total i/o to indexes is greater than the total
i/o to tables. Please don't come back with that straw man; indeed you may
need to use more i/o capability for indexes than for data. Your mileage may
vary.

5)       I think the interesting question Richard raises is whether adding
i/o capacity is better done by statistically adding the i/o bandwidth to all
i/o versus adding it in some other way. On that I believe your mileage will
vary. Other persons contend that i/o balance is identical to maximum
throughput, so adding i/o bandwidth statistically in a statmux SAME fashion
will always win. You can only execute that experiment if you have
independent units of i/o configured.

 

Regards,

 

mwf

PS: as for finding support for any stance in the Oracle documents, they do
indeed call for SAME as a best practice, which effectively statistically
spreads everything on the underlying disk farm.

 

<snip>

 

Other related posts: