RE: FW: concepts document part about separating indexes and tablespaces

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: <jkstill@xxxxxxxxx>
  • Date: Mon, 19 May 2008 15:29:19 -0400

I seem to be failing to enunciate my point clearly. Balance is *not* the
goal. Balance is data about the relative rates of throughput on various
devices.

 

Maximizing throughput at minimum cost is likely a part of your goal.

 

SAME is a relatively cheap way to nearly guarantee you won't have imbalance
sufficient to slow things down. It also nearly eliminates possibilities for
independent units of i/o to serve independent sources of i/o demand in a way
that reduces the service time of i/o.

 

BORING is a slightly more expensive way to preserve the independence of
units of i/o (if you have them to begin with) that makes elimination of hot
spots sufficient to reduce throughput pretty trivial if they happen and
unlikely in the first place.

 

Logical separation of objects with significant i/o into different
tablespaces facilitates capitalizing on i/o patterns that are known or
develop at the relatively low cost of moving files, rather than reloading
data.

 

It is enough to know occasional predominant access patterns to get a big win
from time to time with a quite low risk of a loss ever. Nothing approaching
a perfect world is required. Just knowing a little something about the i/o
signature of an application is often enough.

 

Logical separation on SAME hurts nothing, while logical separation on BORING
may from time to time produce a big win at low cost. Logical separation by
object type and size is usually a very good stand-in for preserving the
ability to cheaply improve throughput if you later discover a predominant
i/o pattern that is worth exploiting, even if you have only a single spindle
right now.

 

On the other hand, if you have a large existing database and someone is
trying to advocate rebuilding it merely to separate things by type, that is
a silly and wasteful idea. Since you have a running database you should be
able to observe if there are patterns of i/o worth going to the cost of
re-organization to exploit, and likely only a few objects (if any) will
justify the after the fact separation. And then also only if you have
independent units of underlying i/o on which to place the hot segments.

 

I expect some imbalance at maximum throughput. Almost certainly that means
that the hottest object you have is not spread on all the devices you own,
so the devices on which it resides may indeed be hotter than average. The
question is whether the service time and overall throughput is improved.

 

While severe imbalance can be a problem that frustrates the goal, balance
itself is not the goal (although it may appear as a performance metric on
the storage managers' review!)

 

 

  _____  

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Jared Still
Sent: Monday, May 19, 2008 12:37 PM
To: mwf@xxxxxxxx
Cc: oracle-l@xxxxxxxxxxxxx
Subject: Re: FW: concepts document part about separating indexes and
tablespaces

 

 

O

... Let's grant Richard's contention that it is unlikely you'll win on
performance of an individual index range scan query requiring table access.
Now consider dozens of users with queries completely satisfied by indexes
not competing with i/o for a full table scan in another tablespace. Unless
you're going to argue that a full table scan is never the right thing to do
and that none of your queries are satisfied from an index alone, I believe
that means you sometimes get a win from separation of data and indexes,
because statisitcally you will have less seek. If your underlying disk farm
is SAME, you'll minimize that win and protect yourself from the possibility
of unbalanced i/o generating sufficient queueing to degrade performance. ...
PS: as for finding support for any stance in the Oracle documents, they do
indeed call for SAME as a best practice, which effectively statistically
spreads everything on the underlying disk farm.


In a perfect world we might be able to design a system based on known access

patterns, and it will yield the most balanced IO possible.

SAME just eliminates a lot a problems in an imperfect world.

* few people have time for the type of analysis required to know all access
patterns,
  or even a significant portion of them.

* access patterns will change with changes in data.  These could be due to
new lines
  of business, acquisitions, divestitures, new reporting systems that query
the data 
  differently, application upgrades, ...

* the analysis ultimately fails when simultaneous sessions are exercising
different access patterns.


-- 
Jared Still
Certifiable Oracle DBA and Part Time Perl Evangelist

Other related posts: