Re: os cache vs. db cache

  • From: "Christo Kutrovsky" <kutrovsky.oracle@xxxxxxxxx>
  • To: robyn.sands@xxxxxxxxx
  • Date: Wed, 19 Dec 2007 12:42:07 -0500

Joining a bit late into this converation.

Here's my opinion on OS vs Oracle cache vs PGA target. Strictly with
linux in mind.

I would *always* go for the Oracle cache. The reason for this is not
so much the more efficient hot/cold management.
- The Oracle cache is more efficient. Using the Oracle cache reduces
your CPU usage (your *licensed* cpu) as well as is more concurent then
the OS cache. The Oracle cache can be used on the spot, there is no
extra work associated, just use it.
- I can monitor the Oracle cache, i can see which objects are
consuming it, how much dirty data and etc, all on a very granual level
- I get to have *real* disk responce time statistics. As opposed to a
blended rate due to OS cache
- I have more control of what goes into cache and when. I have control
how fast to write out dirty data to disk.

Same goes for the PGA. I would always give more PGA then OS cache. Of
course, there's the special case where each "work area" can be at most
4gb (even on 64 bit) but that can be worked arround using parallelism,
where each parallel process can get it's 4gb.
- I get to see which areas are taking up space, which areas spilled to
disk, which queries are cached which are not.
- Perhaps I can the the ones that are causing disk activity - you
loose all this info with a blended OS cache.
- It's more efficient, no extra CPU work required
- It avoids double buffering - you will always have double buffering
if you rely on OS cache.
- The last thing you want on a CPU intensive task is more CPU to be used.

Now mind you not, the default settings for PGA and sort/hash related
writes are very unfriendly for the datawarehouse, but tunning these
you can achieve some exceptional performance. Up to 2x on a going to
TEMP simply by tunning the IO patterns. SAN cache or not.

I just cant see the case where the OS cache would be more advantegous.

Christo Kutrovsky
DBA Team Lead
The Pythian Group


On 7/10/07, Robyn <robyn.sands@xxxxxxxxx> wrote:
> Jonathan,
>
> Thank you very much for your reply.  One of the systems in question is a
> warehouse and the issues you mention are very relevant.  This database has
> been around since '97 or so, and the range of queries and reports is huge,
> so of course, the range of plans and methods for data access are varied as
> well. I have been working on tuning some of the key queries, and I am seeing
> more queries use direct access reads, but it is not yet the primarily access
> method.  However, I'm thinking the best approach for this database is to
> continue tuning for more parallelization and more direct reads, and perhaps
> increasing the os cache and decreasing the database cache will be effective
> after we've made more progress on that front.  It's certainly something I
> hope to see although I've got quite a bit of work to do first.
>
> However, I'm just getting starting on moving warehouse data into partitioned
> tables and indexes (just got licensing for this option recently) so again we
> don't currently have a lot partition index reads now but they should be
> increasing soon.  If partitioned indexes continued to use the buffer cache,
> I suppose this could change the mix as well.  Either way, I've got a lot of
> work to do first ...
>
> Our temporary space is already on a cached files system although I expect
> that was by accident, so I'm glad to hear that we may be benefiting from
> this configuration.  I'll check into the PGA aggregate target suggestion;
> hashing and sorting is a problem area for several of the queries I'm
> concerned about this week and I'd noticed a decline in PGA cache ratio from
> where it was about 3 months ago.  We have a year end closing coming up and
> our revenue data will be extremely active in the next few weeks.
>
> Thank you again ...
>
> Robyn
>
>
> On 7/10/07, Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > Robyn,
> >
> > Since you're running a data warehouse, you may want to consider
> > the effects of parallel tablescans and parallel index fast full scans,
> > which do direct path reads and bypass the buffer cache.
> >
> > Of course, there are parallel execution paths that still USE
> > the buffer cache (parallel nested loop joins with indexed access,
> > access through partitioned indexes etc.) but if a very large
> > percentage of the work you do is bypassing the Oracle cache,
> > you may want to switch memory to the file system cache instead
> > to try and keep some important objects "close to" the action.
> >
> > Alternatively, you might think about putting the temporary tablespace
> > onto a filesystem that has caching enabled so that blocks written to the
> > temp tablespace are in the cache (and not on disc) when you start to
> > re-read them.
> >
> > Yet another option is to keep the caching low - depending on a SAN
> > cache (for exanple) to help you with the TEMP space - and make as
> > much memory as you can available for PGA activity so that you minimise
> > the number of cases where sorts or hash joins spill to disc.
> >
> > It's not a clear cut choice, unfortunately - often it's a question of
> observing
> > the current activity and making the best guess about how moving memory
> > around will benefit the critical processes.
> >
> >
> >
> >
> > Regards
> >
> > Jonathan Lewis
> > http://jonathanlewis.wordpress.com
> >
> > Author: Cost Based Oracle: Fundamentals
> > http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html
> >
> > The Co-operative Oracle Users' FAQ
> > http://www.jlcomp.demon.co.uk/faq/ind_faq.html
> >
> >
> > ----- Original Message -----
> > > Date: Mon, 9 Jul 2007 14:03:42 -0400
> > > From: Robyn <robyn.sands@xxxxxxxxx >
> > > Subject: os cache vs. db cache
> > >
> > > Hello everyone,
> > > Over the years, I've read differing opinions on balancing the os cache
> and
> > > the database cache.  Seems like I remember there were some advocates for
> > > minimizing the os cache and maximizing the database cache to allow
> Oracle to
> > > better determine which blocks should truly be cached and avoid necessary
> > > blocks being cached because they were in proximity to hot data.  I also
> have
> > > heard opposite opinions.
> > >
> > > I'm specifically focused on performance for a data warehouse.  Has
> anyone
> > > tested performance with a minimized OS cache and a maximized database
> > > cache?  If so, what were the results?  Does anyone have a really good
> paper
> > > or book on the topic?  I'm trying to collect some information before I
> start
> > > discussions with the platforms guys.
> > >
> > > tia ... Robyn
> > >
> >
> > --
> > //www.freelists.org/webpage/oracle-l
> >
> >
> >
>
>


-- 
Christo Kutrovsky
Senior Database/System Administrator
The Pythian Group - www.pythian.com
I blog at http://www.pythian.com/blogs/
--
//www.freelists.org/webpage/oracle-l


Other related posts: