Re: linux memory: limiting filesystem caching

Zhu,

How did you measure the performance differences with and without
bigpages? What kind of tests did you run? Were they CPU- or disk
IO-bound? Do you have statistics on CPU usage in "user" mode versus
"system" mode?

We have at least three cases where converting a production system to
bigpages/hugepages was the difference between a hung system and a
well-performing one.

As for the memory savings, what counter-arguments do you have? Keep in
mind that PTEs are allocated as needed, so the full page-table cost
shows up when your sessions touch the entire SGA, which often happens,
and is paid again each time they reconnect, which they do often.
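
A quick way to see the aggregate cost is /proc/meminfo, on kernels
that report it (an assumption on my part: the RHEL 3 2.4.21-based
kernels show a PageTables line, the older 2.4.9-e ones may not):

    $ grep PageTables /proc/meminfo
    PageTables:    2097152 kB    <- hypothetical worst case: ~2 GB of
                                    page tables for 500 sessions each
                                    mapping a 2 GB SGA in 4 KB pages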

I agree that it doesn't sound reasonable; hence the hugepages solution.


-- 
Christo Kutrovsky
Database/System Administrator
The Pythian Group



On 7/13/05, zhu chao <zhuchao@xxxxxxxxx> wrote:
> It is nice to hear of people using bigpages/hugepages.
> 
> I tried it on my production RAC box before (it is gone now), and
> actually I did not see much performance gain. I installed my RAC
> without hugepages/bigpages (bigpages, since I was running 2.1 AS),
> then converted one node to bigpages and left the other unchanged:
> there was no significant change in CPU usage or VM usage. Later I
> converted both nodes to bigpages; still no significant performance
> change.
> 
> The servers were loaded (4-CPU boxes, load average 2-3 most of the time).
> 
> As for your claim that bigpages save that much memory, I don't have
> a counter-theory, but using 2GB of kernel memory to manage 2GB does
> not sound reasonable. Maybe other Linux experts can give their
> opinions.
> 
> 
> On 7/13/05, Christo Kutrovsky <kutrovsky.oracle@xxxxxxxxx> wrote:
> > Hello Teehan,
> >
> > You don't mention which RH version you have, so I will assume 3.0
> > Advanced Server.
> >
> > As zhu chao mentioned, /proc/sys/vm/pagecache is the parameter you need.
> >
> > My recommendation is to give at most 50% to file caching.
> >
> > vi /etc/sysctl.conf
> >
> > and add:
> >
> > vm.pagecache=10 50 50
> >
> > (the three values are the minimum, borrow, and maximum percentages
> > of RAM that the page cache may use)
> >
> > then run "sysctl -p" to apply it immediately; since the entry is in
> > /etc/sysctl.conf, it will also be in effect after the next boot.
> >
> > You can monitor with "vmstat 2" in another session whether the
> > memory reported under "cache" drops and "free" rises.
> >
> > In addition, on Linux you should always use hugepages (hugetlbpool)
> > for Oracle. That way Oracle's memory is locked in physical RAM and
> > is almost invisible to the Linux memory manager, which reduces
> > memory management cost.
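> >
> > A minimal sketch of the setup, assuming RHEL 3's vm.hugetlb_pool
> > parameter (sized in MB; 2.6 kernels use vm.nr_hugepages instead,
> > sized in pages, and the old 2.4.9-e AS 2.1 kernels take a bigpages=
> > boot parameter) and a 2 GB SGA:
> >
> >     # in /etc/sysctl.conf
> >     vm.hugetlb_pool = 2048       # reserve 2048 MB worth of 2 MB pages
> >
> >     # then apply and verify; a reboot may be needed to find enough
> >     # contiguous memory on a box that has been up for a while
> >     sysctl -p
> >     grep -i huge /proc/meminfo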
> >
> > The big pages (2 MB chunks) will also reduce your overall memory
> > usage, especially if you have a lot of sessions. Some simple math:
> >
> > An SGA of 2 GB in 4 KB pages = 524288 pages * 8 bytes per page
> > table entry (I think) = 4 MB per process for the page tables. If
> > you have 500 sessions, that's 500 * 4 MB = 2 GB of memory to manage
> > 2 GB of memory.
> >
> > Compare that to an SGA of 2 GB in 2 MB pages = 1024 pages * 8 bytes
> > = 8 KB per process. For the same 500 sessions you will use 4 MB of
> > memory to manage 2 GB of memory. A significant improvement.
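> >
> > You can sanity-check that arithmetic in bash (pure integer math,
> > nothing system-specific):
> >
> >     $ echo $(( 2 * 1024**3 / 4096 * 8 ))            # 4 KB pages
> >     4194304                                          # = 4 MB of PTEs per process
> >     $ echo $(( 2 * 1024**3 / (2 * 1024**2) * 8 ))   # 2 MB pages
> >     8192                                             # = 8 KB of PTEs per process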
> >
> > Also, the CPU has only so many entries in its virtual-to-physical
> > translation cache (the TLB), so having that many fewer pages will
> > significantly improve the hit ratio of your virtual-to-physical
> > mappings.
> >
> > So simply by using hugepages you:
> > - reduce memory for page table entries by a factor of 512 (2 GB for
> >   500 sessions in the example above)
> > - lock Oracle's SGA in physical memory
> > - reduce memory management costs for the Linux kernel
> > - improve the CPU cache hit ratio for virtual-to-physical mappings
> > - reduce the amount of memory that you have to touch overall
> >
> > I hope this helps.
> >
> >
> > --
> > Christo Kutrovsky
> > Database/System Administrator
> > The Pythian Group
> >
> > On 7/13/05, Teehan, Mark <mark.teehan@xxxxxxxx> wrote:
> > > Hi all
> > > I have several redhat blade clusters running 10.1.0.4 RAC on
> > > 2.4.9-e.43enterprise. All database storage is OCFS, with ext3 for
> > > backups, home dirs etc. The servers have 12GB of RAM, of which about 2GB
> > > is allocated to the database, which is fine. Linux, in its wisdom, uses
> > > all free memory (10GB in this case) for filesystem caching on the non-
> > > OCFS filesystems (since OCFS uses direct IO); so every night when I do a
> > > backup it swallows up all available memory and foolishly sends itself
> > > into a swapping frenzy, and afterwards it sometimes cannot allocate
> > > enough free memory for background processes. This seems to be worse on e43;
> > > I was on e27 until recently. Does anyone know how to control filesystem
> > > block caching? Or how to get it to de-cache some? For instance, I have
> > > noticed that gzipping a file and then ctrl-C'ing it can free up a chunk
> > > of RAM; I assume it de-caches the original uncompressed file. But it's
> > > not enough!
> > >
> > > Rgds
> > > Mark
> > >
> >
> 
> 
> --
> Regards
> Zhu Chao
> www.cnoug.org
> 


--
http://www.freelists.org/webpage/oracle-l
