Hi,

In 9i you had to enable Direct I/O at the filesystem level; this is no longer true from 10g onwards. The new behaviour is mentioned in http://download.oracle.com/docs/cd/B19306_01/server.102/b15658/toc.htm and I believe it holds for AIX, Linux and Tru64.

Regarding the effects of Direct I/O, the benefit is tremendous, especially on a busy and I/O-bound system. I have successfully rescued a few Linux systems just by enabling Direct I/O; before doing so, the systems would hang from time to time when kswapd kicked in and started freeing memory, and while that lasted the system didn't respond.

Thanks

-- LSC

On Mon, Feb 14, 2011 at 6:56 AM, Gaja Krishna Vaidyanatha <gajav@xxxxxxxxx> wrote:

> Hi Yong,
>
> Nice writeup. I have started a new thread in an effort to focus on the
> topic of discussion. Yes, indeed on Linux with ext3 filesystems, setting
> FILESYSTEMIO_OPTIONS to SETALL and not doing anything special at the
> filesystem mount is enough to get direct I/O working. But as mentioned by me
> in a prior note (in the previous thread), the same is not true for jfs or
> vxfs, the last time I checked (a few months ago). You also bring a very
> relevant point to the discussion: the distinction between the page cache and
> the OS buffer cache. Let us, for the purposes of our discussion, just say
> that the *filesystem buffer cache* includes both the page cache and the OS
> buffer cache.
>
> Also, the scenario you have depicted in your last paragraph is interesting.
> My question to you: have you observed what you are suggesting in your
> writeup on a production system? The reason why I bring this up is that in my
> experience the effects of NOT having direct I/O (with or without HugePages)
> have been crippling for the entire system, in this case on RH Linux.
> Similar behavior has also been observed and documented on other operating
> systems.
>
> The following is pretty much a standard sequence of what has been observed
> on RH Linux *without* direct I/O on very busy Oracle systems (100+ active
> sessions at any given time):
>
> 1) As the load on the system increases, more sessions run more SQL and
> eventually perform more I/O, thus increasing total I/O consumption on the
> system.
>
> 2) This causes the OS to allocate more memory to the filesystem buffer
> cache, in an effort to buffer future I/O requests (which may or may not
> occur). The classic "*double buffering*" problem of the same data held in
> both the Oracle database buffer cache and the filesystem buffer cache has
> now begun.
>
> 3) An increase in the memory footprint of the filesystem buffer cache
> causes an increase in the activity of the paging daemon, as the OS tries to
> take preemptive action against a potential memory bottleneck.
>
> 4) As the I/O pressure increases, the paging daemon starts to work even
> more aggressively (significantly increasing the number of pageins and
> pageouts). Eventually, the paging daemon gets even more aggressive, not
> only increasing its paging rates but also swapping out portions of the SGA
> (even if it were locked) when the demand for memory increases. *A
> system-wide Memory Bottleneck has now become a reality*.
>
> 5) This further increases the CPU consumption on the system (covered below
> in #6) and also starts to create an artificial *I/O Bottleneck* (db file
> sequential read and db file scattered read times increase exponentially, as
> Oracle database blocks need to be brought in from the OS's swap
> partitions). OUCH!!!
>
> 6) This increase in paging daemon activity, combined with the increase in
> the filesystem buffer cache footprint, increases the overall CPU
> utilization of the OS (verified with a continuous "sar -u" or vmstat).
> "sar -u" breaks down CPU consumption into 4 buckets: %usr, %sys, %wio and
> %idle. *%sys* will consistently and continuously increase.
>
> 7) This starts to create an artificial *CPU Bottleneck*, as %sys in many
> cases consumes a very significant percentage of total CPU on the system
> (I have observed *%sys consume 40%-50% of total CPU*).
>
> 8) Imagine what havoc starts to unfold when the *OS consumes 50% of the
> system's CPU resources*: the *CPU Bottleneck* *becomes a reality*. Nobody
> plans for the OS to consume 50% of configured CPU resources while engaging
> in system sizing and capacity planning efforts.
>
> 9) Pretty soon, the CPU Bottleneck causes *Oracle sessions to start timing
> out*.
>
> 10) Depending on the application vendor and their connection pool
> mechanics, the application then starts to retry connections to the
> database, which starts a *connection storm to the database*. In some
> cases, the database is completely hung up with various session/login-related
> latch issues.
>
> 11) Sooner rather than later, the database comes to a grinding halt (with
> or without a crash), and the OS may eventually implode with a *kernel
> panic*. Even if this does not occur, you pretty much have an unusable
> Oracle database at this stage of the game.
>
> The above is pretty much a "poster child storyline" of very busy Oracle
> systems with hundreds of concurrent sessions that do not use direct I/O.
> Again, this is my experience, and I'm curious if you or others on this list
> have observed anything to the contrary. If so, please provide the relevant
> details, as I would love to learn about alternative system behavioral
> patterns without direct I/O. It is something that interests me quite a bit
> and I'm very curious.
> Thanks for taking the time to read this long note :)
>
> Cheers,
>
> Gaja
>
> Gaja Krishna Vaidyanatha,
> Founder/Principal, DBPerfMan LLC
> http://www.dbperfman.com
> Phone - 001-(650)-743-6060
> Co-author: Oracle Insights: Tales of the Oak Table -
> http://www.apress.com/book/bookDisplay.html?bID=314
> Co-author: Oracle Performance Tuning 101 -
> http://www.amazon.com/gp/reader/0072131454/ref=sib_dp_pt/102-6130796-4625766
>
> ------------------------------
> *From:* Yong Huang <yong321@xxxxxxxxx>
> *To:* Freek.DHooge@xxxxxxxxx
> *Cc:* oracle-l@xxxxxxxxxxxxx
> *Sent:* Sun, February 13, 2011 10:41:30 AM
> *Subject:* RE: How much RAM is to much
>
> > But I always thought that setting the filesystemio_options parameter
> > to directIO or setall caused the processes to open the files with
> > the O_DIRECT flag.
>
> I can confirm what Freek said, on my Oracle 10.2.0.4, Red Hat 5 kernel
> 2.6.18-194.17.1.el5, architecture x86_64, if these details matter. The
> filesystem is ext3, mounted with no options (i.e. everything default).
> When the process opens a datafile for the first time, I can see in strace
> that the flags passed to the open() call are O_RDWR|O_SYNC|O_DIRECT.
>
> By the way, ever since the page cache was introduced to the file system,
> we should make a distinction between the buffer cache and the page cache.
> What's interesting here is the page cache, because the buffer cache caches
> file system metadata. Well, in Linux, I believe the buffer cache physically
> lives inside the page cache, so you may call those buffers "buffer pages".
> A confusing name indeed. But to make it short: as soon as the word "buffer"
> appears, people who know the Linux kernel (not me!) will think you're
> referring to the cache for metadata, not for file content, while file
> content may be what you actually mean.
>
> Last point. While Oracle manages data in memory through a better
> mechanism, because it knows the data better, leaving the decision of
> caching or not caching to the file system is not that dreadful.
> In some cases, it may even be a big help. So instead of assigning most
> memory to the Oracle buffer cache and using direct I/O, someone could run
> a comprehensive test to see whether locking a large portion of RAM to the
> SGA (through HugePages etc.) and leaving the file system to NOT use direct
> I/O works even better in some cases. If a table slightly bigger than 2% of
> the buffer cache is frequently full-scanned and you didn't do anything
> special about it, I think it's better to let the file system cache its
> content.
>
> Yong Huang
>
> --
> //www.freelists.org/webpage/oracle-l