Thanks, Rajeev. I enclose my answers below.

Pawel

On 2009-05-16 00:08, Rajeev Prabhakar wrote:

Pawel,

So that I can understand your environment better, could you please answer these questions:

a) Does the database server's /tmp filesystem have adequate free space?
Free space on /tmp has not dropped below 3G at any time.
b) Is the Oracle installation configured with adequate swap space? (I know no sysadmin likes to allocate additional swap; however, going with the typical Oracle-specified swap settings has saved me a lot of headaches during load/stress testing, including node freezes/reboots.)
It is not... We have 32 GB of RAM and 6 GB of swap. The Oracle requirement alone is not enough to convince the sysadmins to allocate more swap. I will try again. I also read Andrew Kerber's blog article (http://dbakerber.wordpress.com/2008/04/18/swap-space-in-linux-and-unix/). Do you know how exactly Oracle uses/allocates swap space in Linux? Is there perhaps some Metalink/Web article that I could show to the system administrators?
c) What is the size of your temp tablespace (allocated and free space during a typical application usage window)?
It is 512 GB, extensible to 1.5 TB. Normal usage is around 100-200 GB; the peak over the last week was around 400 GB.
d) Are you (or your application) utilizing GTTs (global temporary tables)?
No. Temporary tablespace is mainly used for large sorts/joins.
e) Are you utilizing huge pages on your database server?
We are not. The SGA is relatively small. Most reads bypass the Oracle cache.
f) Is this a dedicated or shared server configuration?
There are dedicated connections only.
g) Have you looked into the possibility of soft block corruption in any table?
Every weekend I run a full cold backup with check logical. It does not report any errors.
i) Are any I/O-bound concurrent processes (database-specific or otherwise) running during the window when the database freeze is observed, e.g. backups (database or OS)?
There are gzip processes compressing large files that are part of the processing.
j) Are there any specific kinds of processes that are first reported as hanging or slow (e.g. reports, HTML output, etc.)?
These are CTAS statements joining several large tables.
-Rajeev

2009/5/15 Paweł Kotlarz <pkotla@xxxxxx>:

Rajeev,

Oracle shows many sessions waiting for direct path read (temp). Tanel's waitprof reports single events taking many seconds, though most of them are below 15 ms.

On the OS level, vmstat shows normal readings for some time and then sessions in uninterruptible sleep with no I/O taking place. iostat -x and asmiostat (ML 437996.1) show specific volumes. Just after performance returns to normal, these volumes show a much greater queue length (iostat) or a much greater average read time (asmiostat).

I ran strace on a process servicing the session on which I used waitprof earlier. It stops on a read call.

Currently I only know that the sysadmins found nothing in the Linux logs or on a 'system management page'. Unfortunately, it is difficult to obtain more information from them unless I tell them exactly what to check...

Thanks,
Pawel
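
One concrete thing that could be checked on the OS side during a freeze is which processes are in uninterruptible sleep (state D, as vmstat reports them) and where in the kernel they are blocked. The following is only a minimal sketch in Python, assuming a Linux /proc filesystem. The function names are hypothetical, and /proc/<pid>/wchan may be empty or unreadable depending on the kernel build and privileges.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def read_wchan(pid, proc_root="/proc"):
    """Return the kernel wait channel for pid, or '?' if it cannot be read."""
    try:
        with open(os.path.join(proc_root, str(pid), "wchan")) as f:
            return f.read().strip() or "-"
    except OSError:
        return "?"


def uninterruptible(proc_root="/proc"):
    """List (pid, comm, wchan) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited or stat was unreadable while scanning
        if state == "D":
            found.append((pid, comm, read_wchan(pid, proc_root)))
    return found


if __name__ == "__main__":
    for pid, comm, wchan in uninterruptible():
        print(pid, comm, wchan)
```

Run repeatedly (for example once a second) while the freeze is in progress, the output would show whether the Oracle server processes and the gzip processes are blocked at the same time, which is something specific that can be handed to the sysadmins.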