Which reminds you - don't forget to cron your log file rotation scripts. And look at the bright side - if everything becomes so easy, why would someone hire people who able to resolve those weird things on their own instead of waiting the SR passed round the globe few cycles before somebody with a clue gets it? Anyway, Merry Xmas. :-) On 12/25/06, Mladen Gogala <mgogala@xxxxxxxxxxx> wrote:
Last night, I was paged because the listener on my RMAN catalog database died and could not have been restarted. The error was, as it came to be expected, extremely instructive and revealing: bash-2.05$ lsnrctl start LSNRCTL for Linux: Version 9.2.0.8.0 - Production on 24-DEC-2006 20:08:55 Copyright (c) 1991, 2006, Oracle Corporation. All rights reserved. Starting /ORACLE/app/oracle/product/920/bin/tnslsnr: please wait... TNS-12547: TNS:lost contact TNS-12560: TNS:proto adapter error TNS-00517: Lost contact Linux Error: 32: Broken pipe bash-2.05$ I opened the TAR, but I solved the problem by the time I got response. The problem was solved by using strace, Linux equivalent of the Slowaris "truss" utility. Here are the last few lines that revealed the problem: 4605 open("/ORACLE/app/oracle/product/920/network/log/listener.log", O_WRONLY|O_APPEND|O_CREAT, 0666) = 6 4605 fstat64(6, {st_mode=S_IFREG|0644, st_size=2147483647, ...}) = 0 4605 old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40bd5000 4605 fstat64(6, {st_mode=S_IFREG|0644, st_size=2147483647, ...}) = 0 4605 _llseek(6, 2147483647, [2147483647], SEEK_SET) = 0 4605 fcntl64(0x6, 0x2, 0x1, 0x2) = 0 4605 write(6, "\nTNSLSNR for Linux: Version 9.2."..., 74) = -1 EFBIG (File too large) 4605 --- SIGXFSZ (File size limit exceeded) --- 4603 <... read resumed> "", 64) = 0 4603 close(5) = 0 4603 close(6) = 0 4603 lseek(3, 39936, SEEK_SET) = 39936 4603 read(3, "\r\0\3730\0\0V\0\3740\0\0t\0\3750\0\0\233\0\3760\0\0\252"..., 512) = 512 4603 lseek(3, 40448, SEEK_SET) = 40448 4603 read(3, "\20\0\0101\0\0h\0\n1\0\0\205\0\v1\0\0\257\0\f1\0\0\304"..., 512) = 512 4603 lseek(3, 14848, SEEK_SET) = 14848 4603 read(3, "\16\0\1\2\0\0\\\0\2\2\0\0x\0\3\2\0\0\243\0\4\2\0\0\336"..., 512) = 512 4603 write(1, "TNS-12547: TNS:lost contact\n TNS"..., 125) = 125 4603 munmap(0x40bd4000, 4096) = 0 4603 _exit(1) = ? For those not adept at reading strace gobbledy-gook, listener opened listener log (the first line), tried to write and got an error EFBIG ("File too freaking large") and a signal SIGXFSZ which killed the process. Control utility (the line after "--- SIGXFSZ (File size limit exceeded) ---") then tried read what the spawned process has written to pipe, saw it was empty ("read resumed" line), closed all the file descriptors, wrote a meaningless error messages and died. Would it be too much to kindly ask Oracle to add a @#$%! signal handler telling me that I have to clean up the @#$%! log file? I hate debugging on Christmas Eve! Of course, after a delicate application of "rm -f" command on the listener.log, listener started working like a charm. RMAN catalog was created in 2003 and listener was up, all this time. Happy holidays to everybody! -- Mladen Gogala http://www.mladen-gogala.com -- //www.freelists.org/webpage/oracle-l
-- Best regards, Alex Gorbachev The Pythian Group Sr. Oracle DBA http://www.pythian.com/blogs/author/alex/ http://blog.oracloid.com -- //www.freelists.org/webpage/oracle-l