Re: Interesting Christmas Eve listener problem

  • From: "Alex Gorbachev" <gorbyx@xxxxxxxxx>
  • To: mgogala@xxxxxxxxxxx
  • Date: Mon, 25 Dec 2006 14:48:46 -0500

Which reminds you - don't forget to cron your log file rotation
scripts. And look at the bright side - if everything becomes so easy,
why would someone hire people who able to resolve those weird things
on their own instead of waiting the SR passed round the globe few
cycles before somebody with a clue gets it?
Anyway, Merry Xmas. :-)


On 12/25/06, Mladen Gogala <mgogala@xxxxxxxxxxx> wrote:
Last night, I was paged because the listener on my RMAN catalog database died
and could not have been restarted. The error was, as it came to be expected,
extremely instructive and revealing:

bash-2.05$ lsnrctl start

LSNRCTL for Linux: Version 9.2.0.8.0 - Production on 24-DEC-2006 20:08:55

Copyright (c) 1991, 2006, Oracle Corporation. All rights reserved.

Starting /ORACLE/app/oracle/product/920/bin/tnslsnr: please wait...

TNS-12547: TNS:lost contact
TNS-12560: TNS:proto adapter error
TNS-00517: Lost contact
Linux Error: 32: Broken pipe
bash-2.05$

I opened the TAR, but I solved the problem by the time I got response. The 
problem was solved
by using strace, Linux equivalent of the Slowaris "truss" utility. Here are the 
last few lines
that revealed the problem:

4605  open("/ORACLE/app/oracle/product/920/network/log/listener.log", 
O_WRONLY|O_APPEND|O_CREAT, 0666) = 6
4605  fstat64(6, {st_mode=S_IFREG|0644, st_size=2147483647, ...}) = 0
4605  old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 
0) = 0x40bd5000
4605  fstat64(6, {st_mode=S_IFREG|0644, st_size=2147483647, ...}) = 0
4605  _llseek(6, 2147483647, [2147483647], SEEK_SET) = 0
4605  fcntl64(0x6, 0x2, 0x1, 0x2)       = 0
4605  write(6, "\nTNSLSNR for Linux: Version 9.2."..., 74) = -1 EFBIG (File too 
large)
4605  --- SIGXFSZ (File size limit exceeded) ---
4603  <... read resumed> "", 64)        = 0
4603  close(5)                          = 0
4603  close(6)                          = 0
4603  lseek(3, 39936, SEEK_SET)         = 39936
4603  read(3, "\r\0\3730\0\0V\0\3740\0\0t\0\3750\0\0\233\0\3760\0\0\252"..., 
512) = 512
4603  lseek(3, 40448, SEEK_SET)         = 40448
4603  read(3, "\20\0\0101\0\0h\0\n1\0\0\205\0\v1\0\0\257\0\f1\0\0\304"..., 512) 
= 512
4603  lseek(3, 14848, SEEK_SET)         = 14848
4603  read(3, "\16\0\1\2\0\0\\\0\2\2\0\0x\0\3\2\0\0\243\0\4\2\0\0\336"..., 512) 
= 512
4603  write(1, "TNS-12547: TNS:lost contact\n TNS"..., 125) = 125
4603  munmap(0x40bd4000, 4096)          = 0
4603  _exit(1)                          = ?

For those not adept at reading strace gobbledy-gook, listener opened listener 
log (the first line),
tried to write and got an error EFBIG ("File too freaking large") and a signal 
SIGXFSZ which killed
the process. Control utility (the line after "--- SIGXFSZ (File size limit exceeded) 
---") then tried
read what the spawned process has written to pipe, saw it was empty ("read 
resumed" line), closed all
the file descriptors, wrote a meaningless error messages and died. Would it be 
too much to kindly ask
Oracle to add a @#$%! signal handler telling me that I have to clean up the 
@#$%! log file? I hate debugging
on Christmas Eve!

Of course, after a delicate application of "rm -f" command on the listener.log, 
listener started working
like a charm. RMAN catalog was created in 2003 and listener was up, all this 
time.

Happy holidays to everybody!


--
Mladen Gogala
http://www.mladen-gogala.com

--
//www.freelists.org/webpage/oracle-l





--
Best regards,
Alex Gorbachev

The Pythian Group
Sr. Oracle DBA

http://www.pythian.com/blogs/author/alex/
http://blog.oracloid.com
--
//www.freelists.org/webpage/oracle-l


Other related posts: