Re: missing alert.log mystery (it's not what you think)

  • From: Charles Schultz <sacrophyte@xxxxxxxxx>
  • To: Robert Freeman <robertgfreeman@xxxxxxxxx>
  • Date: Sun, 15 May 2011 20:48:04 -0500

Thanks for clarifying that, Robert. :) I didn't want to have to spell that
out, again.

So to be totally honest, no, I did not check to make sure all the processes
had actually "gone away". In my experience, it is extremely rare to find
lingering processes after a shutdown - although the possibility is real, so
perhaps I should have checked. However, when I searched *all* the file
descriptors under /proc, I felt like I had covered all my bases. *grin*
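
For anyone who wants to repeat that /proc search, here is a minimal sketch of
the idea in Python. It assumes a Linux-style /proc in which every
/proc/<pid>/fd/<n> entry is a symbolic link to the open file; Solaris 10 does
not expose descriptors quite that way, so on that platform pfiles is probably
the tool to reach for instead. The names find_open_fds and proc_root are made
up for illustration.

```python
import os


def find_open_fds(name_fragment, proc_root="/proc"):
    """Return (pid, fd, target) for every open descriptor whose link
    target contains name_fragment (for example "alert_")."""
    hits = []
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission to look
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed between listdir and readlink
            if name_fragment in target:
                hits.append((int(pid), int(fd), target))
    return hits


if __name__ == "__main__":
    # Assumption: the alert log name contains "alert_".
    for pid, fd, target in find_open_fds("alert_"):
        print(pid, fd, target)
```

Run as root so that descriptors owned by other users are readable; an empty
result only means that nothing visible to the caller has such a file open.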

But again, even if root had, for whatever reason, managed to open the
alert.log and keep it open - even then, would not changing the
diagnostic_dest parameter get around that? Especially if I create a new
directory and change the parameter to that directory? Perhaps my
understanding of Solaris is a tad shallow, but I cannot imagine that an inode
would be reused for a completely new directory structure.
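
That reasoning can be checked with a small experiment. A minimal sketch,
assuming a POSIX filesystem (the file name alert_TEST.log and the function
name are hypothetical): a process holds a file open, the file is unlinked,
and a new file is then created in a freshly made directory. The held
descriptor keeps accepting writes, while the new file gets an inode of its
own.

```python
import os
import shutil
import tempfile


def orphaned_inode_demo():
    """Hold a file open, delete it, keep writing, then create a file in a
    brand-new directory and compare the two inodes."""
    old_dir = tempfile.mkdtemp()
    old_path = os.path.join(old_dir, "alert_TEST.log")  # hypothetical name
    fd = os.open(old_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
    os.write(fd, b"before unlink\n")

    os.unlink(old_path)              # the directory entry is gone ...
    os.write(fd, b"after unlink\n")  # ... but the open descriptor still works
    held = os.fstat(fd)

    # A new file under a new directory, created while the old one is held.
    new_dir = tempfile.mkdtemp()
    new_path = os.path.join(new_dir, "alert_TEST.log")
    with open(new_path, "w") as f:
        f.write("fresh file\n")
    new = os.stat(new_path)

    result = {
        "old_path_exists": os.path.exists(old_path),
        "held_size": held.st_size,
        "new_size": new.st_size,
        "same_inode": (held.st_dev, held.st_ino) == (new.st_dev, new.st_ino),
    }
    os.close(fd)
    shutil.rmtree(old_dir)
    shutil.rmtree(new_dir)
    return result


if __name__ == "__main__":
    print(orphaned_inode_demo())
```

If this holds on the platform in question, a stray open handle on the old log
should not by itself prevent a log from being created at a new path.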

I don't mind the wags. Perhaps one of these wags will be just enough to help
me solve the riddle. I mean, this is so bizarre that Occam's razor is
starting to itch at me and I am wondering what I missed.

On Sun, May 15, 2011 at 20:23, Robert Freeman <robertgfreeman@xxxxxxxxx> wrote:

> But Charles said that they shut down the database, changed the
> diagnostic_dest parameter and still nothing happened. Shutting down the
> database should have released any files...
>
> Charles, after you shut down the database, did you ensure that all processes
> were shut down (ps -ef|grep) before you restarted? Is it possible you didn't
> get a clean shutdown and therefore the old alert log file was not released
> and Oracle could not open the new location... (just a WAG there).
>
> Robert G. Freeman
> Master Principal Consultant, Oracle Corporation, Oracle ACE
> Author of various books on RMAN, New Features and this shorter signature
> line.
> Blog: http://robertgfreeman.blogspot.com
>
> Note: THIS EMAIL IS NOT AN OFFICIAL ORACLE SUPPORT COMMUNICATION. It is
> just the opinion of one Oracle employee. I can be wrong, have been wrong in
> the past and will be wrong in the future. If your problem is a critical
> production problem, you should always contact Oracle support for assistance.
> Statements in this email in no way represent Oracle Corporation or any
> subsidiaries and reflect only the opinion of the author of this email.
>
>
> ------------------------------
> *From:* Howard Latham <howard.latham@xxxxxxxxx>
> *To:* sacrophyte@xxxxxxxxx
> *Cc:* ORACLE-L <oracle-l@xxxxxxxxxxxxx>; Wolfgang Breitling <
> breitliw@xxxxxxxxxxxxx>
> *Sent:* Sun, May 15, 2011 1:42:20 PM
> *Subject:* Re: missing alert.log mystery (it's not what you think)
>
> I expect the file is still there but lost. Anyone tried deleting the alert
> log while a db is writing to it?
>
> On 15 May 2011 19:47, Charles Schultz <sacrophyte@xxxxxxxxx> wrote:
>
>> Actually, what bothers me most is that even if I change diagnostic_dest,
>> there is absolutely no alert.log whatsoever. There is the log.xml file in
>> the alert directory, and there are process trace files in the trace
>> directory, but no alert.log in the new structure. Why?!?
>>
>>
>> On Sun, May 15, 2011 at 13:37, Charles Schultz <sacrophyte@xxxxxxxxx> wrote:
>>
>>> Wolfgang,
>>>
>>> *grin* The mount holding the diag directory structure is not full, nor
>>> has it been for some time. I have indeed checked elsewhere for the
>>> alert.log. In fact, I even did a massive 'find /u01 -type f -exec grep -iln
>>> <unique redo log name> {} \;', and that did not show me any suspect files.
>>>
>>> To others that asked questions privately, the ownership/permissions of
>>> the files and directories have not changed to my knowledge. Other alert.logs
>>> in the same diag directory hierarchy (obviously under their own $SID) are
>>> active and updated by the respective databases. The trace directory in
>>> question is constantly updated with other trace files (ie, background
>>> process traces, user traces) - just not the alert.log.
>>>
>>> Of course, the analyst handling my SR ("SR 3-3591317751: missing
>>> alert.log" for those that can/like to look at such things) went off-shift
>>> sometime Friday, so I have no updates from that direction. I am
>>> cross-posting this question to the Oracle Communities (sorry for those that
>>> read this twice) but no hits there, yet.
>>>
>>> My biggest fear is that I am totally missing the most obvious thing (ie,
>>> fat-fingering the name of the alert.log I am looking at *grin*), but I feel
>>> pretty confident I double- (and triple-) checked most stupid mistakes. But
>>> one never knows....
>>>
>>>
>>> On Sun, May 15, 2011 at 09:37, Wolfgang Breitling <
>>> breitliw@xxxxxxxxxxxxx> wrote:
>>>
>>>> Any chance that the file system holding the trace directory is full? As you
>>>> already changed the diag dest this is not very probable. Other slight
>>>> possibility: have you checked elsewhere for the alert log, somewhere
>>>> underneath ORACLE_HOME?
>>>>
>>>> On 2011-05-15, at 7:22 AM, Charles Schultz wrote:
>>>>
>>>> > Good day, listers,
>>>> >
>>>> > Environment: Oracle EE 11.1.0.7 on Solaris 10
>>>> >
>>>> > I know, the first thing that comes to mind is "Oh yeah, the
>>>> background_dump_dest is overridden by diagnostic_dest in 11g". That's
>>>> not the problem here.
>>>> > The next thing you think "Well, it could be an orphaned inode (ie,
>>>> deleting a file that is open by another process)". That is also not the
>>>> problem.
>>>> >
>>>> > We have an alert.log that was last updated by the database on May 6th.
>>>> Strangely enough, the log.xml in the alert directory of the diag
>>>> destination is being updated normally; it is just the plain text alert.log
>>>> in the trace directory that is not updated. We have bounced the database,
>>>> changed the diagnostic_dest parameter and I have even grepped all the file
>>>> descriptors in /proc/*/fd for traces of a possibly opened alert.log -
>>>> nothing, the alert.log is still not being updated. I tried
>>>> dbms_system.ksdwrt to force a write to the alert.log - again, the log.xml
>>>> is updated, the plain text is not. My last resort was to file a case with
>>>> Oracle Support, and they are having me redo everything I have already done,
>>>> even though I stated up front that I did all these things already (see
>>>> above).
>>>> >
>>>> > So now I have a mystery. I could pull out the Microsoft solution and
>>>> bounce the entire host. But the curiosity inside me wants to figure out
>>>> what is going on before I do that. What could possibly explain why the
>>>> alert.log is not being written to? It looks, smells and feels like there is
>>>> an underscore parameter that prevents writing to the alert.log. But Oracle
>>>> Support is telling me no such parameter exists (and I have not found one).
>>>> >
>>>> > Any thoughts from this collective of intelligence? :)
>>>> >
>>>> > --
>>>> > Charles Schultz
>>>>
>>>>
>>>
>>>
>>> --
>>> Charles Schultz
>>>
>>
>>
>>
>> --
>> Charles Schultz
>>
>
>
>
> --
> Howard A. Latham
>
> Sent from my Nokia N97
>
>


-- 
Charles Schultz
