RE: Server failures
- From: "Freeman, Donald" <dofreeman@xxxxxxxxxxx>
- To: "Freeman, Donald" <dofreeman@xxxxxxxxxxx>, "'Chris.Taylor@xxxxxxxxxxxxxxx'" <Chris.Taylor@xxxxxxxxxxxxxxx>, ORACLE-L <oracle-l@xxxxxxxxxxxxx>
- Date: Tue, 30 Sep 2008 09:55:40 -0400
Just to follow up, responsibility for problems is hard to assign at my
location. The application owners pay for the servers and manage the users,
the server team operates and manages the servers, and the database team
operates the database. I couldn't tell you how often we endure an outage
because of the lack of willingness to step up and at least say something when
something goes wrong. The servers are aging out and failing and everybody
waits for everybody else to take action. Everybody is paralyzed into inaction
by fear of the response, "That's not your job, mind your own business." I have
a six-year-old production DB server down right now that previously failed back
in June. We have servers or VMs that we could have moved it to but everybody
is pretending that it's not their problem.
My DBAs also get testy when I ask them to look into something that is not
strictly their responsibility. All of us get nervous when we are clearly on
somebody else's turf. When they find something I can take it up the chain and
get something done for the benefit of all of us. I point out to my team that
when things draw to their logical conclusion and a system fails, it will be
them working around the clock to move and restore a database on Christmas Eve.
Donald Freeman
Database Administrator II
Commonwealth of Pennsylvania
Department of Health
Bureau of Information Technology
2150 Herr Street
Harrisburg, PA 17103
dofreeman@xxxxxxxxxxx<mailto:dofreeman@xxxxxxxxxxx>
________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On
Behalf Of Freeman, Donald
Sent: Tuesday, September 30, 2008 9:34 AM
To: 'Chris.Taylor@xxxxxxxxxxxxxxx'; ORACLE-L
Subject: RE: Server failures
I'm sure it depends but I have access to all our database servers and review
server logs when something happens. Then I open a ticket if I find something.
I'm sure lines of authority vary widely in the field.
Donald Freeman
Database Administrator II
Commonwealth of Pennsylvania
Department of Health
Bureau of Information Technology
2150 Herr Street
Harrisburg, PA 17103
dofreeman@xxxxxxxxxxx<mailto:dofreeman@xxxxxxxxxxx>
________________________________
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On
Behalf Of Taylor, Chris David
Sent: Tuesday, September 30, 2008 9:19 AM
To: ORACLE-L
Subject: Server failures
So how many of you are responsible for examining your database servers for
hardware/software faults when one crashes? Not the database, but the actual
machine?
We recently had a server crash that reported problems when it came back up. It
has also saved a dumpfile to be examined and it reported problems during the
POST routine.
Now I get this email from my DBA manager: (paraphrased)
"Chris,
John [pc/lan mgr] requested that we try to put our finger on what caused
MachineA to failover on Saturday. I looked through the logs extensively today
[uh huh] and couldn't find anything - can you look around too and see if you
find anything?
-Bob"
(Obviously names changed)
Maybe I'm just in a bad mood this morning....grrrr
Chris Taylor
Sr. Oracle DBA
Ingram Barge Company
Nashville, TN 37205
Office: 615-517-3355
Cell: 615-354-4799
Email: chris.taylor@xxxxxxxxxxxxxxx<mailto:chris.taylor@xxxxxxxxxxxxxxx>
- References:
- Server failures
- From: Taylor, Chris David
- RE: Server failures
- From: Freeman, Donald