[ExchangeList] Re: crashing exchange 2003 server - help

  • From: Rick Boza <rickb@xxxxxxxxxxxxxxx>
  • To: exchangelist@xxxxxxxxxxxxx
  • Date: Wed, 16 Sep 2009 09:18:11 -0400

You've got a standby piece of hardware; I would stand it up as a second Exchange server and move the mailboxes before doing anything else.

I'm going to guess you have a bad RAID controller or motherboard at this point.  I'd probably look at replacing one or both of those while hosting the mail on the temporary server.  Performance may degrade a bit depending on how underspecced it is, but that is far better than the server dropping dead on a regular basis.

Rick

On Wed, Sep 16, 2009 at 7:09 AM, Exchange Mailing List <ExchangeMailingList@xxxxxxxxxx> wrote:
http://www.msexchange.org
-------------------------------------------------------
Renew the hardware....

-----Original Message-----
From: exchangelist-bounce@xxxxxxxxxxxxx [mailto:exchangelist-bounce@xxxxxxxxxxxxx] On Behalf Of Harondel J. Sibble
Sent: Tuesday, September 15, 2009 11:07 PM
To: exchangelist@xxxxxxxxxxxxx
Subject: [ExchangeList] crashing exchange 2003 server - help

Okay, small office with 4 servers.

The problem machine is Windows 2003 R2 SP2 running Exchange 2003 SP2; it is a
DC and a GC.

There is 1 member server running Win2k3 and SQL 2000 (Raiser's Edge), a
Windows 2003 R2 SP2 secondary DC doing backups plus file and printer sharing,
and a Windows 2000 DC that does roaming profiles, some printer sharing and
hosts the QuickBooks server database. It replicates with the other DCs but is
not listed in the DNS entries handed to the clients via DHCP. It's a shadow
DC, so to speak.

The 2 new DCs running Win2k3 are identical hardware, and I have a 3rd
identical unused server handy if I need to deploy it.

There are about 30 XP Pro desktops with a Sonicwall TZ170 firewall. At any
one time there are maybe 4-5 people logged in via VPN and maybe a handful
more using OWA.

SAVCE 10.x is run across the LAN, including on the Exchange server, where it
is configured to exclude the AD and Exchange directories.

PopCon is used to download emails to all the accounts from the ISP's
mailboxes where spam and virus filtering is done.

Some of this is probably irrelevant, but I want to make sure the full picture
is available for suggestions on what's causing the problem.

I've got an IPMI card on order but that's a couple of days away.

History: back in May or so, the Exchange server started BSOD'ing with

hardware malfunction - NMI parity check/memory parity error system has halted

I swapped out the memory with the spare system after conferring with the
server manufacturer and doing some googling on errors in the event log.
After swapping the RAM, the problem went away until Friday of last week.  The
server manufacturer advised that all diagnostics on the memory came back
clean, with no issues whatsoever.

Back when this problem first occurred, looking through the bug checks listed
in Event Viewer pointed me to issues with the ntfs.sys driver, specifically:

http://support.microsoft.com/kb/315223
http://support.microsoft.com/kb/937455/en-us

Since the problem had gone away with the memory swap, no other action was
taken at the time. After the problem came back on Friday, I updated to the
latest 3ware RAID driver (Mar 2009) from Windows Update, and then when the
server crashed on Sunday night, I installed the hotfix in the second URL on
Monday morning.

The server crashed again last night around 8pm. I had onsite staff reboot it
at 0823 this morning; on login it shows an unexpected restart with these
errors:

The reason supplied by user ....................... for the last unexpected
shutdown of this computer is: System Failure: Stop error
 Reason Code: 0x805000f
 Bug ID:
 Bugcheck String: 0x00000024 (0x00190345, 0xf3e705c8, 0xf3e702c4, 0xf4fd7da6)
 Comment: 0x00000024 (0x00190345, 0xf3e705c8, 0xf3e702c4, 0xf4fd7da6)

Reading around the stop error again points to the ntfs.sys driver... ugh.
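
For anyone reading along, the bugcheck string above can be split into its stop code and four parameters. The following is only a rough Python sketch, not a diagnostic tool. It assumes Microsoft's documented layout for stop 0x24 (NTFS_FILE_SYSTEM), where parameter 1 packs a source-file identifier in the high 16 bits and a source line number in the low 16 bits. The helper names (parse_bugcheck, describe, STOP_NAMES) are made up for illustration, and the file id and line refer to Microsoft's internal ntfs.sys source, so they do not identify the root cause on their own.

```python
import re

# Tiny lookup table; only the stop code seen in this thread is listed.
STOP_NAMES = {0x24: "NTFS_FILE_SYSTEM"}


def parse_bugcheck(text):
    """Split 'code (p1, p2, p3, p4)' into an int code and a list of int params."""
    match = re.search(r"(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", text)
    if match is None:
        raise ValueError("no bugcheck string found")
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


def describe(text):
    """Return a dict with the stop code, its name (if known) and the parameters."""
    code, params = parse_bugcheck(text)
    info = {
        "code": code,
        "name": STOP_NAMES.get(code, "unknown"),
        "params": params,
    }
    if code == 0x24 and params:
        # Assumption: for stop 0x24, parameter 1 = (source file id << 16) | line.
        info["source_file_id"] = params[0] >> 16
        info["source_line"] = params[0] & 0xFFFF
    return info


if __name__ == "__main__":
    line = ("Bugcheck String: 0x00000024 "
            "(0x00190345, 0xf3e705c8, 0xf3e702c4, 0xf4fd7da6)")
    result = describe(line)
    print(hex(result["code"]), result["name"])
    print("source file id:", hex(result["source_file_id"]),
          "line:", result["source_line"])
```

Run against the string from the event log, this reports stop 0x24 with source file id 0x19 and line 837 (0x0345). That is consistent with the crash being raised inside ntfs.sys, but it says nothing about whether the trigger is the driver, the disk subsystem, or bad memory.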

At 0906 it went sideways again. The onsite admin arrived shortly after that
and noted that on reboot it was stuck at the BIOS screen with a CMOS
checksum error. The server manufacturer was called, since they'd admonished
us never to touch the BIOS without first calling them because it's
performance customized. They said to continue and accept the BIOS defaults.
This led to a "no operating system found" message, but after another reboot
it came around just fine. I had her run chkdsk /f on all partitions. It came
up fine and all was good.

Around 1300 I get a call that all the users are getting a flood of old
emails. Checking the PopCon download queue, it's obvious that the database it
uses to keep track of messages it has already downloaded has gone sideways
and it's downloading everything all over again.

In the end I realize I have to go through each account one by one and control
the download and processing of mail. At around 1820, while I am still working
on the email accounts, it goes sideways again. I get someone onsite to reboot
it; the server is up for maybe 20 minutes, then goes sideways again.

I'll have to go onsite shortly to reboot it. :-(

Event logs show NOTHING leading up to the lockups, just regular normal
messages from the system: no errors, no warnings, etc. The first errors
appear after rebooting, when it writes the unexpected shutdown info into the
system event log.

So.... what do y'all suggest as the next step?


--
Harondel J. Sibble
Sibble Computer Consulting
Creating Solutions for the small and medium business computer user.
help@xxxxxxxxx (use pgp keyid 0x3AD5C11D) http://www.pdscc.com
(604) 739-3709 (voice)

-------------------------------------------------------
List Archives: http://www.freelists.org/archives/exchangelist/
MSExchange Newsletter: http://www.msexchange.org/pages/newsletter.asp
MSExchange Articles and Tutorials: http://www.msexchange.org/articles_tutorials/
MSExchange Blogs: http://blogs.msexchange.org/
-------------------------------------------------------
Visit TechGenix.com for more information about our other sites:
http://www.techgenix.com
-------------------------------------------------------
To unsubscribe visit http://www.msexchange.org/pages/exchangelist.asp
Report abuse to listadmin@xxxxxxxxxxxxxx


