[foxboro] Weird System Monitor Error

List,

    We had a P92 running a System Monitor, and that System Monitor was 
monitoring the P92 itself.  (This sounds like a riddle, and I hope 
somebody can solve it.)  We decided to have the System Monitor on a P91 
monitor the P92 instead (cross-monitoring), so we changed System 
Definition to move the P92 under the P91's System Monitor.  We created 
a commit disk and reinstalled the System Monitor and System Management 
Display Handler (SMDH) packages on the P91.  As expected, the P92 then 
showed up in two places on the SMDH screens: once under the P91's 
System Monitor and once under the P92's own System Monitor.  However, 
the P92 continued to actively monitor itself, and the P91's System 
Monitor showed the P92 as RED (failed).  Using the same commit disk, we 
then reinstalled SYSMON and SMDH on the P92.  After the reboot, the P92 
was no longer monitored by itself, and the P91's System Monitor was 
correctly monitoring the P92.  This sounds like a success story, but it 
is where our trouble started.
    Immediately after reinstalling System Monitor on the P92, we 
rebooted the P92, and one of the ten FT FCP270s monitored by the P92 
System Monitor started reporting cable errors once every two minutes.  
Deeper investigation showed that it was an REDL Port B cable error.  The 
error would occur, stay in alarm for almost exactly one minute, clear 
for almost exactly two minutes, and then go into alarm again.  We 
thought it strange that this started immediately after reinstalling 
System Monitor on the FCP's System Monitor host, and that it affected 
only one of the ten FCPs being monitored.  We decided to treat it as a 
real cable error, so we swapped out, one by one, each cable between 
port B of each of the two FT FCP modules and the splitter/combiner.  We 
also swapped out the splitter/combiner itself and the cable between the 
splitter/combiner and the "B" MESH switch it was connected to.  BTW, 
the MESH switch did not report a "B" port error on the port this FCP 
was connected to.  None of these efforts changed the way the System 
Monitor error was reported.  We then suspected some kind of software 
corruption in the way the FCP was reporting the error, so we rebooted 
the FT FCP, both modules at the same time, to see whether reloading the 
OS and database would keep the CP from reporting the REDL Port B 
failure.  All of our antics were to no avail, and the errors continue 
to repeat as of this writing.
    We have suspected all along that something went wrong during the 
System Monitor reinstall and that the System Monitor is misinterpreting 
the error being reported, since it is not a hard error that stays in 
alarm but one that clears and then re-alarms at such a regular 
interval.
Has anyone seen anything like this before?
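
For anyone who wants to check whether their own alarm history shows the 
same clockwork pattern, here is a minimal Python sketch.  It is only an 
illustration: the timestamps below are made up to resemble the roughly 
one-minute-on, two-minute-off cycle described above (they are not copied 
from our logs), and the function names and the 5-second tolerance are my 
own assumptions, not anything provided by System Monitor.

```python
from statistics import mean, pstdev

def cycle_stats(events):
    """Summarize alarm timing.

    events: list of (alarm_time, clear_time) pairs in seconds, in time order.
    Returns the mean and spread of the in-alarm and cleared durations.
    """
    on = [clear - alarm for alarm, clear in events]
    # Gap between one alarm clearing and the next alarm being raised.
    off = [events[i + 1][0] - events[i][1] for i in range(len(events) - 1)]
    return {
        "on_mean": mean(on), "on_jitter": pstdev(on),
        "off_mean": mean(off), "off_jitter": pstdev(off),
    }

def looks_periodic(events, max_jitter_s=5.0):
    """True when both durations vary by less than max_jitter_s (assumed tolerance)."""
    if len(events) < 3:
        return False  # too few cycles to judge
    s = cycle_stats(events)
    return s["on_jitter"] <= max_jitter_s and s["off_jitter"] <= max_jitter_s

# Hypothetical timestamps (seconds), invented for illustration only.
events = [(0, 60), (180, 241), (360, 419), (540, 600)]
stats = cycle_stats(events)
print(stats["on_mean"], stats["off_mean"], looks_periodic(events))
```

A genuinely intermittent cable would normally give irregular durations, 
so very low jitter would point toward something timer-driven, which is 
what we suspect here.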

Are REDL (Redundant Ethernet Data Link) port errors generated and sent 
by the station that manifests the error?

Or are they generated by the System Monitor host as it tries to query 
the ports of the stations being monitored?

Any help or insight will be appreciated.  I am going to call TAC on this 
one, but the time difference between Hawaii and FoxMass made me think 
twice yesterday afternoon while we were troubleshooting.  I will use 
this write-up to describe our problem to them.

Cheers,
Tom VandeWater
Control Conversions, Inc.
Kapolei, HI