[SI-LIST] Re: Fibre channel interconnect margins

  • From: Alan.Hiltonnickel@xxxxxxx
  • To: Chris Cheng <Chris.Cheng@xxxxxxxxxxxx>
  • Date: Fri, 30 Jun 2006 22:57:37 -0700

You might want to re-check that. Yes, we have the same CRC checks and 
flags. But after error rates get to the "expected" or "acceptable" 
levels, most folks turn off the error reporting and let the software and 
hardware do what they're designed to do, because the system is tolerant of 
errors. That's the purpose of CRCs - to make systems more robust, not to 
make the customers help you debug your systems.
Otherwise, it sounds like you're telling us that you have invented a way 
to design equipment where randomness does not happen. Hope you got a 
patent on that! ;-)
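
For a sense of scale, here is a rough sketch of the arithmetic behind an "expected" error rate. It assumes independent random errors, a link that is always running, the nominal Fibre Channel line rates, and that "10e-12" is read as 10^-12. The function name is mine, purely for illustration.

```python
# Nominal Fibre Channel line rates in baud (1GFC, 2GFC, 4GFC).
LINE_RATES_BAUD = {"1GFC": 1.0625e9, "2GFC": 2.125e9, "4GFC": 4.25e9}


def mean_seconds_between_errors(ber: float, line_rate: float) -> float:
    """Average time between bit errors on a continuously running link.

    Assumes errors are independent, so the mean error rate is simply
    ber * line_rate errors per second.
    """
    return 1.0 / (ber * line_rate)


if __name__ == "__main__":
    for name, rate in LINE_RATES_BAUD.items():
        for ber in (1e-12, 1e-15):
            hours = mean_seconds_between_errors(ber, rate) / 3600
            print(f"{name} at BER {ber:.0e}: one error every {hours:.2f} hours on average")
```

At 2.125 GBd, for instance, 1e-12 works out to an error roughly every eight minutes, and 1e-15 to one every five to six days, so the exponent matters a great deal when people talk about "an error every few days".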


Chris Cheng wrote On 06/30/06 22:09,:

> I don't know what kind of FCAL chip you are dealing with. Every one I 
> have to work with has CRC detection and flags. And you bet every CRC 
> error they detect will be flagged and reported back. If it takes an error 
> every few days, I will get phone calls.
> ------------------------------------------------------------------------
> *From:* Alan.Hiltonnickel@xxxxxxx [mailto:Alan.Hiltonnickel@xxxxxxx]
> *Sent:* Fri 6/30/2006 9:48 PM
> *To:* Chris Cheng
> *Cc:* si-list@xxxxxxxxxxxxx
> *Subject:* Re: [SI-LIST] Re: Fibre channel interconnect margins
> Hey Chris,
> When you started that thread I didn't answer, since it seemed pretty 
> obvious that "errors are bad", and I didn't know better.
> In fact, I think that companies DO ship products that toss a random 
> error approximately every 10e-xx or so. Why? Because the statistical 
> theory behind those errors is that random/Gaussian noise is, by 
> definition, unbounded - errors are a fact of life, even if the error 
> rate is very low. Eventually you will see an edge that is outside the 
> jitter spec. A single unrepeatable error out of billions of bits 
> simply has to be expected.
> What matters is that the system (specifically the higher layers of the 
> stack) responds to these random errors in such a way that they are not 
> necessarily catastrophic. Most serial protocols will simply resend the 
> packet. If it's truly a random error, the next transmission (or the 
> one after it) will be sent correctly. Many protocols also have 
> error detection and correction built in, and thus can recover in that 
> fashion.
> So sure, you should expect an error every day or so. Your system must 
> simply be able to handle that eventuality. What concerns me is that 
> these protocols are capable of masking serious bit error problems, 
> which don't become apparent until someone notices the system is really 
> lagging, or their video is starting to stutter.
> Keep in mind that we're talking about random errors. If you get an 
> error that happens every time you send a particular packet, you have a 
> faulty product. Repeating that pattern will increase the bit error 
> rate above the spec, and allow you to diagnose and fix the problem.
> As a friend of mine once said: "Randomness is too important to be left 
> to chance".
> Alan
> Chris Cheng wrote On 06/30/06 21:15,:
>>A while ago I started a long thread about "do you really ship a product at 
>>BER 10e-xx?"
>>You seem to imply that 10e-12 will be the benchmark for acceptance.
>>Does Cisco really ship a product and accept that it will take an error every 
>>few days?
>>From: si-list-bounce@xxxxxxxxxxxxx on behalf of Mcgrath, Christopher
>>Sent: Fri 6/30/2006 1:03 PM
>>To: si-list@xxxxxxxxxxxxx
>>Subject: [SI-LIST] Re: Fibre channel interconnect margins
>>Some stuff that I have done in the past to stress test FC links:
>>1. Max out the cable length defined by the FC-PI spec.
>>2. Put the product under test into a thermal chamber while varying all
>>voltages associated with the FC link (ASIC, SERDES, PHY, etc.) across
>>all corners.  (i.e. 2 voltages across hot/cold corners = 8 test cases)
>>3. Use random data and not just the idle characters on the link.
>>Using the BER as the benchmark for acceptance (something like 10e-12),
>>these three things were the things we did to beat the hell out of links
>>before officially qualifying the physical link.
>>Our philosophy was to not use things like pre-emphasis or techniques
>>like that to stress the link but to tune the link for optimal
>>performance and reliability (best BER).  Once we established that the link,
>>after our standard tests (#1-3 in my list), was sufficient to pass
>>interoperability standards with good margin (>2 orders of magnitude of
>>BER), we elected not to mess around with emphasis or amplitude.  The
>>only thing that we had to tune was the RX termination in the ASIC to
>>best match the board trace impedance, but this tuning was a separate
>>In summary:
>>1. Tune the link for optimal eye under lab conditions with random data
>>2. Max out the cable length while testing all thermal and voltage
>>corners with random data patterns.
>>If you meet your BER requirement across several platforms, ship it!  If
>>not, you may have to tune the eye based on the failing corner cases.
>>Oh yeah- and if you are using pluggable optics then repeat the
>>qualification for all pluggable optic model numbers that you are
>>intending to ship with the unit (or put on your interoperability table)
>>as well.
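
The corner sweep and BER benchmark described in the quoted message can be sketched roughly as follows. This is only an illustration under stated assumptions: errors are independent, "2 voltages across hot/cold corners" means two rails each at a low and high limit at two temperatures, and the rail names, the 95% confidence level, and the 2.125 GBd line rate are placeholders of mine rather than anything from the original test plan.

```python
import itertools
import math


def bits_for_zero_error_test(target_ber: float, confidence: float = 0.95) -> float:
    """Bits that must pass error-free to claim BER < target_ber at the given confidence.

    Assumes independent errors, so P(zero errors in n bits) = (1 - ber)**n,
    which is approximately exp(-n * ber) for small ber.
    """
    return -math.log(1.0 - confidence) / target_ber


def corner_cases(rails=("rail_a", "rail_b"), levels=("low", "high"), temps=("cold", "hot")):
    """Yield every temperature/voltage corner as a dict. Rail names are placeholders."""
    for temp in temps:
        for combo in itertools.product(levels, repeat=len(rails)):
            yield {"temp": temp, **dict(zip(rails, combo))}


if __name__ == "__main__":
    line_rate = 2.125e9  # assumed 2GFC line rate in baud
    dwell_s = bits_for_zero_error_test(1e-12) / line_rate
    for corner in corner_cases():
        print(corner, f"run error-free for at least {dwell_s / 60:.1f} min")
```

Two rails at two levels and two temperatures give the eight cases mentioned above. Demonstrating two further orders of magnitude of BER margin would need roughly a hundred times as many error-free bits per corner.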

Alan Hilton-Nickel
Signal Integrity Engineer
Sun Microsystems Inc.
Netra Systems and Networking
Newark, CA
