[SI-LIST] Re: Comments on "Do you really ship products at BER 10e -xx ?"

From: Vinu Arumugham <vinu@xxxxxxxxx>
To: Chris.Cheng@xxxxxxxxxxxx
Date: Mon, 09 May 2005 10:05:44 -0700
Signal edge rate is one contributor to RJ: for a given amount of amplitude
noise, a faster edge converts less of it into timing jitter. Going from
2.5Gbps to 10Gbps, one can expect lower RJ due to the faster edge rates.
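
As a rough illustration of the edge-rate argument, a first-order model converts rms amplitude noise into rms timing jitter by dividing by the slew rate at the crossing point. The sketch below assumes a linear 20-80% edge; the noise level, swing, and rise times are hypothetical numbers picked only for illustration, not measurements from any part discussed in this thread.

```python
def jitter_from_noise(v_noise_rms, v_swing, t_rise_20_80):
    """Approximate rms timing jitter (seconds) caused by rms amplitude noise.

    Assumes a linear edge, so the slew rate is 60% of the swing
    divided by the 20-80% rise time.
    """
    slew_v_per_s = 0.6 * v_swing / t_rise_20_80
    return v_noise_rms / slew_v_per_s


# Hypothetical values: 1 mV rms noise on a 400 mV swing.
V_NOISE = 1e-3
V_SWING = 0.4

jitter_slow = jitter_from_noise(V_NOISE, V_SWING, 100e-12)  # slower edge, e.g. a 2.5Gbps part
jitter_fast = jitter_from_noise(V_NOISE, V_SWING, 30e-12)   # faster edge, e.g. a 10Gbps part

print(f"100 ps edge: {jitter_slow * 1e12:.3f} ps rms")
print(f" 30 ps edge: {jitter_fast * 1e12:.3f} ps rms")
```

This covers only the noise-to-time conversion term. Other Rj sources, such as PLL phase noise, do not necessarily shrink with edge rate.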

Thanks,
Vinu

Chris Cheng wrote:

>Charles,
>I don't have an answer for that. But you can do the math: a 2ps rms Rj
>translates to 2ps x 14.1 = 28.2ps peak-to-peak at a BER of 10e-12. Assume a
>fully utilized 10Gb/s or faster net with a high percentage of error-checked
>payload. That 28.2ps is a very high percentage of the cycle time, and an
>error rate of 10e-12 means detectable errors on the order of a few an hour.
>I don't think any company dares to tell that to its customers.
>At least not with the kind of IT shops I have to deal with.
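
The arithmetic above can be checked in a few lines. This is a minimal sketch, assuming purely Gaussian Rj, reading "10e-12" as a BER of 1e-12, and using the common per-tail convention BER = 0.5*erfc(Q/sqrt(2)) with transition density ignored. The function names are illustrative.

```python
from statistics import NormalDist


def rj_multiplier(ber):
    """Peak-to-peak multiplier 2*Q for Gaussian jitter at the given BER."""
    q = -NormalDist().inv_cdf(ber)  # one-sided Gaussian tail probability equals ber
    return 2.0 * q


def errors_per_hour(bit_rate_bps, ber):
    """Expected raw bit errors per hour on a continuously loaded link."""
    return bit_rate_bps * ber * 3600.0


rj_rms_ps = 2.0
bit_rate = 10e9
ber = 1e-12

ui_ps = 1e12 / bit_rate              # unit interval in picoseconds
mult = rj_multiplier(ber)
rj_pp_ps = rj_rms_ps * mult
hourly = errors_per_hour(bit_rate, ber)

print(f"multiplier       : {mult:.3f}")
print(f"Rj peak-to-peak  : {rj_pp_ps:.1f} ps ({100 * rj_pp_ps / ui_ps:.0f}% of a {ui_ps:.0f} ps UI)")
print(f"errors per hour  : {hourly:.0f}")
```

Under these assumptions the multiplier comes out near 14.07 (the 14.1 quoted above, rounded), the peak-to-peak Rj is roughly 28% of a 100ps unit interval, and a fully loaded 10Gb/s link sees about 36 raw bit errors per hour before utilization and checked-payload fraction are taken into account.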
>
>-----Original Message-----
>From: Grasso, Charles
>To: 'Chris Cheng '; 'si-list-bounce@xxxxxxxxxxxxx '; ''Mike Williams ' ';
>''si-list@xxxxxxxxxxxxx ' '
>Sent: 5/8/2005 8:18 AM
>Subject: RE: [SI-LIST] Re: Comments on "Do you really ship products at BER
>10e -xx ?"
>
> Hi Chris - Is 2ps of Rj realistic? I mean - have we reached the
>theoretical limits of our hardware, and do we have to jump to light
>transmission on a PWB?
>
>-----Original Message-----
>From: si-list-bounce@xxxxxxxxxxxxx
>To: 'Mike Williams '; 'si-list@xxxxxxxxxxxxx '
>Sent: 5/8/2005 12:14 AM
>Subject: [SI-LIST] Re: Comments on "Do you really ship products at BER
>10e -xx ?"
>
>Mike,
>Welcome to the list.
>It is good to hear from you and Art from the vendor point of view. I am
>particularly intrigued by Art's reference paper on how different instruments
>start to diverge under strong Dj, which I believe is the true challenge in
>characterizing a real running system rather than some PLL in a test fixture.
>I also agree that it is not the instrument vendors' fault; rather, it is a
>spec imposed by a certain committee that, unless applied with care and fully
>understood by engineers, can extrapolate to unrealistic pessimism. I have
>learned a lot from both the on-line and off-line discussion on this subject.
>However, one offline comment from a good friend still haunts me. At
>~2-4Gb/s, an Rj of 2ps is still manageable, but what will life be like if we
>go to 10Gb/s without PAM at 2ps Rj? I don't have the answer, but my suspicion
>is that half the industry will probably look the other way and come up with
>another spec that conveniently specs it to work. This is like tester
>guardband in the 90's. For a while everyone insisted that they had to triple
>or quadruple the tester guardband in timing equations, and somehow, after the
>turn of the millennium, when cycle times were getting smaller and smaller,
>that guardband just shrank to a negligible amount.
>But at a certain point (say 10Gb/s), that 2ps of Rj (if it is truly
>unbounded and Gaussian) that I conveniently ignore may come back and bite me.
>And I cannot envision myself shipping, nor would my management allow me to
>ship, a product that flags CRC errors at a 10e-12 BER. I am not talking about
>a BERT analyzer extrapolating a bathtub curve; I am talking about real CRC
>error flags every few days that the service engineer will see. But that's
>exactly what some of these standards allow. On the other hand, if we have to
>make management really happy, we are really designing for 10e-18 or 10e-20,
>as some responses to the thread mention. That's a really long Rj tail! Do we
>really have unbounded jitter that will grow to infinity given enough time?
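
To put numbers on how long a Gaussian tail gets at those BER targets, the sketch below tabulates the peak-to-peak multiplier and the mean time between bit errors. It assumes ideal unbounded Gaussian Rj, a continuously loaded 10Gb/s link, and the per-tail convention BER = 0.5*erfc(Q/sqrt(2)); all names are illustrative.

```python
import math
from statistics import NormalDist

BIT_RATE = 10e9                       # bits per second, assumed fully loaded
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def q_for_ber(ber):
    """Gaussian Q value whose one-sided tail probability equals ber."""
    return -NormalDist().inv_cdf(ber)


rows = []
for ber in (1e-12, 1e-15, 1e-18, 1e-20):
    q = q_for_ber(ber)
    tail = 0.5 * math.erfc(q / math.sqrt(2))   # sanity check: should reproduce ber
    mtbe_s = 1.0 / (BIT_RATE * ber)            # mean time between bit errors
    rows.append((ber, 2.0 * q, tail, mtbe_s))
    print(f"BER {ber:.0e}: Rj pk-pk = {2.0 * q:5.2f} x rms, "
          f"one error per {mtbe_s:.3g} s ({mtbe_s / SECONDS_PER_YEAR:.3g} years)")
```

Under this model the multiplier grows only from about 14.1 to about 18.5 between 1e-12 and 1e-20, while the mean time between errors stretches from 100 seconds to roughly 317 years. Whether real jitter stays Gaussian that far out is exactly the open question.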
>
>Good discussions, hopefully it can continue.
>Chris
>
>
>-----Original Message-----
>From: Mike Williams
>To: si-list@xxxxxxxxxxxxx
>Sent: 5/5/2005 9:54 PM
>Subject: [SI-LIST] Comments on "Do you really ship products at BER
>10e-xx ?"
>
>  
>[SI-LIST] Do you really ship products at BER 10e-xx ?
>
>*      From: Chris Cheng <Chris.Cheng@xxxxxxxxxxxx> 
>*      To: si-list@xxxxxxxxxxxxx 
>*      Date: Tue, 12 Apr 2005 13:49:09 -0700 
>
>I've been shipping Gb/s serial products for a while and have had my share of
>failed parts. However, I have yet to see a physical channel that is not
>either working like a charm or just falling on its face and barfing errors
>like crazy. Sure, chips or disks can fail and generate errors, but there are
>no flaky channels that spit out an error every other hour or every few days.
>To me, a channel either has a BER that is near 1 (barfing errors like crazy)
>or near 0 (never fails, or at least approaches the life of the product it is
>attached to). Are we just kidding ourselves with these fancy BER analyzers or
>jitter instruments? Do you really let a machine run at, say, BER 10e-12 and
>say "ah ha, it only fails once a day, so let's ship it"? Is BER really meant
>for IEEE spec committees and not for real engineers who actually have to
>ship a product?
>
>
> 
>
>Pasted from
><//www.freelists.org/archives/si-list/04-2005/msg00131.html> 
>
>Hi Chris,
>
> 
>
>I'm not a regular participant in this list. A colleague forwarded your post
>on to me as part of an ongoing discussion we have been holding about the
>legitimacy (or actually lack thereof) of the thought models in popular use
>around understanding jitter in serial data signals, and another brought the
>SI list itself to my attention recently. I wrote the following a couple of
>weeks ago but absent-mindedly forgot to post it. Let's see if this still
>works.
>
> 
>
>The question you pose is a great one, and it gets to the core of a matter I
>am very close to... I am simultaneously a long-time harsh critic of
>statistical jitter decomposition (Rj/Dj/...) and the owner of a company that
>makes a widely used set of statistical jitter decomposition tools, and in
>that capacity I invest a large part of our R&D dollars and engineering
>bandwidth in expanding our understanding of this kind of analysis. While
>those don't sound like they go together, I find that to be effective in this
>situation, one has to accept that there are two contradictory yet essential
>philosophies in play.
>
> 
>
>Regarding your questions, I agree intensely with what is implied by them...
>that to a mind trained to reason as an engineer, the current approaches
>which engineers are being told to employ raise more questions than they
>answer. Statistical jitter decomposition (I'll use SJD from here to save
>typing) is one of many ways of abstracting the timing behavior of a signal.
>In the span of just a few years, it has moved from near-total obscurity
>among practicing engineers to being the pervasive and virtually unchallenged
>approach to how one must measure jitter in serial data links. SJD has
>quickly achieved the status of an entrenched orthodoxy despite the existence
>of a formidable body of reasons to question its validity even as a general
>approach.
>
> 
>
>That list of concerns is just too expansive to give any sort of detailed
>treatment in a casual exchange like this. I have my own little personal
>taxonomy I use just to keep all the issues organized:
>
> 
>
>Theoretical Issues/Concerns - which center around Gaussian/Central-Limit
>abuse, misuse/misunderstanding of how Nyquist applies in the modulation
>domain, the fact that the "standard buckets" in the Rj/Dj/... taxonomy don't
>always hold up even in common situations, and the appropriateness of how SJD
>is employed in abstracting important common pathologies with dynamics that
>just can't be seen or represented using that method.
>
> 
>
>Practical Issues/Concerns - The "mathematical machinery" employed to grind
>these results out can instill large and unpredictable effects in the final
>numbers, the suitability of the BER "thought model", HUGE correlation issues
>and significant and uncontrolled (at the spec level) implementation
>differences that exist between one solution and another, and the fact that
>it's a moving target... this "stuff" is being made up as the industry goes
>along.
>
> 
>
>Philosophical Concerns - A measurement problem being turned into a
>combination political-math problem, the unfortunate everyday necessity to
>balance concerns around methodological validity with the reality that
>"everything requires SJD" to get out the door, abstraction and lack of
>transparency (the "you don't know what you don't know" effect) among the
>implementers and spec-makers, validation of methods and the range of
>pitfalls built into that as well.
>
> 
>
> 
>
>Again, this is a long list and my intent is not to do a deep dive here but
>to frame up both agreement with your observation as well as explain one
>perspective on navigating the morass. Under the most commonly cited document
>that blesses these things, there are numerous platforms and algorithmic
>approaches listed, resulting in a cross-product of perhaps 30 different ways
>to do it, and that figure is further elevated by the fact that numerous
>implementers each have their own method that purports to derive from one of
>those blessed combinations. Under a broad range of applied jitter types,
>these approaches differ, often dramatically, from each other.
>Different answers. Different convergence/divergence behaviors... Differing
>abilities to even see common pathologies (e.g. "BER blooming")... Different
>repeatabilities and accuracies. So here comes the first of many clues that
>illustrate the sanity rating of the current SJD mindset in the industry.
>THEY ALL ARE "RIGHT". As un-engineering-like as it sounds, they're all
>"blessed", so go ahead and pick the one that gets your parts out the door.
>I'm not representing that as the kind of engineering I advocate, but "the
>engineering cops" couldn't write you a ticket for operating in that mode.
>
> 
>
>To get a little bit of the flavor of the problems built in to where SJD is
>today and where it seems to be headed, let me dive in deeper on just one of
>the points raised above... the implementation mechanics of SJD. Being in
>clocks and their measurement for 25 years, I have been a close observer of
>the SJD trend going back to when it started (I should actually say
>"restarted"... you can go back to at least the 50's in the active engineering
>literature, with sort of a renaissance having taken place in that literature
>in the late 70's and early 80's). As noted, we have spent a great deal of
>time and energy studying the mechanics of the various suggested
>Rj/Dj/Tj/Pj/... methods for several years now. All of the suggested methods
>"work" at the level of a quickie whiteboard lecture... you can see how at the
>big picture level the results look like they should be what is sought. At
>this high level, they make sense. As you dig in deeper, you find that there
>is a fascinating range of "gotchas"... a vast range of behaviors and effects
>that can have significant unanticipated impact on the final result. An entire
>layer of impact completely unaddressed by the spec-makers as well as the
>vast majority of solution providers. It's virtually not on the radar screen
>at all other than in a few very small pockets at some of the larger
>customer-side companies. Of all of the issues in the lists above, to my eye,
>this has proven by far to be the broadest. Some of the uncontrolled effects
>of SJD mechanics to which I am referring are:
>
> 
>
>1.     Complex/unanticipated interactions between the signal dynamics and
>       the algorithm's mechanics
>2.     Convergence/divergence effects
>3.     Misrepresentation due to "measure-predict cycling"
>4.     Encountering jitter behaviors not anticipated by your SJD
>       implementation
>5.     Embedded unanticipated behaviors, and the impact of implicit
>       assumptions
>
> 
>
>One example from a lecture I gave on this at one of our customers recently
>that seemed to illustrate it well for them was this. Consider that in many
>approaches, you are building up your model of overall system timing from the
>rarest events seen. For example, you might have many millions of events in a
>measurement population but the curve-fit process only applies maybe a
>thousand points to the actual result. The consequence is that even a small
>change entering in to the rare-event population (i.e. finally "seeing"
>something that fills in the tail a little better) can have a stark impact on
>the BER estimate and dynamics.
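
The sensitivity described above can be reproduced with a toy experiment. The sketch below is not any vendor's algorithm: it assumes a simple least-squares "Q-scale" fit to the 1000 largest samples of a synthetic Gaussian population (200,000 events rather than millions, to keep it quick), then refits after 20 slightly rarer events are added to the tail. The population size, sigma, and the injected value are all hypothetical.

```python
import random
from statistics import NormalDist


def tail_fit_edge(samples, n_tail=1000, ber=1e-12):
    """Fit x = mu + sigma*q to the n_tail largest samples by least squares
    and extrapolate the one-sided edge position to the target BER."""
    nd = NormalDist()
    xs = sorted(samples)
    n = len(xs)
    tail = xs[-n_tail:]
    # Gaussian quantile for the empirical tail probability of each tail point.
    qs = [-nd.inv_cdf((n_tail - k - 0.5) / n) for k in range(n_tail)]
    mq = sum(qs) / n_tail
    mx = sum(tail) / n_tail
    sigma = (sum((q - mq) * (x - mx) for q, x in zip(qs, tail))
             / sum((q - mq) ** 2 for q in qs))
    mu = mx - sigma * mq
    q_target = -nd.inv_cdf(ber)
    return mu + sigma * q_target, sigma


rng = random.Random(1)  # fixed seed so the run is repeatable
population = [rng.gauss(0.0, 2.0) for _ in range(200_000)]  # 2 ps rms Gaussian jitter
edge_base, sigma_base = tail_fit_edge(population)

# "Finally seeing" a few rarer events: 20 samples (0.01% of the population) at 11 ps.
perturbed = population + [11.0] * 20
edge_pert, sigma_pert = tail_fit_edge(perturbed)

print(f"baseline : fitted sigma = {sigma_base:.2f} ps, edge at 1e-12 = {edge_base:.1f} ps")
print(f"perturbed: fitted sigma = {sigma_pert:.2f} ps, edge at 1e-12 = {edge_pert:.1f} ps")
```

Because the fit sees only the tail points, the handful of added events carries a lot of leverage, and the extrapolated edge moves by far more than their share of the population would suggest.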
>
> 
>
> 
>
>The "issues and concerns" side is enormous, and while we only talked about
>the tips of the icebergs, you hopefully at least get the flavor. Let's shift
>gears now because there IS another important side to this. That is, some
>engineer, some place out there got to work this morning and a piece of paper
>told him he HAD to measure Rj/Dj/... on his part in order to get it out the
>door. His reality... many engineers' reality... is that SJD as it is now
>imagined is a very real part of life in the lab. For the foreseeable future,
>that's all that really matters in their orbit.
>
> 
>
>As a resource on timing and timing measurement for our clients and partners,
>we can and do try to refine the "why" behind our criticism of the
>appropriateness of SJD as an abstraction of the realistic kinds of jitter
>one can reasonably encounter. However, as a solution provider, we have to
>start with a different philosophy... that it's accepted by the industry, and
>that we have to provide a product that addresses as many of the gotchas as
>possible. Your question asks whether the products are meant for real
>engineers trying to ship a product, so here's one provider's view of that. I
>feel that IF you are going to go down the SJD path, it IS possible to
>provide a method that can deliver numbers that are accurate and repeatable
>UNDER THE Rj/Dj/... BELIEF SYSTEM... without pain... IF it is used properly.
>
> 
>
>Our own approach is as follows, and I'll be brief and try to remain at a
>general, product-nonspecific level. We have studied, modeled and analyzed the
>suggested approaches for several years as a primary focus of our everyday
>engineering work. In this work, we have worked with outside experts in
>rare-events prediction as well as one of the individuals credited with
>having developed the mathematics that underlie decomposition in general back
>in the 70's. I've actually known him since that time but only discovered
>that side of his career a few years ago. Small world. We have identified a
>significant range of effects/mechanics built in to the suggested methods
>that will to a certainty cause problems. We have used that knowledge to
>craft an independent approach that steers clear of the known issues (e.g.
>accuracy, convergence, repeatability and stability effects).
>
> 
>
>In concert with that, we built up a synthesizer that can create all of the
>various kinds of stationary jitter one can possibly expect to encounter...
>the universe of stationary jitter under the standard SJD thought model. This
>synthesis system originally broke that universe down into 15,000 regions but
>the current model, which pushes the edges of that universe out a bit
>further, breaks it down into just shy of 10 million regions. We use the
>synthesizer to push all 10 million flavors of jitter through an unmodified
>version of the decomposition method we fashioned. Looking at the results that
>emerge from that, you would definitely see that there are places where the
>results differ quite significantly from what was synthesized, and you would
>also note that they are extremely consistent and repeatable. I attribute
>this repeatability primarily to avoiding the algorithmic pitfalls referred
>to above.
>
> 
>
>The next step is that we then submit both the expected and actual results to
>a neural network-based system of our own design that attempts to calibrate
>the difference between expected and actual to as small a value as we can
>make it over the vast majority of the error surface. In reality, we don't
>just rely on the neural approach because after staring at the underlying
>mechanics for so many years, we have some engineers that are pretty good at
>recognizing ways to improve the error that exceed what the neural approach
>can do on its own. In the end it's iterative... let the network create a
>calibration scheme... study and tweak it (i.e. tweak how the network
>operates) and then grind away some more. The process is constantly revealing
>new insights. Most recently, we've found some dependencies on jitter
>behaviors that can be further improved by moving from static to dynamic
>calibration. That is... not just creating a huge cal scheme that is built
>into the product when it ships, but one which also does some dynamic
>calibration as it runs. I would say that the improvements attributable to
>that will be less felt in the sort of repetitive short patterns that seem
>common now among our customers, and will make an observable difference
>especially on live data, with a special impact on data that has moderate to
>significant ISI. It all counts.
>
> 
>
>The calibration addresses something really obvious, but which is not even
>considered at the spec level. It's unnecessary... any of the blessed methods
>are fine under the spec. What it means for the "real engineer trying to get
>parts out the door" is that if the measured number is too high, a result
>from a calibrated and fully validated system is significantly more likely to
>indicate a part really is doing something undesirable rather than an
>over-representation by the SJD.
>
> 
>
>There are other things that can be set in the SJD mechanics/process, but
>for which there is no rational reason to set them up one way versus another.
>For example, some engineers want to see when BER blooming occurs. In even
>very expensive and stable sources, BER blooming does occur and it's more
>common than you might think in even well-fixtured devices, though many SJD
>schemes can't see that kind of dynamism. It's useful, I suppose, to an
>engineer in debug mode, but we also see engineers who want it "blended out",
>which is more rational than it may seem on the surface since the way the
>rare-events math reacts to the blooming (sharply at times, as mentioned
>above) can be distracting. So since the specs go nowhere near this level of
>consideration yet it impacts the usability of the tool, you have to give the
>choice to the operator and pick a default state that is guaranteed to tick
>off half of the people who use it. Lots of design decisions fall into this
>category of having to make choices that the specs should have made for us.
>
> 
>
>So... are the tools for the spec-makers or are they really for engineers that
>have to ship a product? My opinion is that the spec-makers have too much
>influence over what ultimately appears in the tools, and not nearly enough
>skin in the game in terms of having to use what they specify. There are more
>marketing titles and commercially driven interests on these committees than
>you'd probably want to see, and so you get what you get. But keep in mind,
>in the end, your company and your industry sector decided to accept whatever
>specs you are forced to treat as the gospel. That makes it a tool for those
>engineers, I guess. But not the only tool either. When they don't hit their
>numbers, they will need additional tools too. SJD is in the same class as
>dice when it comes to debug utility (note to self... file for patents on
>Rj/Dj dice). Good debug tools (I contend that nothing can match the
>modulation domain for doing effective timing debug) will also help you
>understand the reason your parts are not hitting their number... is it the
>tool or the device? And that's actually where using a validated approach can
>save a bunch of time as well.
>
> 
>
>The industry is presently in a very strange place... a place that is hard to
>defend. We accept at face value things that provide almost boundless reasons
>to pursue other choices, but revert to believing what we're told... it's in
>the spec so it must be true. The result of this is the significant yet
>unnecessary daily frustrations that follow from the bad choice. Yet... it is
>incredibly uncommon to see the bad choices actually identified and
>challenged as the root of why the frustrations in this space are mounting.
>A mindset that we consider crucial to being able to look at statistical
>jitter decomposition objectively is a refusal to accept as unquestionable
>that Rj/Dj separation is correct... and a recognition that the analysis of
>timing in serial data streams is a leadership problem... a philosophy
>problem... an epistemological problem... and most definitely a measurement
>problem. But it's not a math problem. If someone's solution involves
>invoking yet one more thing they learned in stochastic processes, they're
>part of the problem, not part of the solution, in my considered opinion.
>What we have is broken and better ways are needed. The solution seems
>obvious, but also, the "they don't know what they don't know" force is
>strong. I'm sure better methods will materialize quite quickly as soon as
>enough people are willing to openly articulate what the emperor is wearing.
>
> 
>
>If you're interested, I've written a few papers about these matters... "A
>Discussion of Rj/Dj/.. Compliance Measurements", and another on the work
>we've done over the last year w.r.t. calibrating the error surface for each
>of the various jitter terms of our own Rj/Dj/... tool set to within 2ps or 5%
>over the universe of stationary jitter. The second one will be revised in a
>few weeks to include the expanded jitter space and dynamic calibration but
>the original still covers the core of what we have done in this area pretty
>well. Just send an email to "info AT amherst-systems DOT com" and ask for
>the "Rj/Dj paper" and/or the "validation paper" and someone will get them
>right out to you. I'm not big on email myself.
>
> 
>
>Chris... I didn't have time to write something short... I hope that after all
>these words, I have provided some further insight into the situation. Good
>luck.
>
> 
>
>Best regards,
>
>Mike 
>
> 
>
>--
>
>Mike Williams
>Pres.
>ASA Corp.
>www.TheJitterSolution.com <http://www.thejittersolution.com/> 
>
>
>
>
>
>------------------------------------------------------------------
>To unsubscribe from si-list:
>si-list-request@xxxxxxxxxxxxx with 'unsubscribe' in the Subject field
>
>or to administer your membership from a web page, go to:
>//www.freelists.org/webpage/si-list
>
>For help:
>si-list-request@xxxxxxxxxxxxx with 'help' in the Subject field
>
>List FAQ wiki page is located at:
>                http://si-list.org/wiki/wiki.pl?Si-List_FAQ
>
>List technical documents are available at:
>                http://www.si-list.org
>
>List archives are viewable at:     
>               //www.freelists.org/archives/si-list
>or at our remote archives:
>               http://groups.yahoo.com/group/si-list/messages
>Old (prior to June 6, 2001) list archives are viewable at:
>               http://www.qsl.net/wb6tpu
>  
