[SI-LIST] timing analysis

Mort,
Great posting!  Well articulated.  I've been thinking about it all 
weekend.  Over the years, I've had the chance to be involved in many 
similar discussions at Cray Research, DEC, and Supercomputer Systems. They 
go straight to the heart of what we do as engineers.

I did timing for the 60x bus a few years ago.  Our philosophy was very 
similar to yours, but we didn't get much push-back on the margin.  One of 
the first things I would ask is whether or not the 100 MHz bus is a 
performance bottleneck.  You could go through a lot of work to make the 
bus run faster only to find out that it didn't buy you anything in system 
performance.

Let's list your statistically independent variables and the underlying 
process variables they represent:

1. Driver delay
    a. driver chip threshold voltage (gate oxide thickness, doping, 
dielectric constant)
    b. driver chip effective gate length

2. Receiver setup time
    a. receiver chip threshold voltage (gate oxide thickness, doping, 
dielectric constant)
    b. receiver chip effective gate length

3. Receiver thresholds - track with setup time but don't vary much

4. Transmission line impedance & delay
    a. line width
    b. dielectric thickness
    c. line thickness (to some degree)
    d. dielectric constant

5. Clock chip pin-to-pin skew - mostly on-chip and package routing, not 
process variation

So you have three main sets of statistically independent variables: driver 
chip, receiver chip, and transmission line.  I think you would be 
justified in taking a root-sum-square of the delay deviations because they 
are independent, but only the deviations, not the actual delay numbers 
themselves.  I'm not sure this would buy you 15 MHz, though, and it could 
get squirrely if you have a reflection close to a threshold.  You are 
correct to question this method and to point back to the physics.  As you 
mentioned, you have not yet accounted for crosstalk - or load capacitance 
on the clock input pins.
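
To make the distinction concrete, here is a minimal Python sketch.  All of 
the numbers are made up for illustration (they are not from any data 
sheet), and the function and variable names are hypothetical.  The only 
point is that the nominal delays still add linearly; just the independent 
deviations go under the square root.

```python
import math

PERIOD_NS = 10.0  # 100 MHz bus

# Hypothetical nominal delays and worst-case deviations from nominal, in ns.
# One entry per statistically independent set: driver chip, receiver chip,
# transmission line.
nominal_ns = {"driver": 4.0, "receiver_setup": 2.0, "transmission_line": 1.5}
deviation_ns = {"driver": 1.5, "receiver_setup": 1.0, "transmission_line": 0.5}


def margin_straight_sum(period_ns, nominal, deviation):
    """Worst case: every delay sits at nominal plus its full deviation."""
    return period_ns - sum(nominal.values()) - sum(deviation.values())


def margin_rss(period_ns, nominal, deviation):
    """Nominals add linearly; only the deviations are root-sum-squared."""
    rss = math.sqrt(sum(d ** 2 for d in deviation.values()))
    return period_ns - sum(nominal.values()) - rss


worst_margin = margin_straight_sum(PERIOD_NS, nominal_ns, deviation_ns)
rss_margin = margin_rss(PERIOD_NS, nominal_ns, deviation_ns)
print(f"straight-sum margin: {worst_margin:.3f} ns")
print(f"RSS-of-deviations margin: {rss_margin:.3f} ns")
```

Taking the root-sum-square of the full delays instead of the deviations 
would shrink the nominal part of the budget too, which has no statistical 
justification.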

I find Vih and Vil to be a large source of conservatism on the part of the 
chip suppliers.  Threshold windows, as defined by receiver unity gain 
points, are typically much smaller than specifications would lead you to 
believe.  And how about dc and mid-frequency supply voltage variations? Do 
you really need the 5% or 10% silicon designers typically use for on-chip 
static timing analysis?

If you're really motivated to solve this, you could ask your silicon 
vendors for a history of their process monitors.  (IBM is probably one of 
them.)  If they are mature processes (which they probably are), they might 
have them dialed in.  Of course, they always have the right to ship you 
corner silicon without telling you according to most contracts.

Ultimately, it's a statement of risk vs. performance.  You don't have the 
machinery in place to do a complete Monte Carlo analysis using the process 
variables I outlined above.  I find it's best to put all the cards on the 
table and make the decision as a team.  Then if something blows up, you're 
all in it together!  No finger-pointing.
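
Even a toy Monte Carlo can help frame the risk side of that discussion.  
The Python sketch below is nowhere near a complete analysis over the 
process variables listed above.  It assumes (an assumption for 
illustration, not something a data sheet will tell you) that each of the 
three delay deviations is normally distributed with the worst-case value 
sitting at 3 sigma, and all of the numbers and names are made up.

```python
import random

PERIOD_NS = 10.0          # 100 MHz bus
NOMINAL_TOTAL_NS = 7.5    # hypothetical sum of nominal delays

# Hypothetical worst-case deviations from nominal, in ns, one per
# statistically independent set of variables.
WORST_DEVIATION_NS = {
    "driver": 1.5,
    "receiver_setup": 1.0,
    "transmission_line": 0.5,
}


def estimate_failure_fraction(trials=100_000, seed=1):
    """Fraction of random trials whose total delay exceeds the clock period."""
    rng = random.Random(seed)
    # Assumption: the worst-case deviation corresponds to 3 sigma.
    sigmas = [d / 3.0 for d in WORST_DEVIATION_NS.values()]
    failures = 0
    for _ in range(trials):
        total_ns = NOMINAL_TOTAL_NS + sum(rng.gauss(0.0, s) for s in sigmas)
        if total_ns > PERIOD_NS:
            failures += 1
    return failures / trials


fraction = estimate_failure_fraction()
print(f"estimated fraction with negative setup margin: {fraction:.6f}")
```

The answer is only as good as the assumed distributions, which is why the 
process monitor history matters.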

Greg Edlund
Senior Engineer
Signal Integrity and System Timing
IBM Systems & Technology Group
3605 Hwy. 52 N  Bldg 050-2
Rochester, MN 55901



Hello all,

I have a question about what other people do for timing analysis
methodology.

My example is a simple synchronous 100 MHz PowerPC 60X bus.  The bus has
three devices on it (CPU and two FPGAs).  Hundreds of boards have been
built and they all run great, even at temperature extremes.  Now, after
the fact (I'm also fixing our process so in the future it won't be
"after the fact"), I am doing a worst-case timing analysis on this bus,
and it shows negative setup timing margin at 100 MHz.  This has
rekindled a debate at our company.  One camp says we should play it safe
and not run the bus faster than our worst-case timing analysis says.
The other camp says that if we aren't "aggressive" in running faster
than our worst-case timing analysis says, then our products won't be
competitive performance-wise.  I'm wondering what other people are doing
when faced with this situation.

As background, let me describe my worst-case timing analysis methodology
for determining setup timing margin.  I take the max clock-to-out time
spec from the driver data sheet, and the max input setup time spec from
the receiver data sheet.  I get the interconnect delay from a HyperLynx
simulation using slow-weak IBIS.  I am properly subtracting out the
delay from running a separate reference load simulation, which HyperLynx
refers to as "flight-time compensation".  I make sure that, in my
receiver IBIS file, vinl and vinh match the worst-case data sheet
values.  I make sure that, in my driver IBIS file, the reference load
matches the data sheet reference load.  For my PCB stackup, I adjust the
dielectric constant to be 10% greater than nominal, and I adjust the
characteristic impedance to be 10% lower than nominal (we are using
controlled impedance), since I expect this should be worst-case from a
setup time perspective.  I use the max output skew and jitter specs from
the clock driver data sheet.  I also simulate the clock traces in
HyperLynx to get the delays, in order to get the skew there.  Whatever
clock driver IBIS setting (fast-strong, slow-weak, typical) I use to
simulate one clock trace, I also use that to simulate the other clock
trace, since both clock traces are driven by the same clock
driver...i.e. same PVT.  I take all the resulting delays (driver
clock-to-out, interconnect delay, receiver setup, clock skew, etc.) and
simply subtract them all from my timing path (which is 10 ns for 100
MHz).
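
The bookkeeping described above can be sketched in a few lines of Python.
The individual delay values are hypothetical, chosen only so that the
total lands near the 85 MHz worst-case limit discussed in this thread;
the names are illustrative as well.

```python
# Hypothetical worst-case delays in ns (not actual data sheet or
# simulation values).
worst_case_delays_ns = {
    "driver_clock_to_out_max": 5.0,
    "interconnect_flight_time": 2.5,  # after flight-time compensation
    "receiver_setup_max": 3.0,
    "clock_skew": 1.0,
    "clock_jitter": 0.25,
}


def setup_margin_ns(period_ns, delays_ns):
    """Straight subtraction of every worst-case delay from the clock period."""
    return period_ns - sum(delays_ns.values())


def max_frequency_mhz(delays_ns):
    """Highest clock frequency at which the setup margin is still zero."""
    return 1000.0 / sum(delays_ns.values())


margin = setup_margin_ns(10.0, worst_case_delays_ns)
f_max = max_frequency_mhz(worst_case_delays_ns)
print(f"setup margin at 100 MHz: {margin:.2f} ns")
print(f"worst-case maximum frequency: {f_max:.1f} MHz")
```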

As I mentioned, there is a camp in our company that says the methodology
I described above is too pessimistic.  As evidence, they point to the
fact that all our boards work at 100 MHz over temperature, even though
worst-case timing analysis says we have negative margin above 85 MHz.
They are asking me to adjust my methodology in order to show positive
"worst-case" margin at 100 MHz.

I can accept the fact that the boards pass test at 100 MHz even though
worst-case timing analysis says 85 MHz; this just means not everything
is worst-case on these particular boards.

What I don't really see, however, is how I can legitimately change the
numbers "on paper" to show other than 85 MHz as the worst-case limit.
This is what a certain group of people are asking me for, claiming that
we need to be more "aggressive" -- perhaps by taking an RMS
(root-mean-square) sum of delays or something of that nature -- instead
of subtracting from the 10 ns path a straight sum of delays like I am
doing.  Another idea is to reduce the delays by some percentage to be
less than worst-case.  (I counter with the argument that even though I
describe my timing analysis methodology as "worst-case", it actually
doesn't even consider the effects of noise -- such as from the power
distribution network or crosstalk.  So I could make my timing analysis
even more pessimistic by adding some noise margin to the data sheet
values for vinl and vinh.  And regarding an RMS sum of timing delays, I
think that's a fudge, not a valid method.)

Well anyway I'm wondering what alternative timing analysis methodologies
other people use, or how other people deal with this situation at their
companies.

Thanks very much if you have comments.

Mort

