Re: RAC interconnect packet size??

Hi Mark (and Joe!),

While in "wait and see" mode, you might want to track the "fragments dropped
after timeout" and "packet reassembles failed" counters from "netstat -s" (a
partial listing from a Linux box is below). Assuming you restarted the
servers after implementing Jumbo Frames, these two counters should be a very
small percentage of the total number of packets received. (Unfortunately,
netstat doesn't take periodic snapshots of its counters the way a perf tool
would, unless you code that yourself with a shell script):

$ netstat -s
Ip:
   1515397615 total packets received
   0 forwarded
   21 with unknown protocol
   0 incoming packets discarded
   1515384318 incoming packets delivered
   1960954465 requests sent out
   8 fragments dropped after timeout
   26185 reassemblies required
   13057 packets reassembled ok
   8 packet reassembles failed
   13116 fragments received ok
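The shell-scripting idea mentioned above could be sketched roughly like
this. The function name and the matched phrases are my own choices, assuming
the Linux netstat wording shown in the listing; it just turns the two
counters into a percentage of total packets:

```shell
# frag_pct: hedged sketch -- compute the percentage of IP packets lost to
# fragment timeouts and reassembly failures, reading "netstat -s" text on
# stdin. The matched phrases assume the Linux netstat output shown above.
frag_pct() {
    awk '
        /total packets received/          { total   = $1 }
        /fragments dropped after timeout/ { dropped = $1 }
        /packet reassembles failed/       { failed  = $1 }
        END {
            if (total > 0)
                printf "%d of %d packets (%.6f%%)\n",
                       dropped + failed, total,
                       100 * (dropped + failed) / total
        }'
}

# usage (run periodically from cron or a sleep loop to get snapshots):
#   netstat -s | frag_pct
```

On the numbers in the listing above, 8 + 8 out of ~1.5 billion packets is a
vanishingly small fraction, which is what you'd hope to see.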

And of course, you should also track %sys time if you are collecting/storing
sar stats.
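If sar data is at hand, one rough way to pull %system per interval is below.
The column handling is an assumption based on sysstat's "sar -u" layout
(...%system %iowait %steal %idle as the last four columns), so adjust to
taste:

```shell
# sys_time: hedged sketch -- print timestamp and %system from "sar -u"
# output on stdin. Counting from the right ($(NF-3)) sidesteps the AM/PM
# column that some locales add; header and Average lines are skipped.
sys_time() {
    awk '$NF ~ /^[0-9.]+$/ && $0 !~ /Average|CPU/ { print $1, $(NF-3) }'
}

# usage:
#   sar -u | sys_time
```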

> I hear what you're saying, but, because the LMS processes were by far the
> biggest CPU hogs, I was thinking that the overhead of breaking down and
> reassembling packets was the primary cause of CPU starvation.
>
> As I said, we're currently in "wait and see" mode, hoping that we've seen
> the last of these events.  Obviously, if I see more CPU starvation, I'll
> have to re-think the root cause.  But, as I mentioned before, enabling jumbo
> frames is the "right" thing to do, and there's really no downside, so....
>

Also, keep in mind that interconnect traffic will consist of both data
blocks (larger ones that would have required fragmentation and reassembly,
depending on MTU size) and smaller (~200-byte) messages. You should be able
to see this in the AWR stats:

Global Cache Load Profile
~~~~~~~~~~~~~~~~~~~~~~~~~                  Per Second       Per Transaction
                                     ---------------       ---------------
 Global Cache blocks received:                259.93                  2.78
   Global Cache blocks served:              1,084.36                 11.58
    GCS/GES messages received:              8,040.38                 85.88
        GCS/GES messages sent:              3,771.97                 40.29
           DBWR Fusion writes:                  6.40                  0.07
Estd Interconnect traffic (KB)             13,061.40
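As a sanity check, the "Estd Interconnect traffic" line above can be
reproduced arithmetically from the other rows, assuming an 8 KB
db_block_size and the ~200 bytes per GCS/GES message mentioned earlier (both
are assumptions here, so check your instance's actual block size):

```shell
# Hedged cross-check of the Estd Interconnect traffic figure, assuming
# 8192-byte blocks and 200-byte GCS/GES messages (my assumptions):
awk 'BEGIN {
    blocks = 259.93 + 1084.36      # GC blocks received + served, per sec
    msgs   = 8040.38 + 3771.97     # GCS/GES messages received + sent, per sec
    kb     = (blocks * 8192 + msgs * 200) / 1024
    printf "Estd interconnect traffic: %.2f KB/s\n", kb
}'
```

That works out to roughly 13,061 KB/s, which matches the reported value, so
the ~200-byte message estimate looks reasonable for this workload.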

You should also track the "Global Cache and Enqueue Services - Workload
Characteristics" and "Global Cache and Enqueue Services - Messaging
Statistics" sections in AWR. If you have AWR data from before the change,
that *may* show you whether you improved and by how much....

Would appreciate your posting any stats and observations you find...

-- 
John Kanagaraj <><
http://www.linkedin.com/in/johnkanagaraj
http://jkanagaraj.wordpress.com (Sorry - not an Oracle blog!)
** The opinions and facts contained in this message are entirely mine and do
not reflect those of my employer or customers **
