Re: RAC node "has a disk HB, but no network HB" but traceroute

From: Gus Spier <gus.spier@xxxxxxxxx>
To: dmarc-noreply@xxxxxxxxxxxxx
Date: Thu, 5 Jan 2017 19:49:47 -0500

http://berxblog.blogspot.com/

Is this the Martin Berger blog you referred to?

Very coincidental that it popped up in my google feed after reading your
trials and tribulations.

Regards,
Gus

On Thu, Jan 5, 2017 at 5:08 PM, Yong Huang <dmarc-noreply@xxxxxxxxxxxxx>
wrote:

Thanks, Justin, Jure and Martin. Martin's article is great. Interpreting
"no network HB" as "there are 2 or more processes which missed to
communicate" instead of a network problem is the key. That's exactly what I
meant in the SR I opened by saying "We begin to doubt about the meaning of
the "no network HB" message". So far the SR hasn't gone anywhere, even after
I uploaded various types of logs.

Our log does show a fast increase in IP packets that need reassembly, and
nearly all of the new reassemblies failed:
$ egrep '^zzz|reassembl' <OSWatcher netstat log>
...
zzz Sun Dec 18 02:01:58 CST 2016
555539624 reassemblies required
100653307 packets reassembled ok
60026 packet reassembles failed
zzz Sun Dec 18 02:02:28 CST 2016
555545702 reassemblies required
100653307 packets reassembled ok
66103 packet reassembles failed
zzz Sun Dec 18 02:02:58 CST 2016
555551748 reassemblies required
100653307 packets reassembled ok
72149 packet reassembles failed
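
To make the trend easier to see, the per-interval deltas can be computed from
the same filtered output. This is only a minimal sketch: the function name
reassembly_deltas is made up for illustration, and it assumes the input has
exactly the line format shown above (a "zzz" timestamp line followed by the
counter lines).

```shell
# Hypothetical helper: reads the egrep output shown above on stdin and prints,
# for each snapshot after the first, how much the "required" and "failed"
# counters grew since the previous snapshot.
reassembly_deltas() {
  awk '
    /^zzz/ { ts = $5 }                     # time of day from the zzz line
    /reassemblies required/ { req = $1 }
    /packet reassembles failed/ {
      fail = $1
      if (seen)
        printf "%s required +%d failed +%d\n", ts, req - prev_req, fail - prev_fail
      prev_req = req; prev_fail = fail; seen = 1
    }
  '
}

# Sample data copied from the log excerpt above.
reassembly_deltas <<'EOF'
zzz Sun Dec 18 02:01:58 CST 2016
555539624 reassemblies required
100653307 packets reassembled ok
60026 packet reassembles failed
zzz Sun Dec 18 02:02:28 CST 2016
555545702 reassemblies required
100653307 packets reassembled ok
66103 packet reassembles failed
zzz Sun Dec 18 02:02:58 CST 2016
555551748 reassemblies required
100653307 packets reassembled ok
72149 packet reassembles failed
EOF
```

On the three snapshots above this prints "02:02:28 required +6078 failed +6077"
and "02:02:58 required +6046 failed +6046", while "packets reassembled ok"
does not move at all.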

Of all the documents I found, Red Hat "IP fragmentation fails and
fragmented packets get dropped" at
https://access.redhat.com/solutions/1498603
is a good one, but you have to log in to read it. In short, if I understand
the confusing Root Cause section correctly, kernel-2.6.32-477.el6 or
RHEL6.6 has a bug that incorrectly calculates IP fragmentation memory,
which causes false evictions (i.e. drops) of IP fragments on systems with
many CPUs. (Our problem server has 80 CPUs. Other servers have far fewer.)
Upgrading the kernel or the Red Hat release is the solution. An easy
workaround is to increase the fragmentation buffer size. The article says
doubling the fragmentation thresholds is enough, i.e. from the default 4M
to 8M. We'll set the IP fragmentation buffer low and high values to 15 and
16 MB per Oracle note 2008933.1. I think the counter "fragments dropped
after timeout" in `netstat -s` is related to /proc/sys/net/ipv4/ipfrag_time.
Ours seems to have been fairly stable even before the crash, so I'll leave
that parameter alone for now.
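
As a rough sketch of that workaround, the 15 and 16 MB values translate into
bytes as below. The sysctl keys net.ipv4.ipfrag_low_thresh and
net.ipv4.ipfrag_high_thresh are the standard Linux names for these thresholds;
the variable names are mine, and applying the settings (left commented out)
needs root and should be checked against the Oracle note first.

```shell
# Convert the 15 MB / 16 MB targets into the byte values the kernel expects.
mb=$((1024 * 1024))
ipfrag_low=$((15 * mb))
ipfrag_high=$((16 * mb))

# Print the lines in the form used by /etc/sysctl.conf.
printf 'net.ipv4.ipfrag_low_thresh = %d\n' "$ipfrag_low"
printf 'net.ipv4.ipfrag_high_thresh = %d\n' "$ipfrag_high"

# To apply at runtime as root (raise the high threshold first, since the low
# threshold is expected to stay below it):
#   sysctl -w net.ipv4.ipfrag_high_thresh="$ipfrag_high"
#   sysctl -w net.ipv4.ipfrag_low_thresh="$ipfrag_low"
```

Adding the two printed lines to /etc/sysctl.conf would be one way to keep the
setting across reboots.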

Now I think I know why our OSWatcher did not report a traceroute problem
at the last crash: the default packet size used by traceroute is only 60
bytes, so its probes are never fragmented and never need reassembly. To
detect the problem, we should append a packet length parameter to
the traceroute command with a value greater than 1500, the Ethernet MTU.
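
A probe along those lines might look like the sketch below. The peer name
node2-priv and the 2000-byte length are placeholders I picked for illustration;
Linux traceroute takes the packet length as an optional argument after the
host. The -F (don't fragment) option is deliberately left out, because with
it a probe larger than the MTU would not be fragmented at all.

```shell
# Hypothetical private interconnect name of the other RAC node.
peer="node2-priv"
# Any total length above the 1500-byte Ethernet MTU forces fragmentation.
packet_len=2000

# -r bypasses the routing table and sends straight to a host on an attached
# network, which suits a private interconnect check.
probe_cmd="traceroute -r $peer $packet_len"

# Print the command instead of running it, so this is safe to try anywhere.
echo "$probe_cmd"
```

Run for real on the affected node, a probe like this should time out or show
loss while fragments are being dropped, even though the default 60-byte probe
looks healthy.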

Yong Huang
