Re: hyper visor CPU vs OS CPU

  • From: Mladen Gogala <gogala.mladen@xxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Mon, 1 Jul 2019 13:43:47 -0400

Hi Kyle,

If those two numbers differ, it is obvious that one of them is wrong. In other words, either the hypervisor or the guest OS has a bug. That much we know. Now, the rest is guessing. The guest machine, assuming it's a Linux box, uses tools like top, sar or nmon to report CPU usage. All of those tools are rather old and well tested, so the probability that the bug is in the guest OS is rather minimal. That leaves the possibility of the hypervisor incorrectly reporting the load. AWS uses a home-brewed hypervisor, Nitro, introduced with the C5 instances and loosely based on KVM. There were some issues with Intel Skylake CPUs. If I am allowed to guess, the hypervisor is at fault here. Of course, to reiterate, this is my assumption only.

There is also a well-known bug in Linux performance reporting tools, which report a CPU waiting for memory access as "working". There are two things I would suggest:

 * Try all possible Linux tools: sar, top, nmon and dstat and see
   whether they all report the same thing.  If they do not report the
   same thing, see if any of them is in agreement with the hypervisor
   report.
 * If not, debug the hypervisor.
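
To cross-check the tools themselves, one can go to the raw counters they read. What follows is only a minimal sketch, assuming a Linux guest with the usual /proc/stat layout; the function names (parse_cpu_line, cpu_percentages, measure) are mine, not part of any tool. It computes busy, iowait and steal percentages from two samples of the aggregate "cpu" line. A non-zero steal figure is the guest-side view of time the hypervisor did not give to the VM.

```python
import time

# Column order of the aggregate "cpu" line in /proc/stat (see proc(5)).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu ...' line into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Older kernels expose fewer columns; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_percentages(prev, cur):
    """Percent busy, iowait and steal between two samples."""
    delta = {k: cur[k] - prev[k] for k in FIELDS}
    # guest and guest_nice are already counted inside user and nice,
    # so only the first eight columns go into the total.
    total = sum(delta[k] for k in FIELDS[:8])
    if total <= 0:
        return {"busy": 0.0, "iowait": 0.0, "steal": 0.0}
    busy = total - delta["idle"] - delta["iowait"] - delta["steal"]
    return {
        "busy": 100.0 * busy / total,
        "iowait": 100.0 * delta["iowait"] / total,
        "steal": 100.0 * delta["steal"] / total,
    }


def read_sample(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the all-CPU aggregate."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


def measure(interval=5.0):
    """Sample twice, 'interval' seconds apart, and return the percentages."""
    prev = read_sample()
    time.sleep(interval)
    return cpu_percentages(prev, read_sample())
```

Running measure() over the same interval as sar or top, and comparing the "busy" figure with what those tools and the hypervisor report, should show whether the disagreement is in the tools or below them.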

Regards


On 7/1/19 12:14 PM, kyle Hailey wrote:


Anyone know what it means when the hypervisor is reporting significantly more CPU for a virtual machine than the actual virtual machine thinks it's consuming?
For the other case, where the virtual machine OS reports CPU higher than the hypervisor does, I always figured that it was because the virtual machine wasn't actually getting the CPU it thought it was, and this could be seen with % CPU ready.
For the other way around, I'm wondering what is going on.

Thanks
Kyle
