Re: hyper visor CPU vs OS CPU

From: Mladen Gogala <gogala.mladen@xxxxxxxxx>
To: oracle-l@xxxxxxxxxxxxx
Date: Mon, 1 Jul 2019 13:43:47 -0400

Hi Kyle,

If those two numbers differ, it is obvious that one of them is wrong. In other words, either the hypervisor or the guest OS has a bug. That much we know. Now, the rest is guessing. The guest machine, assuming it's a Linux box, uses tools like top, sar or nmon to report CPU usage. All of those tools are old and well tested, so the probability that the bug is in the guest OS is small. That leaves the possibility of the hypervisor incorrectly reporting the load. AWS uses a home-brewed hypervisor, called C5 and loosely based on KVM. There were some issues with Intel Skylake CPUs. If I am allowed to guess, C5 is at fault here. Of course, to reiterate, this is my assumption only.

There is also a well-known problem with Linux performance reporting tools, which report a CPU waiting for memory access as "working". There are two things I would suggest:

* Try all possible Linux tools: sar, top, nmon and dstat and see
   whether they all report the same thing.  If they do not report the
   same thing, see if any of them is in agreement with the hypervisor
   report.
* If not, debug the hypervisor.
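
As a cross-check on the tools above, here is a minimal sketch that computes the guest's view of CPU usage directly from /proc/stat, which is where top, sar and friends generally get their numbers. It assumes a Linux guest; the function names (parse_cpu_line, cpu_percentages, sample) are my own and not part of any tool. It also breaks out "steal" time, which is the guest-side hint that the hypervisor is withholding CPU.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
# guest and guest_nice are skipped: the kernel already counts them in user/nice.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            # Very old kernels omit trailing fields; treat them as zero.
            values += [0] * (len(FIELDS) - len(values))
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Convert two counter snapshots into per-category percentages."""
    deltas = {k: after[k] - before[k] for k in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("counters did not advance between samples")
    return {k: 100.0 * d / total for k, d in deltas.items()}


def sample(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, 'interval' seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return cpu_percentages(before, after)
```

Calling sample() on the guest returns a dict of percentages; 100 minus idle minus iowait is roughly the "busy" figure the tools print. If this agrees with sar and top but not with the hypervisor, the disagreement is not coming from the reporting tools.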

Regards

On 7/1/19 12:14 PM, kyle Hailey wrote:

Anyone know what it means when the hypervisor is reporting significantly more CPU for a virtual machine than the actual virtual machine thinks it's consuming?
For the other case, where the virtual machine OS reports higher CPU than the hypervisor, I always figured that it was because the virtual machine wasn't actually getting the CPU it thought it was, and this could be seen with % CPU ready.
For the other way around, I'm wondering what is going on.

Thanks
Kyle

References:
- hyper visor CPU vs OS CPU
  - From: kyle Hailey
