[THIN] Re: OT: VMWare ESX 3.x Internal / DMZ networks on same physical server

Hi Steve,

Yes, but I guess I'd better explain.

I'd like to define latency as the length of time it takes between the i/o
request and actually getting data, or better yet the number of cpu cycles a
thread has to sit idle before it gets the data it's requested. If you look
at the latency of the various levels of memory, eg registers, L1-L2-L3
cache, physical memory, local disk and then SAN disk you get something like
this (adapted from Emery Berger):

Memory level       Latency (CPU cycles)
Registers          1
L1 cache           2
L2 cache           7
RAM                100
Local disk         40,000,000

Note that the disk latency here is an average value. It roughly accounts for
things like a SCSI disk's i/o optimization (eg elevator seeks), which improves
disk throughput but means you may not get the data you want right
away. About the only time we'd see anything like interface speeds is when we
were hitting data already in the on-board cache or doing a large sequential
read. If I've got multiple disks on a SCSI bus (excluding SAS for the
moment) then there's the bus overhead and device contention for the bus. But
that's okay because it takes time for disks to read and write data so
sharing the SCSI bus isn't that onerous.
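
To put the cycle counts in the table above into wall-clock terms, here is a minimal sketch. The 3 GHz clock rate is my assumption, picked purely for illustration; the cycle counts are the ones from the table, and the helper names are made up for this example.

```python
# Latency figures from the table above, in CPU cycles.
LATENCY_CYCLES = {
    "Registers": 1,
    "L1 cache": 2,
    "L2 cache": 7,
    "RAM": 100,
    "Local disk": 40_000_000,
}

# ASSUMPTION: a 3 GHz clock, chosen only for illustration (not from the table).
CLOCK_HZ = 3_000_000_000


def cycles_to_seconds(cycles, clock_hz=CLOCK_HZ):
    """Convert a latency in CPU cycles to seconds at the given clock rate."""
    return cycles / clock_hz


def format_time(seconds):
    """Render a duration with a readable unit (ns, us or ms)."""
    if seconds < 1e-6:
        return f"{seconds * 1e9:.2f} ns"
    if seconds < 1e-3:
        return f"{seconds * 1e6:.2f} us"
    return f"{seconds * 1e3:.2f} ms"


if __name__ == "__main__":
    ram = LATENCY_CYCLES["RAM"]
    for level, cycles in LATENCY_CYCLES.items():
        t = format_time(cycles_to_seconds(cycles))
        # Last column: how many RAM accesses fit into one access at this level.
        print(f"{level:<12}{cycles:>12,} cycles  {t:>10}  {cycles / ram:>12,.2f} x RAM")
```

At the assumed clock rate, a 40,000,000 cycle disk access works out to roughly 13 ms, or 400,000 RAM accesses.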

And now we get to SAN disks. If I've got a single server with an HBA
connected directly to the SAN, then a SAN disk doesn't have a lot more
latency than a local disk, say about 50 million cycles because of the SAN
controller overheads. I'm assuming that the SAN is more than a glorified
RAID controller and has multiple internal busses (SCSI or SAS). Because
of the generally much larger cache and the fact that you're spreading your
i/o over a lot of disks, the sustainable throughput is a lot better
than local disks. So anything that needs large chunks of data, eg
SQL, is going to perform much better using a SAN and of course SAN volumes
can be so much bigger.

But now things start getting a bit complicated. If I want to attach a lot of
systems to my SAN, I've either got to spend a fortune on putting more
interfaces/controllers on my SAN, or I end up using a fibre channel switch
so I can share the SAN interface with multiple HBAs. But that means I'm
multiplexing a lot of 4 Gb (or 2 Gb) HBAs into a smaller number of 4 Gb SAN
interfaces. So my latency goes up a lot because I haven't got my own private
link into the SAN anymore and my sustainable throughput goes down.
In a typical multi-server setup we get a new latency figure:

SAN disk      200,000,000+ cycles
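
The multiplexing of many HBAs into a few SAN interfaces can be put into rough numbers. This is only a back-of-the-envelope sketch: it covers worst-case bandwidth sharing when every server is busy at once, and it does not model the queuing that drives the latency figure. The 16 servers and 2 SAN ports in the example are hypothetical, as are the function names; the 4 Gb link speed is the one mentioned above.

```python
def oversubscription(num_hbas, num_san_ports, hba_gbps=4.0, port_gbps=4.0):
    """Ratio of total HBA bandwidth to total SAN interface bandwidth."""
    return (num_hbas * hba_gbps) / (num_san_ports * port_gbps)


def fair_share_gbps(num_hbas, num_san_ports, hba_gbps=4.0, port_gbps=4.0):
    """Per-server bandwidth if all HBAs are busy at once and share evenly.

    A server can never get more than its own HBA link speed.
    """
    total_san_gbps = num_san_ports * port_gbps
    return min(hba_gbps, total_san_gbps / num_hbas)


if __name__ == "__main__":
    # HYPOTHETICAL setup: 16 servers behind a switch, 2 SAN interfaces.
    servers, san_ports = 16, 2
    print(f"oversubscription: {oversubscription(servers, san_ports):.1f}:1")
    print(f"worst-case share: {fair_share_gbps(servers, san_ports):.2f} Gb per server")
```

With those made-up numbers each server's "4 Gb" link shrinks to 0.5 Gb when everyone is busy, which is the throughput half of the problem.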

If I'm reading/writing small chunks of data (eg paging, using temporary
files, spooling etc) then this much latency is going to hurt my overall
performance, badly.
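
As a rough illustration of why small i/o hurts, the sketch below adds up the time a thread would spend waiting if it issued a run of small requests one after another, each paying the full latency. The 3 GHz clock and the 1,000 request count are my assumptions for the example; the cycle figures are the ones quoted above, with the shared SAN figure taken as a lower bound.

```python
# Latency per request in CPU cycles, as quoted in the text above.
STORAGE_LATENCY_CYCLES = {
    "Local disk": 40_000_000,
    "SAN, direct-attached": 50_000_000,
    "SAN, shared via switch": 200_000_000,  # quoted as "200,000,000+": lower bound
}

# ASSUMPTION: a 3 GHz clock, chosen only for illustration.
CLOCK_HZ = 3_000_000_000


def stall_seconds(num_ios, latency_cycles, clock_hz=CLOCK_HZ):
    """Total wait for num_ios requests issued strictly one after another.

    Ignores caching, read-ahead and overlapping requests, so this is a
    pessimistic estimate for small serialized i/o only.
    """
    return num_ios * latency_cycles / clock_hz


if __name__ == "__main__":
    num_ios = 1_000  # HYPOTHETICAL workload: 1,000 small serialized requests
    for name, cycles in STORAGE_LATENCY_CYCLES.items():
        print(f"{name:<24}{stall_seconds(num_ios, cycles):8.2f} s")
```

Under these assumptions the shared SAN case waits five times as long as the local disk for the same work.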

That's my argument.

If we think about VMs running on a "shared" SAN disk then this is part of
the reason that a VM will never be able to outperform a "real" server.

However, if SAN manufacturers start getting smart and use multiple SAS
interfaces for server connections (one external interface per server) then
it's a whole new ballgame. Of course that means much cheaper "HBAs" because
a lot of newer servers have onboard SAS (as do SANs, albeit internally) so
the whole fibre channel "fabric" industry would disappear. SAN disks would
then be able to outperform local disks in both latency and sustainable
throughput and the extra cost just won't matter.

regards,

Rick

----
Ulrich Mack
Commander Australia

On 2/24/07, Steve Greenberg <steveg@xxxxxxxxxxxxxx> wrote:

 So are you saying that SAN boot disks suffer from latency issues? I am
trying to get a clear answer here: does your OS/application performance
suffer as a result of pure SAN booting?



Steve Greenberg
Thin Client Computing
34522 N. Scottsdale Rd D8453
Scottsdale, AZ 85262
(602) 432-8649
www.thinclient.net
steveg@xxxxxxxxxxxxxx
