RE: Solid state disks for Oracle?

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: <kevinc@xxxxxxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Sat, 11 Mar 2006 12:41:50 -0500

Huh? Comments in line.

------------------------------------
Rightsizing, Inc.
Mark W. Farnham
President
mwf@xxxxxxxx
36 West Street
Lebanon, NH 03766-1239
tel: (603) 448-1803
------------------------------------


-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx]On Behalf Of Kevin Closson
Sent: Saturday, March 11, 2006 11:32 AM
To: oracle-l@xxxxxxxxxxxxx
Subject: RE: Solid state disks for Oracle?


 >>>
>>>Still, if you have direct attach SSD you avoid the wires and
>>>a bunch of protocol and network overhead.

>>>>SSDs are all FCP. The problem with that is if you want to have, say,

   Well, NO. I've personally tested PCI bus SSD boards with 16 GB of memory,
   two onboard disk drives, and enough battery to dump all the memory to each
   disk drive. Boards relying on battery-only persistence through a power
   outage were also available at a much cheaper price. Platypus was one
   company that produced them; I'm not sure who owns what was formerly
   Platypus now. None of this is theoretical. TEMP can be put on
   non-persistent storage without compromising the Oracle recovery model.
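
   For a concrete (if hypothetical) sketch of that last point: assuming the
   board shows up to the OS as an ordinary disk device (the device name,
   mount point, and sizes below are made up), moving TEMP onto it looks
   roughly like this, and losing the tempfile at a power outage costs only
   a tempfile rebuild, never media recovery:

# hypothetical device name for the PCI SSD board; adjust for your OS
mkfs -t ext2 /dev/ssd0
mkdir -p /oracle/ssdtemp
mount /dev/ssd0 /oracle/ssdtemp

# create a temp tablespace on the SSD and make it the database default
sqlplus / as sysdba <<EOF
CREATE TEMPORARY TABLESPACE temp_ssd
  TEMPFILE '/oracle/ssdtemp/temp_ssd01.dbf' SIZE 4096M;
ALTER DATABASE DEFAULT TEMPORARY TABLESPACE temp_ssd;
EOF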

20 or so servers getting redo action on the SSD, you have to carve
20 LUNs (find me an SSD that can present its internal storage as
20 LUNs). Then you have to get the switch zoning right and then
work out the front-side cabling.  How much simpler could the
SAN gateway model be than that? You present the SSD as a single
high-performance LUN to the SAN gateway, put a NAS filesystem
on it, export 20 directories, mount them on the respective nodes,
set filesystemio_options=directIO in each init.ora, relocate the
redo logs to the NFS mount and now you have <1ms redo IO for
as much redo as each of the 20 servers can generate. Unless you
can find me a system that is pushing more than, say, ~105MB/s of
redo...then you'd have to configure a teamed/bonded NIC for that
system.

   No argument from me that your protocol is a good way to share SSD. I
   believe that is what I said in the message you are replying to.
   The question is whether you have a lot of distinct servers across which
   to share the SSD. The direct attach units are significantly less
   expensive, and come in persistent and non-persistent models. They are
   unlikely to be part of the market segment you serve, so it doesn't
   surprise me too much that you've never heard of them.
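
   For anyone who wants to try the gateway approach Kevin describes, a rough
   sketch of what one of the twenty nodes ends up looking like follows. The
   host name, export path, group numbers, and sizes are made up, and the
   mount options are only the commonly recommended Oracle-over-NFS settings,
   so check the notes for your platform and NAS gateway:

# mount the SSD-backed export from the NAS/SAN gateway (names are hypothetical)
mkdir -p /u02/ssdredo
mount -o rw,hard,nointr,tcp,vers=3,rsize=32768,wsize=32768 \
    nasgw:/ssd/redo_node01 /u02/ssdredo

# init.ora / spfile parameter on that instance (as quoted above):
#   filesystemio_options=directIO

# add redo log groups on the NFS mount; switch logfiles and drop the old
# groups once they go INACTIVE
sqlplus / as sysdba <<EOF
ALTER DATABASE ADD LOGFILE GROUP 11 ('/u02/ssdredo/redo11a.log') SIZE 512M;
ALTER DATABASE ADD LOGFILE GROUP 12 ('/u02/ssdredo/redo12a.log') SIZE 512M;
EOF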

>>>power source of the SSD is not dependent on the machine, so
>>>you get the memory to the onboard disk drives in a server
>>>power outage, and if you have duplex onboard disks that is
>>>as good as raid 1.

Power? SSDs like the now defunct Imperial and others like DSI
are so internally power redundant it is ridiculous. Like I said
early on in this thread, I'm not talking from a theoretical
standpoint. I have had an 8-port MG-5000 SSD since 2001 (not cheap,
not small).

   As I have said, there exist SSD boards that are *NOT* persistent across
   a server power cycle. They are much cheaper per gig and they serve their
   purpose. But you should not put anything on them that is part of the
   Oracle recovery stream unless it is a throwaway, totally refreshable
   system. Since their performance matches the persistent models, they are
   a way to cut the cost of test systems configured to perform identically
   to production (but not configured for reliable recovery). If you're on
   a UPS, server crashes that you can reboot without a power cycle still
   don't require recovery or a refresh, but in my opinion that is not good
   enough for the primary transaction stream.

>>>SSD. You preserve the utility of the SSD as long as network
>>>latency and bandwidth is sufficient for the load.

>>>>>>GigE is quite sufficient for transaction logging. You can

    Pick a large enough number of servers and that is not true. That IS
    theoretical. But what I wrote was that the utility is preserved as long
    as the network latency and bandwidth are sufficient for the load. GigE
    likely does handle all reasonable cases, so that is an excellent choice.
    But don't expect SSD on a network to help you at all if you share it
    across a crappy network, or even across a good network connection with
    little headroom. You have to buy and spec a network attachment that is
    NOT a bottleneck compared to the load, or you will diminish the utility
    of the SSD. You're saying that such a network attachment exists for
    almost all cases and every case you've seen. I don't disagree, but
    whoever configures the system has to get a sufficient network attachment.
    Again, as long as network latency and bandwidth are sufficient for the
    load presented, the utility of the SSD is preserved.
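
    (Back-of-the-envelope arithmetic on that ~105MB/s figure: GigE is
    1000 Mbit/s, or 125 MB/s before any protocol overhead; after TCP/IP and
    NFS overhead, something like 100-110 MB/s sustained is a realistic
    ceiling, which is roughly where the need for a teamed/bonded NIC above
    that rate comes from.)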

see some test results here:
http://www.polyserve.com/pdf/Oracle_NFS_wp.pdf

>>>
>>>Which is best for a given server farm will vary. Even if you
>>>use direct attach SSD, you still have to verify that the SSD
>>>is certified for the protocol it is emulating to be treated

Emulation? These devices are FCP. And who would "certify" such
a thing?

   The certification on the PCI bus consisted of a given manufacturer
   issuing all the commands in the regression suite used to verify a new
   model of disk drive for use in their machines. The whole point of those
   models of SSD was that you could use them as if they were disk drives
   (with extraordinarily short seek times and high bandwidth). Different
   manufacturers certified them for different buses, and you had to buy a
   model of the boards that was certified for a particular manufacturer
   and bus.
--
//www.freelists.org/webpage/oracle-l


