Re: Testing data usage patterns

  • From: Don Seiler <don@xxxxxxxxx>
  • To: Job Miller <jobmiller@xxxxxxxxx>, ORACLE-L <oracle-l@xxxxxxxxxxxxx>
  • Date: Fri, 6 Dec 2013 11:55:46 -0600

Total size of the DB is around 23 TB. The default buffer cache is around
500 GB, plus a 50 GB 16k buffer cache (we have one tablespace that uses a
16k block size).
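
For reference, the way that's carved up: the non-default block size gets its
own pool via db_16k_cache_size. A minimal sketch of the configuration (the
tablespace and datafile names here are made up):

    -- default (8k) cache plus a separate pool for the 16k tablespace
    ALTER SYSTEM SET db_cache_size = 500G SCOPE=SPFILE;
    ALTER SYSTEM SET db_16k_cache_size = 50G SCOPE=SPFILE;

    -- a tablespace using the non-default block size
    CREATE TABLESPACE ts16k DATAFILE '+DATA' SIZE 100G BLOCKSIZE 16K;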

We have 1 TB of DRAM for the ARC and around 1.8 TB of flash for the L2ARC.
I'll take a look at an AWR report to see the logical vs. physical aspect.

Based on a testing script written by one of our architects, I/O times on the
production standby are in the 2-3ms range. Running that same test on the
ZFS test instance with the same data would start at 5ms with a cold cache,
then drop to 2ms and then 0.5ms when fetching the same data again. However,
we're not completely sold that our app fetches the same data over and over,
hence this email.
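
To give a flavor of the test (a simplified sketch, not the architect's
actual script; the table and predicate are hypothetical):

    SET TIMING ON
    -- cold pass: flush the cache on the test instance to force physical reads
    ALTER SYSTEM FLUSH BUFFER_CACHE;
    SELECT /*+ FULL(t) */ COUNT(*) FROM orders t WHERE order_date > SYSDATE - 7;
    -- warm passes: re-run immediately; the same blocks should come from cache
    SELECT /*+ FULL(t) */ COUNT(*) FROM orders t WHERE order_date > SYSDATE - 7;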

Adding to our concern is that right now our production primary has one
diskgroup with a preferred read mirror comprised of a RAMSAN, on which we
place our must-be-read-fast tablespaces. With the ZFSSA we don't seem to
have that kind of control. Issues with IBM's takeover of RAMSAN and its
handling of our support case (refusing replacement parts) are what sent us
looking for a new solution in the first place.
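
For the curious, the ASM-side knob I mean is the preferred read setting, set
on the ASM instance roughly like so (the diskgroup and failgroup names here
are ours/hypothetical):

    -- prefer the RAMSAN failure group for reads from the DATA diskgroup
    ALTER SYSTEM SET asm_preferred_read_failure_groups = 'DATA.RAMSAN'
      SCOPE=BOTH;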

Don.


On Thu, Dec 5, 2013 at 4:34 PM, Job Miller <jobmiller@xxxxxxxxx> wrote:

> What are your read I/O service times now?
>
> What is the size of the database, the size of the buffer cache, and the
> potential size of the ARC/L2ARC caches backed by flash?
>
> You'll have access to HCC (Hybrid Columnar Compression), which could
> significantly improve response times for any queries that do full table
> scan operations on tables that are now 1/10th the size they used to be.
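>
> To illustrate (a syntax sketch only; the table name is hypothetical), HCC
> is just a compression clause on the table or partition:
>
>     -- rebuild an existing table with warehouse-grade HCC
>     ALTER TABLE order_history MOVE COMPRESS FOR QUERY HIGH;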
>
> I'm not an expert on the ZFSSA, but with its hybrid storage pools it's
> hard to imagine a situation where the cache is such a tiny part of the size
> of your overall database, and the I/O so random, that a large cache
> behind a ZFSSA won't significantly benefit some of the current I/O.
>
> However, if your current I/O times are all <1ms, there isn't much room
> for improvement. That's why I was asking what your current service times
> are and, along with that, what kind of IOPS you have now.
>
> Any "unique" lookups are probably already served out of the buffer
> cache. A single buffer holds lots of rows, so even if you aren't
> looking up the same row, if you have 16GB of data in the buffer cache,
> how often will the data not already be cached?
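>
> As back-of-the-envelope arithmetic: 16GB of 8k buffers is roughly 2 million
> blocks (16GB / 8KB), and at, say, 100 rows per block that's on the order of
> 200 million rows resident in cache at any given moment.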
>
> What percentage of your I/O now is physical vs. logical?
>
> If 98% is logical, you are talking about caching to improve 2% of overall
> reads.
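>
> A quick way to eyeball that ratio outside of an AWR report (a sketch;
> needs DBA privileges):
>
>     -- physical vs. logical reads since instance startup
>     SELECT name, value
>       FROM v$sysstat
>      WHERE name IN ('session logical reads', 'physical reads');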
>
> An AWR report has all that data, which I'm sure you're familiar with.
>
> Have you used RAT (Real Application Testing) before?
>
> You can estimate, though:
>
> If your Top 5 in AWR shows that 50%+ of DB Time is scattered/sequential
> reads (physical I/O), and those I/O service times are 15ms, you can bet
> that a bigger buffer cache AND a bigger storage system cache will make a
> significant difference in end-user performance.
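>
> You can also sanity-check those service times outside of AWR with
> something like this sketch:
>
>     -- average wait (ms) for the single-block and multiblock read events
>     SELECT event,
>            time_waited_micro / NULLIF(total_waits, 0) / 1000 AS avg_ms
>       FROM v$system_event
>      WHERE event IN ('db file sequential read', 'db file scattered read');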
>
> If you completely eliminate I/O waits, how much DB Time does that save
> you? What's the upside? If I/O waits are only 2% of DB Time, then even if
> all the data is cached in flash, response times aren't going to materially
> change.
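>
> A sketch of that comparison, pulling User I/O waits and DB time (note the
> two views report in different units):
>
>     -- User I/O wait time vs. total DB time, both converted to seconds
>     SELECT (SELECT time_waited / 100 FROM v$system_wait_class
>              WHERE wait_class = 'User I/O') AS io_wait_secs,
>            (SELECT value / 1e6 FROM v$sys_time_model
>              WHERE stat_name = 'DB time') AS db_time_secs
>       FROM dual;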
>
> Job
>
>
>   ------------------------------
> From: Don Seiler <don@xxxxxxxxx>
> To: ORACLE-L <oracle-l@xxxxxxxxxxxxx>
> Sent: Thursday, December 5, 2013 4:58 PM
> Subject: Testing data usage patterns
>
> Good afternoon. Oracle 11.2.0.3, currently on ASM but looking to migrate
> to the Oracle ZFSSA. The big selling point of the ZFSSA was the ARC and
> L2ARC caches backed by DRAM and flash. However, we want to know whether
> our application's needs will be met. We suspect that a fair amount of our
> I/O is for unique lookups that aren't repeated very often, and so the data
> likely would not be in the ZFSSA cache on subsequent executions.
>
> However, I'd like to be able to determine this for sure. I'm wondering if
> any list members have suggestions on how to determine how frequently data
> is read and re-read, so we can set realistic expectations for what
> performance will be like under our application's usage.
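>
> One crude approach I've been sketching out myself: compare logical and
> physical reads per segment, since segments whose physical reads track
> their logical reads are the ones a cache isn't helping (the schema name
> here is hypothetical):
>
>     SELECT owner, object_name, statistic_name, value
>       FROM v$segment_statistics
>      WHERE statistic_name IN ('logical reads', 'physical reads')
>        AND owner = 'APP_OWNER'
>      ORDER BY value DESC;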
>
> We're looking at using RAT to capture a typical workload and then play it
> back on a "sandbox standby" that we have running on the ZFSSA now, but we
> have to wait for a downtime window to do some re-linking. I'm wondering if
> there are any alternatives that we can also look at.
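>
> For reference, the capture side of that plan is basically this (the
> capture name and directory object are placeholders):
>
>     BEGIN
>       DBMS_WORKLOAD_CAPTURE.START_CAPTURE(
>         name     => 'typical_workload',
>         dir      => 'CAPTURE_DIR',   -- pre-created directory object
>         duration => 3600);           -- seconds to capture
>     END;
>     /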
>
> --
> Don Seiler
> http://www.seiler.us
>
>
>


-- 
Don Seiler
http://www.seiler.us
