Re: Performance issue on query doing smart scan

From: Jonathan Lewis <jlewisoracle@xxxxxxxxx>
To: Lok P <loknath.73@xxxxxxxxx>
Date: Sat, 23 Jan 2021 13:22:48 +0000

Looking at the SQL Monitor there are 4 key details to consider.

a) Almost all the time is spent in the table scan (181 partitions,
apparently, going by the Execs column): 500 seconds of smart scan, 400
seconds of CPU. That's a lot of CPU, and it would be nice to know whether it
was used in applying predicates or in decompressing storage units.

b) You got 95% cell offload for the tablescan (but, strangely, 99.67% in
the summary). Either number suggests that predicates pushed to storage may
have been quite effective. The contradiction in the two percentages (and
the 450GB vs. 6TB for read bytes) makes me wonder whether you've been
caught in the "double-decompression" trap some of the time, meaning the
decompressed data to be sent back to the server is larger than the 1MB
limit, so the CU is decompressed at the cell, then sent compressed anyway
and decompressed again at the database. The session stats would give you a
clue about that.

c) (Estim) rows on the table is 18 billion, (Actual) rows is 5 million,
which suggests the Bloom filter was quite effective.

d) (Most important of the 4) You join 34 rows to 5M rows and produce 415
rows (Actual column).

That last observation means that if the inputs and data sizes for this
query are typical, then a nested loop join using a perfect index into the
partitioned table might have to decompress 415 (query high) compression
units to find the 415 relevant rows, which would be a tiny amount of I/O and
CPU compared to the current load. It's not quite that nice, though, because
with the best local index it looks as if you'd have to probe all 181
partitions of the index for each of the 34 driving rows (and you might
decide that you don't want to create the index).
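
To put rough numbers on that trade-off, here is a minimal Python sketch
using only the figures quoted from the SQL Monitor report. It assumes no
partition pruning on the local index (so each driving row probes every
index partition) and the worst case of one compression unit per result row.
The variable names are illustrative only.

```python
# Figures quoted from the SQL Monitor report in this thread.
driving_rows = 34          # rows in the driving row source
partitions = 181           # partitions scanned (Execs column)
result_rows = 415          # rows produced by the join
scanned_rows = 5_000_000   # rows returned by the smart scan

# Assumption: a local index with no partition pruning, so every
# driving row has to probe every index partition.
index_probes = driving_rows * partitions

# Assumption: worst case, each result row lives in a different
# compression unit, so one CU is decompressed per result row.
max_cus_decompressed = result_rows

# How many scanned rows the current plan handles per row it keeps.
scanned_per_result = scanned_rows / result_rows

print(f"index probes: {index_probes}")
print(f"CUs decompressed (worst case): {max_cus_decompressed}")
print(f"scanned rows per result row: {scanned_per_result:.0f}")
```

A few thousand index probes is still small next to a scan that returns
5M rows to keep 415, but it is noticeably more work than the 34 probes a
global index or a pruned local index would need.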

Regards
Jonathan Lewis


On Fri, 22 Jan 2021 at 18:53, Lok P <loknath.73@xxxxxxxxx> wrote:

One correction: it's actually scanning ~180 partitions, of which 150
are compressed for "query high". Could compression be the cause of the
smart scan being served from flash cache?

Attached is the same query and its SQL Monitor report. It shows that TAB1
is doing cell smart scans. The table is range partitioned daily on column
part_dt and holds ~56 billion rows. I want to know whether there is any way
to make the query finish in the same time without doing a smart scan.

Here table TAB1 is daily range partitioned on column PART_DT.
Plan_line_id 21 is where it spends most of the time, and it is reading a
lot of data in that step. A Bloom filter is applied on columns AANUM and
DID of table TAB1 in that step. NUM_DISTINCT for column AANUM is 173
million and NUM_DISTINCT for column DID is 333K. There are two different
local composite indexes, one with AANUM as the leading column and one with
DID as the leading column, but no index has both AANUM and DID as
composite keys. Given the size of the table and its exposure to heavy DML,
I am not sure whether it would be a good idea to create a new index on
(AANUM, DID), or whether that would really be beneficial.

Regards
Lok

On Fri, Jan 22, 2021 at 11:45 PM Lok P <loknath.73@xxxxxxxxx> wrote:

Hi All, I found in many articles (two samples are below) that a smart
scan won't use the flash cache by default. But in our case, we are
seeing a query doing a smart scan that uses the flash cache heavily, pushing
flash IOPS to ~200K with a response time of around 15-20ms. Is there a
bug we are hitting? Is some other setting driving this, or is my
understanding wrong here?

In our case it's an Exadata X5 machine with image version 19.2, a half
rack with ~40TB of flash cache, hosting a single database on version
11.2.0.4. One of the queries does a cell smart scan on a big partitioned
table (TAB1, daily range partitioned on column PART_DT) and scans through
all ~500+ partitions (as per the business requirement). While it runs,
flash disk utilization goes up to ~80%, reaching ~200K IOPS with a response
time of ~15-20ms, while hard disk utilization and IOPS stay below ~40%. The
flash disks show a high rate of large reads during that interval, so index
reads/small reads are getting impacted. I verified that the FLASH_CACHE and
DEFAULT_FLASH_CACHE values in dba_tab_partitions are both DEFAULT for this
object. I want to understand why it's happening that way.

https://www.informit.com/articles/article.aspx?p=2418151&seqNum=3

http://guyharrison.squarespace.com/blog/2013/12/30/can-the-exadata-smart-flash-cache-slow-smart-scans.html

Regards

Lok
