Re: Unique index access path seems very slow

  • From: Mladen Gogala <gogala.mladen@xxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Mon, 23 Jan 2023 16:51:23 -0500

On 1/23/23 13:46, Pap wrote:

This database is Oracle 19c on Exadata. Below are the query and the SQL monitor reports for both the full-scan path and the index-access path, along with the non-zero stats captured from gv$sesstat over ~10 minutes of execution for each run. The index-access path has processed only ~82K rows in 30 minutes and still has not finished. We want this query to finish in ~2-3 minutes.

SELECT TRAN_TAB.COL1_SHA_512, TAB_ENCRYPT.TAB_ENCRYPT_COL2.....
  FROM (SELECT RAWTOHEX(STANDARD_HASH(TO_CHAR(TRIM(TRAN_TAB.LN4)), 'SHA512')) AS COL1_SHA_512
          FROM TRAN_TAB
         WHERE TRAN_TAB.PD_CD(+) = 'XXX'
           AND TRAN_TAB.PART_DT = TO_DATE(:B1, 'DD-MON-YYYY HH24:MI:SS')
           AND TRAN_TAB.DT_CR BETWEEN TO_DATE(:B2, 'DD-MON-YYYY HH24:MI:SS')
                                  AND TO_DATE(:B3, 'DD-MON-YYYY HH24:MI:SS')
       ) TRAN_TAB,
       TAB_ENCRYPT
 WHERE TRAN_TAB.COL1_SHA_512 = TAB_ENCRYPT.COL1_SHA_512(+)

****** "Full scan" path sql monitor + nonzero results from gv$sesstat from initial ~10minutes run*******

https://gist.github.com/oracle9999/b37160097fa8e3929fb66af61ebbf9ed

***** "Index access" path sql monitor + nonzero results from gv$sesstat from initial ~10minutes run*******

https://gist.github.com/oracle9999/9792a70f748b71c365397c5458d3e9e8

The requirement is as follows...

We have two tables, TRAN_TAB and TAB_ENCRYPT. TAB_ENCRYPT is an insert-only table that holds the encrypted value (COL1_SHA_512) of a confidential column/attribute. Each row of the transaction table TRAN_TAB has to be checked/joined against all the encrypted column values in TAB_ENCRYPT using the above query, and the results, along with additional columns from TAB_ENCRYPT, have to be sent to a downstream system. This query is supposed to run multiple times a day so that results reach the downstream system as close to real time as possible, based on the records in TRAN_TAB. Currently the plan is to run it once every ~5 minutes. TAB_ENCRYPT holds ~3 billion rows, and TRAN_TAB will have approximately ~400 million rows per day/part_dt to be joined against the encrypted column value COL1_SHA_512.

As the SQL monitor shows, the original query was taking ~30 minutes to join ~3 million rows from the transaction table, and the majority of the resources seemed to be spent full-scanning TAB_ENCRYPT. So we tried creating a unique index on column "COL1_SHA_512", thinking it would be fast, but it appears to degrade the performance of this query even further. We want to understand why it's happening this way. Is it because the column holds ~128-byte values, so the index becomes too big to fit in cache? And then how can we fix this query?
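To see how much of the index actually stays in the buffer cache, we can run something like the below (an untested sketch against gv$bh; our index name, summed across all RAC instances):

SELECT COUNT(*) AS cached_blocks
  FROM gv$bh
 WHERE objd = (SELECT data_object_id
                 FROM dba_objects
                WHERE object_name = 'TAB_ENCRYPT_UK'
                  AND object_type = 'INDEX')
   AND status <> 'free';
-- Compare against the ~83M leaf blocks below: only a small fraction can
-- ever be cached, so most index probes become physical single-block reads.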

The column "COL1_SHA_512" , holds the SHA512 encrypted value of another column. The sample encrypted values look something like below. The size of the index on this column is ~700GB. And the statistics of the index is as below. It contains all unique values though but i think because of its 128byte length the size of the index becomes this huge. The size of the table tran_tab is ~880GB.

"99983FB4F3BEAA516BE40B85706338B0D7E5D14D1B2A3C945D5F74947A3E8DA8E714D1761D008715CD46B6C6CDE8B99690F2AE04D97D9KHYU67GBF4589HNFRSDR"

Column 'COL1_SHA_512' statistics:

NUM_DISTINCT  NULLABLE  DENSITY                     NUM_NULLS  AVG_COL_LEN
2987284438    N         0.000000000334752187397831  0          129

Unique index (on column COL1_SHA_512) statistics:

INDEX_NAME      BLEVEL  PCT_FREE  LEAF_BLOCKS  CLUSTERING_FACTOR  NUM_ROWS
TAB_ENCRYPT_UK  3       10        83389146     3008764650         3008764650
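As a sanity check on the size, LEAF_BLOCKS times the block size lines up, assuming the default 8KB block size (an assumption; ours may differ):

SELECT ROUND(83389146 * 8192 / POWER(1024, 3)) AS approx_leaf_gb
  FROM dual;
-- ~636 GB of leaf blocks alone; with branch blocks and segment overhead
-- that is consistent with the ~700GB observed.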

Regards
Pap



Pap, your index plan is attempting to read 20 million rows from the outer table and join the result to the inner table using nested loops. Yes, the Bloom filter will eliminate some rows from the outer table, but there will still be a lot to read. There will be 80295 read requests and 627MB to read from the table itself, plus 152K reads totalling 1GB from the index. Depending on the model of your Exadata, cell single block read can be relatively slow. You should take a look at your AWR report. If your average cell single block read time is in milliseconds, that would explain your problem. Exadata was made to speed up full table scans; it doesn't do much for single-block reads.
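Something like this quick sketch will show it (instance-lifetime averages from gv$system_event; dba_hist_system_event gives the same per AWR snapshot):

SELECT inst_id,
       event,
       total_waits,
       ROUND(time_waited_micro / NULLIF(total_waits, 0) / 1000, 2) AS avg_wait_ms
  FROM gv$system_event
 WHERE event = 'cell single block physical read';
-- Anything consistently in the multi-millisecond range would explain
-- the slow index access path.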

However, 80295 rows out of 20M is a relatively small portion of the data. Is there a possibility to use an index range scan on the outer table?
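For instance, something along these lines, assuming TRAN_TAB is partitioned on PART_DT (the index name and column order are illustrative, not tested against your schema):

CREATE INDEX tran_tab_dtcr_ix
    ON tran_tab (dt_cr, pd_cd, ln4)
    LOCAL;
-- A local index on the range/filter predicates, with LN4 included so the
-- scan can feed STANDARD_HASH without touching the table blocks at all.

That would let the optimizer drive the join from a range scan of the day's partition instead of reading all of it.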

--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
