RE: Large Dataset - Estimated 87TB needed for TEMP - suggestions?

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: <dmarc-noreply@xxxxxxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Tue, 27 Jan 2015 12:52:39 -0500

Assuming we don’t get to change the question (which *might indeed be the best 
solution, but which has already been suggested*) there are a few ways to 
approach this:

 

n  just mentioned by Mladen -  bitmaps and a star transformation, possibly 
including bitmap joins

n  spiny starfish strategy (see my paper)

n  divide and conquer and concatenate the result sections via union all

o   suppose you have a useful distribution by STS_DPS_TYPE

o   suppose you have few enough STS_DPS_TYPE distinct values or a reasonable 
way to carve the distinct values up by range yielding a similar aggregate row 
count from each range

o   each of the similar queries linked by union alls would then be smaller, 
possibly exponentially so

o   this can be combined with the spiny starfish strategy

 

Further observations: 

You are fetching exactly one AND a.STS_PERF_PERIOD = '20141',

so do not select that data forcing you to group by it.

 

union all pieces < 36 and > 36 might give you range scans instead of a ffs on c 
if 36 is at a reasonable break point.

 

Don’t haul any text descriptors along you don’t need until the group by totals 
are complete and you have the minimum rowset result upon which to tack the 
labels.

That might mean you need some look up tables with ids shorter than the text 
descriptors at the beginning.

 

Good luck.

 

 

mwf

 

 

 

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On 
Behalf Of Mladen Gogala (Redacted sender "mgogala@xxxxxxxxx" for DMARC)
Sent: Tuesday, January 27, 2015 12:06 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: Re: Large Dataset - Estimated 87TB needed for TEMP - suggestions?

 

On 01/27/2015 09:11 AM, Chris Taylor wrote:

Env; 11.2.0.2 Linux 5.8

Machine: 

4 CPU (8 Core, 16 Threads E5-4620)

256 GB RAM

 

This is a "data warehouse" DB but doesn't use DIM/FACT tables - more like a 
data dump database.

 

I have a query against 3 tables (2 of which are partitioned) and the estimated 
TEMP space used for this query is going to be 81 TB.  I'm looking for 
suggestions on how to get this data set.  For the partitioned tables, 1 table 
accesses 1 partition, and the other table uses 2 partitions.  I've included the 
explain plan below and any help/suggestions are appreciated.  (I've updated the 
table stats yesterday after I kept blowing out my 150GB temp space and the 
estimated temp space used was 22GB.  After updating stats, it tells me the 
estimated Temp Space is going to be 81 TB).

 

PGA Agg Target is 20 GB right now.

 

I've explained the query using both the default HASH GROUP BY and a SORT GROUP 
BY and both estimate using 81 T of temp space.

 

Here's the query and the plan (let me know if the fixed formatting doesn't show 
up correctly and I'll use pastebin to format it if needed)

 

  SELECT a.STS_PERF_PERIOD,

         a.PART_ACCT_NO,

         a.PWN_PART_AKA_NME,

         a.PWN_P_W_IND,

         a.AFL_SOCIETY_NME,

         b.sts_CHANNEL_TYPE,

         c.stp_s03_program_cd--TOTAL Music Minutes CALCULATION:

         ,

           SUM (

              CASE

                 WHEN STG_STATUS = 'DP'

                 THEN

                    0

                 ELSE

                      a.perf_count

                    * ( (  a.BMI_FEATURE

                         + a.BMI_BG

                         + a.BMI_LOGO

                         + a.BMI_THEME

                         + a.BMI_BV

                         + a.ASCAP_FEATURE

                         + a.ASCAP_BG

                         + a.ASCAP_LOGO

                         + a.ASCAP_THEME

                         + a.ASCAP_BV

                         + a.FOREIGN_FEATURE

                         + a.FOREIGN_BG

                         + a.FOREIGN_LOGO

                         + a.FOREIGN_THEME

                         + a.FOREIGN_BV

                         + a.PD_FEATURE

                         + a.PD_BG

                         + a.PD_LOGO

                         + a.PD_THEME

                         + a.PD_BV

                         + a.SESAC_FEATURE

                         + a.SESAC_BG

                         + a.SESAC_LOGO

                         + a.SESAC_THEME

                         + a.SESAC_BV

                         + a.NA_FEATURE

                         + a.NA_BG

                         + a.NA_LOGO

                         + a.NA_THEME

                         + a.NA_BV))

              END)

         / 60

            AS "TOT_MUSIC_MINS"--BMI Music Minutes CALCULATION:

         ,

           SUM (

              CASE

                 WHEN STG_STATUS = 'DP'

                 THEN

                    0

                 ELSE

                      a.perf_count

                    * ( (  a.BMI_FEATURE

                         + a.BMI_BG

                         + a.BMI_LOGO

                         + a.BMI_THEME

                         + a.BMI_BV))

              END)

         / 60

            AS "BMI_MUSIC_MINS"

    FROM PRDM.COWS_PURPLE_PART  a, --partition(P20141)

         prdm.cows_TCUE_STS_SCHEDULE b, --partition(P20141,P20142) ,

         prdm.cows_tcue_stp_program c

   WHERE     A.STS_PERF_PERIOD = b.sts_perf_period

         AND A.STS_DPS_TYPE = b.sts_dps_type

         AND A.STG_GROUP_NO = b.sts_group_no

         AND a.stg_group_seq_no = b.sts_group_seq_no

         AND b.sts_provider = c.stp_provider

         AND b.sts_program_no = c.stp_program_no

         AND a.STS_PERF_PERIOD = '20141'

         AND b.sts_perf_source = 'CB'

         AND b.COWS_DIST_PERIOD in ('20141','20142')

         AND a.PWN_P_W_IND = 'W'

         AND c.stp_s03_program_cd <> '36'

GROUP BY a.STS_PERF_PERIOD,

         a.PART_ACCT_NO,

         a.PWN_PART_AKA_NME,

         a.PWN_P_W_IND,

         a.AFL_SOCIETY_NME,

         b.sts_CHANNEL_TYPE,

         c.stp_s03_program_cd

/

 

 

Plan hash value: 2657901994

 

----------------------------------------------------------------------------------------------------------------------------------------------

| Id  | Operation                               | Name                      | 
E-Rows |E-Bytes|E-Temp | Cost (%CPU)| E-Time   | Pstart| Pstop |

----------------------------------------------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT                        |                           |   
 289G|    57T|       |   451M  (9)|501:19:35 |       |       |

|   1 |  HASH GROUP BY                          |                           |   
 289G|    57T|    81T|   451M  (9)|501:19:35 |       |       |

|*  2 |   HASH JOIN                             |                           |   
 289G|    57T|    42M|  2176K (91)| 02:25:07 |       |       |

|   3 |    PART JOIN FILTER CREATE              | :BF0000                   |   
 680K|    35M|       | 39040   (2)| 00:02:37 |       |       |

|*  4 |     HASH JOIN                           |                           |   
 680K|    35M|  4304K| 39040   (2)| 00:02:37 |       |       |

|   5 |      INLIST ITERATOR                    |                           |   
     |       |       |            |          |       |       |

|   6 |       PARTITION RANGE ITERATOR          |                           |  
80046 |  3361K|       |  3063   (1)| 00:00:13 |KEY(I) |KEY(I) |

|   7 |        TABLE ACCESS BY LOCAL INDEX ROWID| COWS_TCUE_STS_SCHEDULE    |  
80046 |  3361K|       |  3063   (1)| 00:00:13 |KEY(I) |KEY(I) |

|*  8 |         INDEX RANGE SCAN                | COWS_TCUE_STS_SCHED_IDX01 |   
9052 |       |       |   147   (1)| 00:00:01 |KEY(I) |KEY(I) |

|*  9 |      INDEX FAST FULL SCAN               | COWS_TCUE_STP_PROG_IDX01  |   
  24M|   256M|       | 12602   (3)| 00:00:51 |       |       |

|  10 |    PARTITION RANGE SINGLE               |                           |   
  10M|  1582M|       | 96246   (2)| 00:06:25 |KEY(AP)|KEY(AP)|

|* 11 |     TABLE ACCESS FULL                   | COWS_PURPLE_PART          |   
  10M|  1582M|       | 96246   (2)| 00:06:25 |    21 |    21 |

----------------------------------------------------------------------------------------------------------------------------------------------

 

Query Block Name / Object Alias (identified by operation id):

-------------------------------------------------------------

 

   1 - SEL$1

   7 - SEL$1 / B@SEL$1

   8 - SEL$1 / B@SEL$1

   9 - SEL$1 / C@SEL$1

  11 - SEL$1 / A@SEL$1

 

Outline Data

-------------

 

  /*+

      BEGIN_OUTLINE_DATA

      USE_HASH_AGGREGATION(@"SEL$1")

      USE_HASH(@"SEL$1" "A"@"SEL$1")

      USE_HASH(@"SEL$1" "C"@"SEL$1")

      LEADING(@"SEL$1" "B"@"SEL$1" "C"@"SEL$1" "A"@"SEL$1")

      FULL(@"SEL$1" "A"@"SEL$1")

      INDEX_FFS(@"SEL$1" "C"@"SEL$1" 
("COWS_TCUE_STP_PROGRAM"."STP_S03_PROGRAM_CD" 
"COWS_TCUE_STP_PROGRAM"."STP_PROVIDER" 

              "COWS_TCUE_STP_PROGRAM"."STP_PROGRAM_NO"))

      INDEX_RS_ASC(@"SEL$1" "B"@"SEL$1" 
("COWS_TCUE_STS_SCHEDULE"."COWS_DIST_PERIOD" 
"COWS_TCUE_STS_SCHEDULE"."STS_PERF_PERIOD" 

              "COWS_TCUE_STS_SCHEDULE"."STS_PERF_SOURCE"))

      OUTLINE_LEAF(@"SEL$1")

      NO_PARALLEL

      DB_VERSION('11.2.0.2')

      OPTIMIZER_FEATURES_ENABLE('11.2.0.2')

      IGNORE_OPTIM_EMBEDDED_HINTS

      END_OUTLINE_DATA

  */

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

   2 - access("A"."STS_PERF_PERIOD"="B"."STS_PERF_PERIOD" AND 
"A"."STS_DPS_TYPE"="B"."STS_DPS_TYPE" AND 

              "A"."STG_GROUP_NO"="B"."STS_GROUP_NO" AND 
"A"."STG_GROUP_SEQ_NO"="B"."STS_GROUP_SEQ_NO")

   4 - access("B"."STS_PROVIDER"="C"."STP_PROVIDER" AND 
"B"."STS_PROGRAM_NO"="C"."STP_PROGRAM_NO")

   8 - access(("B"."COWS_DIST_PERIOD"='20141' OR 
"B"."COWS_DIST_PERIOD"='20142') AND "B"."STS_PERF_PERIOD"='20141' AND 

              "B"."STS_PERF_SOURCE"='CB')

   9 - filter("C"."STP_S03_PROGRAM_CD"<>'36')

  11 - filter("A"."PWN_P_W_IND"='W' AND "A"."STS_PERF_PERIOD"='20141')

 

Column Projection Information (identified by operation id):

-----------------------------------------------------------

 

   1 - (#keys=7) "A"."STS_PERF_PERIOD"[VARCHAR2,5], 
"A"."PART_ACCT_NO"[NUMBER,22], "A"."PWN_PART_AKA_NME"[VARCHAR2,30], 

       "A"."PWN_P_W_IND"[VARCHAR2,1], "A"."AFL_SOCIETY_NME"[VARCHAR2,8], 
"B"."STS_CHANNEL_TYPE"[VARCHAR2,7], 

       "C"."STP_S03_PROGRAM_CD"[VARCHAR2,2], SUM(CASE "STG_STATUS" WHEN 'DP' 
THEN 0 ELSE 

       
"A"."PERF_COUNT"*("A"."BMI_FEATURE"+"A"."BMI_BG"+"A"."BMI_LOGO"+"A"."BMI_THEME"+"A"."BMI_BV")
 END )[22], SUM(CASE "STG_STATUS" WHEN 

       'DP' THEN 0 ELSE 
"A"."PERF_COUNT"*("A"."BMI_FEATURE"+"A"."BMI_BG"+"A"."BMI_LOGO"+"A"."BMI_THEME"+"A"."BMI_BV"+"A"."ASCAP_FEATURE"+"A".

       
"ASCAP_BG"+"A"."ASCAP_LOGO"+"A"."ASCAP_THEME"+"A"."ASCAP_BV"+"A"."FOREIGN_FEATURE"+"A"."FOREIGN_BG"+"A"."FOREIGN_LOGO"+"A"."FOREIGN_TH

       
EME"+"A"."FOREIGN_BV"+"A"."PD_FEATURE"+"A"."PD_BG"+"A"."PD_LOGO"+"A"."PD_THEME"+"A"."PD_BV"+"A"."SESAC_FEATURE"+"A"."SESAC_BG"+"A"."SE

       
SAC_LOGO"+"A"."SESAC_THEME"+"A"."SESAC_BV"+"A"."NA_FEATURE"+"A"."NA_BG"+"A"."NA_LOGO"+"A"."NA_THEME"+"A"."NA_BV")
 END )[22]

   2 - (#keys=4) "A"."STS_PERF_PERIOD"[VARCHAR2,5], 
"C"."STP_S03_PROGRAM_CD"[VARCHAR2,2], "B"."STS_CHANNEL_TYPE"[VARCHAR2,7], 

       "A"."FOREIGN_LOGO"[NUMBER,22], "A"."PD_LOGO"[NUMBER,22], 
"A"."PERF_COUNT"[NUMBER,22], "STG_STATUS"[VARCHAR2,2], 

       "A"."NA_LOGO"[NUMBER,22], "A"."SESAC_LOGO"[NUMBER,22], 
"A"."PART_ACCT_NO"[NUMBER,22], "A"."PWN_PART_AKA_NME"[VARCHAR2,30], 

       "A"."PWN_P_W_IND"[VARCHAR2,1], "A"."AFL_SOCIETY_NME"[VARCHAR2,8], 
"A"."ASCAP_BG"[NUMBER,22], "A"."BMI_BG"[NUMBER,22], 

       "A"."SESAC_BG"[NUMBER,22], "A"."NA_BG"[NUMBER,22], 
"A"."PD_BG"[NUMBER,22], "A"."FOREIGN_BG"[NUMBER,22], "A"."ASCAP_BV"[NUMBER,22], 

       "A"."BMI_BV"[NUMBER,22], "A"."SESAC_BV"[NUMBER,22], 
"A"."NA_BV"[NUMBER,22], "A"."PD_BV"[NUMBER,22], "A"."FOREIGN_BV"[NUMBER,22], 

       "A"."ASCAP_THEME"[NUMBER,22], "A"."BMI_THEME"[NUMBER,22], 
"A"."SESAC_THEME"[NUMBER,22], "A"."NA_THEME"[NUMBER,22], 

       "A"."PD_THEME"[NUMBER,22], "A"."FOREIGN_THEME"[NUMBER,22], 
"A"."ASCAP_FEATURE"[NUMBER,22], "A"."BMI_FEATURE"[NUMBER,22], 

       "A"."SESAC_FEATURE"[NUMBER,22], "A"."NA_FEATURE"[NUMBER,22], 
"A"."PD_FEATURE"[NUMBER,22], "A"."FOREIGN_FEATURE"[NUMBER,22], 

       "A"."ASCAP_LOGO"[NUMBER,22], "A"."BMI_LOGO"[NUMBER,22]

   3 - "B"."STS_PERF_PERIOD"[VARCHAR2,5], "B"."STS_PERF_PERIOD"[VARCHAR2,5], 
"B"."STS_DPS_TYPE"[VARCHAR2,10], 

       "B"."STS_GROUP_NO"[NUMBER,22], "B"."STS_GROUP_SEQ_NO"[NUMBER,22], 
"C"."STP_S03_PROGRAM_CD"[VARCHAR2,2], 

       "B"."STS_CHANNEL_TYPE"[VARCHAR2,7]

   4 - (#keys=2) "B"."STS_PERF_PERIOD"[VARCHAR2,5], 
"B"."STS_GROUP_NO"[NUMBER,22], "B"."STS_GROUP_SEQ_NO"[NUMBER,22], 

       "B"."STS_CHANNEL_TYPE"[VARCHAR2,7], "B"."STS_DPS_TYPE"[VARCHAR2,10], 
"C"."STP_S03_PROGRAM_CD"[VARCHAR2,2]

   5 - "B"."STS_PROGRAM_NO"[NUMBER,22], "B"."STS_PROVIDER"[VARCHAR2,1], 
"B"."STS_GROUP_NO"[NUMBER,22], 

       "B"."STS_GROUP_SEQ_NO"[NUMBER,22], "B"."STS_CHANNEL_TYPE"[VARCHAR2,7], 
"B"."STS_DPS_TYPE"[VARCHAR2,10], 

       "B"."STS_PERF_PERIOD"[VARCHAR2,5]

   6 - "B"."STS_PROGRAM_NO"[NUMBER,22], "B"."STS_PROVIDER"[VARCHAR2,1], 
"B"."STS_GROUP_NO"[NUMBER,22], 

       "B"."STS_GROUP_SEQ_NO"[NUMBER,22], "B"."STS_CHANNEL_TYPE"[VARCHAR2,7], 
"B"."STS_DPS_TYPE"[VARCHAR2,10], 

       "B"."STS_PERF_PERIOD"[VARCHAR2,5]

   7 - "B"."STS_PROGRAM_NO"[NUMBER,22], "B"."STS_PROVIDER"[VARCHAR2,1], 
"B"."STS_GROUP_NO"[NUMBER,22], 

       "B"."STS_GROUP_SEQ_NO"[NUMBER,22], "B"."STS_CHANNEL_TYPE"[VARCHAR2,7], 
"B"."STS_DPS_TYPE"[VARCHAR2,10], 

       "B"."STS_PERF_PERIOD"[VARCHAR2,5]

   8 - "B".ROWID[ROWID,10], "B"."STS_PERF_PERIOD"[VARCHAR2,5]

   9 - "C"."STP_PROGRAM_NO"[NUMBER,22], "C"."STP_PROVIDER"[VARCHAR2,1], 
"C"."STP_S03_PROGRAM_CD"[VARCHAR2,2]

  10 - "A"."STS_PERF_PERIOD"[VARCHAR2,5], "A"."STS_DPS_TYPE"[VARCHAR2,10], 
"A"."PERF_COUNT"[NUMBER,22], "STG_STATUS"[VARCHAR2,2], 

       "A"."STG_GROUP_NO"[NUMBER,22], "A"."STG_GROUP_SEQ_NO"[NUMBER,22], 
"A"."PART_ACCT_NO"[NUMBER,22], "A"."PWN_PART_AKA_NME"[VARCHAR2,30], 

       "A"."PWN_P_W_IND"[VARCHAR2,1], "A"."AFL_SOCIETY_NME"[VARCHAR2,8], 
"A"."ASCAP_BG"[NUMBER,22], "A"."BMI_BG"[NUMBER,22], 

       "A"."SESAC_BG"[NUMBER,22], "A"."NA_BG"[NUMBER,22], 
"A"."PD_BG"[NUMBER,22], "A"."FOREIGN_BG"[NUMBER,22], "A"."ASCAP_BV"[NUMBER,22], 

       "A"."BMI_BV"[NUMBER,22], "A"."SESAC_BV"[NUMBER,22], 
"A"."NA_BV"[NUMBER,22], "A"."PD_BV"[NUMBER,22], "A"."FOREIGN_BV"[NUMBER,22], 

       "A"."ASCAP_THEME"[NUMBER,22], "A"."BMI_THEME"[NUMBER,22], 
"A"."SESAC_THEME"[NUMBER,22], "A"."NA_THEME"[NUMBER,22], 

       "A"."PD_THEME"[NUMBER,22], "A"."FOREIGN_THEME"[NUMBER,22], 
"A"."ASCAP_FEATURE"[NUMBER,22], "A"."BMI_FEATURE"[NUMBER,22], 

       "A"."SESAC_FEATURE"[NUMBER,22], "A"."NA_FEATURE"[NUMBER,22], 
"A"."PD_FEATURE"[NUMBER,22], "A"."FOREIGN_FEATURE"[NUMBER,22], 

       "A"."ASCAP_LOGO"[NUMBER,22], "A"."BMI_LOGO"[NUMBER,22], 
"A"."SESAC_LOGO"[NUMBER,22], "A"."NA_LOGO"[NUMBER,22], 

       "A"."PD_LOGO"[NUMBER,22], "A"."FOREIGN_LOGO"[NUMBER,22]

  11 - "A"."STS_PERF_PERIOD"[VARCHAR2,5], "A"."STS_DPS_TYPE"[VARCHAR2,10], 
"A"."PERF_COUNT"[NUMBER,22], "STG_STATUS"[VARCHAR2,2], 

       "A"."STG_GROUP_NO"[NUMBER,22], "A"."STG_GROUP_SEQ_NO"[NUMBER,22], 
"A"."PART_ACCT_NO"[NUMBER,22], "A"."PWN_PART_AKA_NME"[VARCHAR2,30], 

       "A"."PWN_P_W_IND"[VARCHAR2,1], "A"."AFL_SOCIETY_NME"[VARCHAR2,8], 
"A"."ASCAP_BG"[NUMBER,22], "A"."BMI_BG"[NUMBER,22], 

       "A"."SESAC_BG"[NUMBER,22], "A"."NA_BG"[NUMBER,22], 
"A"."PD_BG"[NUMBER,22], "A"."FOREIGN_BG"[NUMBER,22], "A"."ASCAP_BV"[NUMBER,22], 

       "A"."BMI_BV"[NUMBER,22], "A"."SESAC_BV"[NUMBER,22], 
"A"."NA_BV"[NUMBER,22], "A"."PD_BV"[NUMBER,22], "A"."FOREIGN_BV"[NUMBER,22], 

       "A"."ASCAP_THEME"[NUMBER,22], "A"."BMI_THEME"[NUMBER,22], 
"A"."SESAC_THEME"[NUMBER,22], "A"."NA_THEME"[NUMBER,22], 

       "A"."PD_THEME"[NUMBER,22], "A"."FOREIGN_THEME"[NUMBER,22], 
"A"."ASCAP_FEATURE"[NUMBER,22], "A"."BMI_FEATURE"[NUMBER,22], 

       "A"."SESAC_FEATURE"[NUMBER,22], "A"."NA_FEATURE"[NUMBER,22], 
"A"."PD_FEATURE"[NUMBER,22], "A"."FOREIGN_FEATURE"[NUMBER,22], 

       "A"."ASCAP_LOGO"[NUMBER,22], "A"."BMI_LOGO"[NUMBER,22], 
"A"."SESAC_LOGO"[NUMBER,22], "A"."NA_LOGO"[NUMBER,22], 

       "A"."PD_LOGO"[NUMBER,22], "A"."FOREIGN_LOGO"[NUMBER,22]

 

Note

-----

   - automatic DOP: skipped because of IO calibrate statistics are missing

   - Warning: basic plan statistics not available. These are only collected 
when:

       * hint 'gather_plan_statistics' is used for the statement or

       * parameter 'statistics_level' is set to 'ALL', at session or system 
level

 

 


This looks like an ordinary hash join, not a star transformation. Is there any 
way you could use bitmap indexes to achieve the star transformation, which 
really the only way that will make such a huge fact/detail query run reasonably 
fast. If only there was some kind of a physical contraption that could use 
Bloom filter to quickly eliminate unnecessary blocks from the fact table during 
the full table scan. And if it use some kind of compression that would lower 
the number of blocks to be read, that would be an additional boost.




-- 
Mladen Gogala
Oracle DBA
http://mgogala.freehostia.com

Other related posts: