RE: Tuning large aggregation query

  • From: Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx>
  • To: Thomas Kellerer <thomas.kellerer@xxxxxxxxxx>
  • Date: Tue, 15 Apr 2014 08:26:46 +0000

Thomas,

The easy one first - the commas between the hints are there because I was 
writing them as a list in a simple sentence in English and wasn't thinking of 
the explicit construction of the final hint set.  Ignore them.
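
Written without the commas, the full hint set might look like this inside a query of roughly the shape under discussion (this is just a sketch: the t1/t2 aliases and the hints come from the example quoted below, while the select list and the join on "region" are illustrative):

select /*+ leading(t1 t2) use_hash(t2)
           no_swap_join_inputs(t2)
           pq_distribute(t2 partition none) */
       t1.region, count(*)
from   t1, t2
where  t2.region = t1.region
group by
       t1.region;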

Getting a hash join is, at least, a step in the right direction; but we need to 
know what the distribution method is, and how the outline defines the 
distribution method.  Can you show us the execution plan pulled from memory, or 
from the plan_table, when you get a hash join, and include the 'outline' option 
in the call to dbms_xplan?  (Let's have the predicate section as well.)
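
For example, something of this shape should do it (a sketch: the &sql_id 
substitution variable stands in for the statement's real sql_id, and the 
default 'typical' format already includes the predicate section):

select * from table(
    dbms_xplan.display_cursor('&sql_id', null, 'typical outline')
);

-- or, for a plan that has been explained into the plan_table:
select * from table(
    dbms_xplan.display(format => 'typical outline')
);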



Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle

________________________________________
From: Thomas Kellerer [thomas.kellerer@xxxxxxxxxx]
Sent: 14 April 2014 14:06
To: Jonathan Lewis
Subject: Re: Tuning large aggregation query

Jonathan,

thanks for your answer, and please excuse me for replying so late.

Unfortunately I could not reproduce your execution plan using the hints you suggested.

I can force a hash join instead of the nested loops, but that doesn't improve 
things either.

Btw: what exactly does the comma in the hints do?

"leading (t1 t2), use_hash(t2)" generates a completely different execution plan 
than "leading (t1 t2) use_hash(t2)"

I couldn't find anything in the manual regarding the meaning of a comma between 
two hints.

Thanks again
Thomas Kellerer

Jonathan Lewis, 09.04.2014 07:46:
>
> If you go back temporarily to the select and ignore the merge, I think you 
> need a plan that looks something like this:
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------
> | Id  | Operation                              | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
> --------------------------------------------------------------------------------------------------------------------------------------------------------
> |   0 | SELECT STATEMENT                       |          | 49589 |  1743K|       |  4135   (3)| 00:00:21 |       |       |        |      |            |
> |   1 |  PX COORDINATOR                        |          |       |       |       |            |          |       |       |        |      |            |
> |   2 |   PX SEND QC (RANDOM)                  | :TQ10002 | 49589 |  1743K|       |  4135   (3)| 00:00:21 |       |       |  Q1,02 | P->S | QC (RAND)  |
> |   3 |    HASH GROUP BY                       |          | 49589 |  1743K|    51M|  4135   (3)| 00:00:21 |       |       |  Q1,02 | PCWP |            |
> |   4 |     PX RECEIVE                         |          | 49589 |  1743K|       |  4135   (3)| 00:00:21 |       |       |  Q1,02 | PCWP |            |
> |   5 |      PX SEND HASH                      | :TQ10001 | 49589 |  1743K|       |  4135   (3)| 00:00:21 |       |       |  Q1,01 | P->P | HASH       |
> |   6 |       HASH GROUP BY                    |          | 49589 |  1743K|    51M|  4135   (3)| 00:00:21 |       |       |  Q1,01 | PCWP |            |
> |*  7 |        HASH JOIN                       |          |  1122K|    38M|       |  3142   (2)| 00:00:16 |       |       |  Q1,01 | PCWP |            |
> |   8 |         BUFFER SORT                    |          |       |       |       |            |          |       |       |  Q1,01 | PCWC |            |
> |   9 |          PART JOIN FILTER CREATE       | :BF0000  | 70129 |  1369K|       |  2561   (1)| 00:00:13 |       |       |  Q1,01 | PCWP |            |
> |  10 |           PX RECEIVE                   |          | 70129 |  1369K|       |  2561   (1)| 00:00:13 |       |       |  Q1,01 | PCWP |            |
> |  11 |            PX SEND PARTITION (KEY)     | :TQ10000 | 70129 |  1369K|       |  2561   (1)| 00:00:13 |       |       |        | S->P | PART (KEY) |
> |  12 |             TABLE ACCESS BY INDEX ROWID| T1       | 70129 |  1369K|       |  2561   (1)| 00:00:13 |       |       |        |      |            |
> |  13 |              INDEX FULL SCAN           | T1_UK    | 70129 |       |       |   236   (2)| 00:00:02 |       |       |        |      |            |
> |  14 |         PX PARTITION LIST JOIN-FILTER  |          |  1122K|    17M|       |   575   (3)| 00:00:03 |:BF0000|:BF0000|  Q1,01 | PCWC |            |
> |  15 |          TABLE ACCESS FULL             | T2       |  1122K|    17M|       |   575   (3)| 00:00:03 |:BF0000|:BF0000|  Q1,01 | PCWP |            |
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>
>
> I think I've emulated your requirement (on 11.2.0.4): t1 is not partitioned 
> but has a unique index starting with the "region" column;
> t2 is list partitioned by "region" and there's a massive skew in the region 
> column, with two very popular regions (separate partitions)
> and about 20 other regions in the default partition.
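
A minimal DDL sketch matching that description might look like this (the 
table, column, and partition names are illustrative; only the index name 
t1_uk is taken from the plan above):

create table t1 (
    region   varchar2(10) not null,
    id       number       not null,
    padding  varchar2(40)
);

-- unique index starting with the "region" column
create unique index t1_uk on t1(region, id);

-- list partitioned by region: two very popular regions get their own
-- partitions, everything else lands in the default partition
create table t2 (
    region   varchar2(10) not null,
    t1_id    number       not null,
    padding  varchar2(40)
)
partition by list (region) (
    partition p_pop1 values ('NORTH'),
    partition p_pop2 values ('SOUTH'),
    partition p_rest values (default)
);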
>
> The critical step is the PX SEND PARTITION (KEY), which I got by hinting with 
> pq_distribute().
> My hints were:  leading (t1 t2), use_hash(t2) no_swap_join_inputs(t2), 
> pq_distribute(t2 partition none)
>
> The last hint effectively tells the optimizer to distribute the t1 data based 
> on the partition key of the t2 data, so that the t2 data doesn't have to be 
> distributed.
>
>
> Regards
> Jonathan Lewis
> http://jonathanlewis.wordpress.com
> @jloracle
--
//www.freelists.org/webpage/oracle-l

