Re: Tuning large aggregation query

  • From: Mohamed Houri <mohamed.houri@xxxxxxxxx>
  • To: kennethnaim@xxxxxxxxx
  • Date: Fri, 4 Apr 2014 16:13:59 +0200

I can see in the plan that there are several operations that have not been
started at all

--------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                         | Name                   | Starts | E-Rows | A-Rows |   A-Time   |
--------------------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                                   |                        |      0 |        |      0 |00:00:00.01 |
|   1 |  MERGE                                            | TARGET_TABLE           |      0 |        |      0 |00:00:00.01 |
|   2 |   PX COORDINATOR                                  |                        |      0 |        |      0 |00:00:00.01 |
|   3 |    PX SEND QC (RANDOM)                            | :TQ10002               |      0 |   4061K|      0 |00:00:00.01 |
|   4 |     VIEW                                          |                        |      0 |        |      0 |00:00:00.01 |
|   5 |      NESTED LOOPS                                 |                        |      0 |   4061K|      0 |00:00:00.01 |
|   6 |       NESTED LOOPS                                |                        |      0 |   4061K|      0 |00:00:00.01 |
|   7 |        VIEW                                       |                        |      0 |   3900K|      0 |00:00:00.01 |
|   8 |         SORT GROUP BY                             |                        |      0 |   3900K|      0 |00:00:00.01 |
|   9 |          PX RECEIVE                               |                        |      0 |   3900K|      0 |00:00:00.01 |
|  10 |           PX SEND HASH                            | :TQ10001               |      0 |   3900K|      0 |00:00:00.01 |
|  11 |            NESTED LOOPS                           |                        |      1 |   3900K|   4135K|00:05:09.10 |
|  12 |             NESTED LOOPS                          |                        |      1 |   3900K|   4135K|00:02:26.62 |
|  13 |              BUFFER SORT                          |                        |      1 |        |    229K|00:01:29.17 |
|  14 |               PX RECEIVE                          |                        |      1 |        |    229K|00:01:28.21 |
|  15 |                PX SEND ROUND-ROBIN                | :TQ10000               |      0 |        |      0 |00:00:00.01 |
|  16 |                 TABLE ACCESS BY GLOBAL INDEX ROWID| BASE_TABLE             |      0 |   3900K|      0 |00:00:00.01 |
|  17 |                  INDEX FULL SCAN                  | PK_BASE_TABLE          |      0 |   3900K|      0 |00:00:00.01 |
|  18 |              PARTITION LIST ITERATOR              |                        |    229K|      1 |   4135K|00:00:48.82 |
|* 19 |               INDEX RANGE SCAN                    | IDX_SALES_DATA         |    229K|      1 |   4135K|00:00:36.71 |
|  20 |             TABLE ACCESS BY LOCAL INDEX ROWID     | SALES_DATA             |   4135K|      1 |   4135K|00:02:21.94 |
|  21 |        PARTITION LIST ITERATOR                    |                        |      0 |      1 |      0 |00:00:00.01 |
|* 22 |         INDEX RANGE SCAN                          | IDX_T1704_ADVSTOREITEM |      0 |      1 |      0 |00:00:00.01 |
|* 23 |       TABLE ACCESS BY LOCAL INDEX ROWID           | TARGET_TABLE           |      0 |      1 |      0 |00:00:00.01 |
--------------------------------------------------------------------------------------------------------------------------------


All the operations showing Starts = 0 have not been started at all. Of the
remaining operations, operation 20 (TABLE ACCESS BY LOCAL INDEX ROWID on
SALES_DATA) seems to be the most time-consuming one.

Do you have a plan that includes the predicate section?

What is the definition of the index IDX_SALES_DATA? The CBO is making a
wrong cardinality estimation when accessing this index (E-Rows = 1 versus
A-Rows = 4135K), which might lead to a wrong plan, since it looks like the
driving operation.
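
To get both the predicate section and the index definition, something along
these lines should work (a sketch; run the DISPLAY_CURSOR call from the same
session as the merge, or replace the first NULL with the merge's sql_id, and
adjust the index owner to your environment):

```sql
-- Plan with predicate and actual-rows sections for the last execution
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));

-- Column list of the suspect index
SELECT column_name, column_position
FROM   all_ind_columns
WHERE  index_name = 'IDX_SALES_DATA'
ORDER  BY column_position;
```

Comparing the index columns against the join and filter predicates should
show where the E-Rows = 1 estimate is coming from.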

Best regards
Mohamed Houri
www.hourim.wordpress.com



2014-04-04 16:02 GMT+02:00 Kenneth Naim <kennethnaim@xxxxxxxxx>:

> What wait events are you seeing when the query is running? How many columns
> are in each of the tables?
>
> -----Original Message-----
> From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
> On Behalf Of Thomas Kellerer
> Sent: Friday, April 04, 2014 9:12 AM
> To: oracle-l@xxxxxxxxxxxxx
> Subject: Tuning large aggregation query
>
> Hi,
>
> for reporting purposes we are maintaining some denormalized,
> pre-aggregated tables in our OLTP database.
>
> These queries are aggregating sales and other related data from several
> source tables into one aggregation table.
>
> Our problem is that the aggregation of some 70 million rows into approx. 4
> million rows is quite slow.
>
> As far as I can tell, most of the time is lost in the actual aggregation.
> The join itself is quite fast, but aggregating 70 million rows needs a
> huge amount of memory (or temp space).
>
> I uploaded the execution plan to pastebin to preserve formatting:
> http://pastebin.com/XfEvB3z9 (You need to disable word-wrapping though)
>
> The basic structure of the statement looks like this:
>
> select base_table.pk_col_1,
>        base_table.pk_col_2,
>        base_table.pk_col_3,
>        sum(....),
>        sum(...),
>        min(...),
>        .... and another approx. 100 columns ...
> from base_table
>   join sales_data on ...
> group by base_table.pk_col_1,
>          base_table.pk_col_2,
>          base_table.pk_col_3
>
> There is a 1:N relation between base_table and sales_data.
>
> Sales_data essentially stores sales information for products on a per-day,
> sale and store level.
> The goal of the aggregation is to sum up that daily information to a
> product/store level.
>
> sales_data contains about 70 million rows, base_table contains about 4
> million rows, so the result of the aggregation is again 4 million rows.
> sales_data is partitioned by the region in which the stores are located
> (and that partitioning cannot be changed).
>
> We applied the same partitioning scheme to base_table and target_table in
> the hope that this will make the parallel execution more efficient, but
> changing from unpartitioned to partitioned didn't improve the queries (but
> didn't make them worse either). We tried various indexes (in addition to
> the PK index that is already used for the JOIN), but apparently for this
> type of query there isn't much an index can do.
>
> The main time is spent doing the aggregation.
> I confirmed that by only running the select statement that's inside the
> merge and that takes nearly as long as the merge itself.
>
> We were able to improve the whole process a bit by using parallel
> execution, but the runtimes are still way too high for the "window" we have
> in the night to prepare the data. The target would be that a single merge
> like that runs in about 10 minutes; currently it's around 60 minutes.
>
> The test server runs Oracle 11.2.0.4 on a Windows 2008R2 server with 64GB
> RAM, an 8-spindle RAID 10 attached, and 6 cores (12 CPUs).
>
> I can provide more Oracle configuration parameters if needed.
>
> The production server also runs Oracle 11.2.0.4 on Windows Server 2008R2
> and has 32 cores (64 CPUs) and 192GB RAM. But we will not be able to test
> this on the real production hardware - and the number of rows in production
> will probably be 4-5 times bigger.
>
>
> I am looking for ideas on how to improve the runtime of those statements,
> just some pointers on what I could try to either optimize the query itself
> or tune the Oracle configuration.
>
> Thanks
> Thomas
>
>
>
> --
> //www.freelists.org/webpage/oracle-l
>
>
>
>
>


-- 
Respectfully
Mohamed Houri
