RE: Exadata Tuning Question+

From: Iggy Fernandez <iggy_fernandez@xxxxxxxxxxx>
To: Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx>, ORACLE-L <oracle-l@xxxxxxxxxxxxx>
Date: Tue, 11 Nov 2014 05:22:28 -0800
Jonathan,
In plan 4110902776 posted by Abdul, the hash join at line 3 is predicted to 
return 6539 thousand rows(the cardinality of SERVICE_LOOKUP) probably because 
SERVICE_LOCATION_ID is the primary key of SERVICE_LOCATION and because there 
probably is a foreign key relationship between SERVICE_LOOKUP and 
SERVICE_LOCATION. Another guess is that there is a parent-child relationship 
between SERVICE_LOOKUP (parent) and SERVICE_RELATIONSHIP (child). Also, 
SERVICE_LOOKUP is not being filtered. I was therefore expecting 21 million rows 
(the cardinality of SERVICE_RELATIONSHIP) from the outer hash join at line 2.
In any case, since SERVICE_LOOKUP is not being filtered, your suggestion to 
"aggregate then join" is probably the best bet for Abdul. Hopefully, Abdul has 
enough sort space.
Iggy


From: jonathan@xxxxxxxxxxxxxxxxxx
To: oracle-l@xxxxxxxxxxxxx
Subject: RE: Exadata Tuning Question+
Date: Tue, 11 Nov 2014 12:40:58 +0000














Iggy,



In the earlier posts we saw an estimated 25 rows from service_relationship for 
each level for each sid, and at one point you corrected a multiplier of 25 to 
125.  The initial explosion is actually short of that estimate: in simplistic 
terms we might have
 expected the join to take us from 7M to 875M rows, which is why I posted the 
comment about considering the trade off between "join then aggregate" and 
"aggregate then join".  The fact that the 206M rows from the hash join are 
predicted to return 206M rows
 after aggregation is just an unfortunate feature of the way that Oracle 
estimates aggregations.






   

Regards

Jonathan Lewis

http://jonathanlewis.wordpress.com

@jloracle 






From: oracle-l-bounce@xxxxxxxxxxxxx [oracle-l-bounce@xxxxxxxxxxxxx] on behalf 
of Iggy Fernandez [iggy_fernandez@xxxxxxxxxxx]

Sent: 11 November 2014 12:28

To: Ebadi, Abdul; ORACLE-L

Subject: RE: Exadata Tuning Question+









Reposting since my reply was not posted



Abdul,



I cannot understand why the number of rows exploded from 21 million to 206 
million in line 2. Surely, the number of rows cannot be more 21M because SID is 
(or appears to be) the primary key of SERVICE_LOOKUP and has a foreign
 key relationship to with SERVICE_RELATIONSHIP. Perhaps, primary key and 
foreign key constraints are not being enforced. I need to review the join 
conditions in the predicate section of
 the query plan. Because of the rows explosion, the required sort area size is 
10G. You can try increasing the sort_area_size to 10G. A better strategy might 
be to perform the group by
 before the join using the WITH CLAUSE or the PLACE_GROUP_BY hint.



Could you post the following two EXPLAIN PLAN (including the predicate sections)




EXPLAIN PLAN FOR
SELECT
  slv.sid,
  slv.service_location_id,
  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", 
DECODE(sr.relationship_level, '1', sr.RELATED_SID || ',', NULL)) ORDER BY 
sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel1,
  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", 
DECODE(sr.relationship_level, '2', sr.RELATED_SID || ',', NULL)) ORDER BY 
sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel1,
  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", 
DECODE(sr.relationship_level, '3', sr.RELATED_SID || ',', NULL)) ORDER BY 
sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel1,
  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", 
DECODE(sr.relationship_level, '4', sr.RELATED_SID || ',', NULL)) ORDER BY 
sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel1,
  RTRIM(XMLSERIALIZE(CONTENT EXTRACT( XMLAGG(XMLELEMENT("e", 
DECODE(sr.relationship_level, '5', sr.RELATED_SID || ',', NULL)) ORDER BY 
sr.RELATED_SID), '//text()' )) , ',' ) AS relatedserviceinstanceidlevel1
FROM
  service_lookup slv
  LEFT OUTER JOIN service_location sl ON sl.service_location_id = 
slv.service_location_id
  LEFT OUTER JOIN service_relationship sr ON sr.sid = slv.sid AND 
sr.relationship_level IN ('1', '2', '3', '4', '5')
GROUP BY slv.id, slv.service_location_id;



EXPLAIN PLAN FOR
WITH sr AS (
  SELECT
    sr.id,
    LISTAGG(DECODE(sr.relationship_level, '1', sr.related_sid, NULL) ',') 
WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel1,
    LISTAGG(DECODE(sr.relationship_level, '2', sr.related_sid, NULL) ',') 
WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel2,
    LISTAGG(DECODE(sr.relationship_level, '3', sr.related_sid, NULL) ',') 
WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel3,
    LISTAGG(DECODE(sr.relationship_level, '4', sr.related_sid, NULL) ',') 
WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel4,
    LISTAGG(DECODE(sr.relationship_level, '5', sr.related_sid, NULL) ',') 
WITHIN GROUP (ORDER BY sr.related_sid) AS relatedserviceinstanceidlevel5
  WHERE sr.relationship_level IN ('1', '2', '3', '4', '5')
  GROUP BY sr.id
)
SELECT
  slv.sid,
  slv.service_location_id,
  sr.relatedserviceinstanceidlevel1,
  sr.relatedserviceinstanceidlevel2,
  sr.relatedserviceinstanceidlevel3,
  sr.relatedserviceinstanceidlevel4,
  sr.relatedserviceinstanceidlevel5
FROM
  service_lookup slv
  LEFT OUTER JOIN service_location sl ON sl.service_location_id = 
slv.service_location_id
  LEFT OUTER JOIN sr ON sr.sid = slv.sid;






From: Abdul.Ebadi@xxxxxxxxxx





Thanks for the reply Iggy…
 
Your second suggestion below runs faster, but according to our developer we 
“cannot use LISTAGG as the string is larger
 than 4000 chars for some of the values, so will have to use XMLAGG”
The first suggestion is OK and missing a group by, but when we added group by 
it didn’t finish even after long time.
 
Here is the plan for your second option (even switching to SORT MERGE join 
resulted in same thing – not finished):
 
Plan hash value: 4110902776
 
-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                 | Rows  | Bytes 
|TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                      |   206M|  9067M|   
    |   151K  (1)| 00:00:06 |
|   1 |  SORT GROUP BY               |                      |   206M|  9067M|   
 10G|   151K  (1)| 00:00:06 |
|*  2 |   HASH JOIN OUTER            |                      |   206M|  9067M|   
    |  2311   (7)| 00:00:01 |
|*  3 |    HASH JOIN                 |                      |  6539K|   149M|   
    |  1864   (4)| 00:00:01 |
|   4 |     TABLE ACCESS STORAGE FULL| SERVICE_LOCATION     |  3175K|    18M|   
    |   357   (1)| 00:00:01 |
|*  5 |     TABLE ACCESS STORAGE FULL| SERVICE_LOOKUP       |  6539K|   112M|   
    |  1503   (4)| 00:00:01 |
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|*  6 |    TABLE ACCESS STORAGE FULL | SERVICE_RELATIONSHIP |    21M|   454M|   
    |   376   (8)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------
 
Thanks,
Abdul
Follow-Ups:
- RE: Exadata Tuning Question+
  - From: Mark W. Farnham
References:
- Re: Exadata Tuning Question+
  - From: Tim Gorman
- RE: Exadata Tuning Question+
  - From: Ebadi, Abdul
- RE: Exadata Tuning Question+
  - From: Iggy Fernandez
- RE: Exadata Tuning Question+
  - From: Iggy Fernandez
- RE: Exadata Tuning Question+
  - From: Iggy Fernandez
- RE: Exadata Tuning Question+
  - From: Jonathan Lewis
RE: Exadata Tuning Question+

Other related posts: