is it possible that each parallel slaving is grabbing the full amount of the expected temp space requirement? I have seen that happen in the past. Ie, if the query expects to need 8g, each parallel slave grabs 8g of temp space. On Thu, Aug 21, 2008 at 4:51 PM, cam <kadmon@xxxxxxxxx> wrote: > Hello all, > > I'm having terrible trouble with a complex query that is consuming huge > amounts of temp space. Sometimes. > > This query is large with several nested queries and uses parallelism. One > upshot of this is that the explain plans, already complex, are very > challenging to read with the addition of the PX send and receive events - > they're over a hundred lines long with deep nesting. In text form from > awrsqrpt they are truncated halfway thru the tmpspc column which doesn't > help since temp space is a specific concern. > > The instance has 320 GB (!) of temp space assigned and this query is using > it all, but the tmpspc report in the html form of the explain plan generated > by awrsqrpt seems to show a requirement of less than 10 GB. I'm monitoring > the tmp space usage with a little script which polls v$tempspace & > v$sort_usage and I can see each of 64 parallel servers using 8GB each. the > query runs for about 2 hours before failing with temp space exhaustion. > Reducing parallelism doesn't seem to have any effect on the eventual temp > requirement. > > On a test instance on another but fairly similar box, the temp space usage > is peaking at around 18 GB - this is with very similar driving table > volumes, identical parallelism settings etc. Run completes in under 2 hours. > Explain plans are different but it is difficult to track down how and why, > given their sheer size and complexity. Also, different 'successful' runs > with this smaller temp requirement on the test box are themselves using > quite different explain plans. > > This is 10.2.0.4, full table stats (i.e. estimate=>null, cascade=>true) > have been gathered for all tables involved in the joins and BVP has been > disabled with _OPTIM_PEEK_USER_BINDS=false to avoid histograms confusing the > picture for the moment. > > So, I suppose my questions are as follows for anyone willing to read this > far... I know I haven't posted much to go on - I'm pretty sure I'm unable to > post actual details for reasons of IPR etc. > > - is there a workaround for awrsqrpt truncating horizontal output? I'm > guessing it would just be to edit the script but haven't dared to yet... The > HTML version is an acceptable workaround but I thought it might be nice to > be able to use diff on text versions of plans. I'd also really like to be > able to see the parent_id of a rowset - at depths like this, the indentation > is close to useless for figuring out siblings and parents. > > - Why would the explain plans estimate for temspace requirement be so > wildly at odds with what is actually being consumed? > > - *Very broadly speaking*, I'm assuming that this massive use of temp space > is caused by too many hash joins of full table scans - is trying to force > more nested loops a reasonable strategy to reduce temp requirement? I'd be > inclined to use optimizer_index_cost_adj and _cache parameters for this > since I want to avoid hints if possible. Is this a good way to encourage > NLs? > > - Are there any tools/techniques for organising/viewing/comparing large > explain plans? > > Cheers for any insights, > cam > -- Andrew W. Kerber 'If at first you dont succeed, dont take up skydiving.'