Re: Query help

  • From: Andrew Kerber <andrew.kerber@xxxxxxxxx>
  • To: Stéphane Faroult <sfaroult@xxxxxxxxxxxx>
  • Date: Thu, 12 Mar 2015 11:04:14 -0500

Oh. Yes, an earlier version reported by tablespace; I just never took the join 
out. I see your point about block size, but we only use an 8k block size, so 
for us it doesn't matter. 

Sent from my iPhone

> On Mar 12, 2015, at 9:51 AM, Stéphane Faroult <sfaroult@xxxxxxxxxxxx> wrote:
> 
> Just a question: why are you all keeping a join with v$tablespace while not 
> using anything that comes from it in the final result? For aggregation 
> purposes, a tablespace_id does as nicely as a tablespace name.
> In fact, I *might* use it to get the block size, because seeing an 8K 
> block size being hard-coded makes me uncomfortable ... 
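> 
> A minimal sketch of what I mean - v$tablespace itself does not expose the 
> block size, so this assumes dba_tablespaces.block_size, joined on the 
> tablespace name, as the dictionary source:
> 
>   select v.name ts_name
>        , d.block_size
>          -- tablespace_size is in blocks, so multiply by the real block size
>        , round(t.tablespace_size * d.block_size / 1024 / 1024 / 1024, 2) size_gb
>     from dba_hist_tbspc_space_usage t
>     join v$tablespace v on v.ts# = t.tablespace_id
>     join dba_tablespaces d on d.tablespace_name = v.name
>    where t.snap_id = (select max(snap_id) from dba_hist_snapshot)
>   /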
> 
> S Faroult
> 
>> On 12/03/15 09:43, stephen van linge (Redacted sender swvanlinge@xxxxxxxxx 
>> for DMARC) wrote:
>> If you're curious about the "at least not without some very tricky weird SQL 
>> that probably would cause optimizer meltdown" case that Kim was referring 
>> to, I threw one together for you (sorry, I also removed the comma joins; I 
>> hate reading those).  I found the window function ordering was much quicker 
>> than the double aggregation, but we also have a much smaller system than you 
>> likely have (and a very short snapshot retention period), so consider it an 
>> academic response.
>>  
>> with group1 as (
>> -- one row per snapshot/tablespace; rank snapshots within each
>> -- day/tablespace, biggest first, so rank 1 marks the daily maximum
>> SELECT to_char(s.begin_interval_time,'yyyy-mm-dd') AS get_date
>> , v.name AS ts_name
>> , row_number() OVER(PARTITION BY to_char(s.begin_interval_time,'yyyy-mm-dd'), v.name
>>                     ORDER BY round((t.tablespace_size*8192)/1024/1024/1024,2) DESC) AS size_gb_rank
>> , row_number() OVER(PARTITION BY to_char(s.begin_interval_time,'yyyy-mm-dd'), v.name
>>                     ORDER BY round((t.tablespace_usedsize*8192)/1024/1024/1024,2) DESC) AS usedsize_gb_rank
>> , round((t.tablespace_size*8192)/1024/1024/1024,2) AS size_gb
>> , round((t.tablespace_usedsize*8192)/1024/1024/1024,2) AS used_gb
>> from dba_hist_snapshot s
>> JOIN dba_hist_tbspc_space_usage t
>> ON s.snap_id=t.snap_id
>> JOIN v$tablespace v
>> ON t.tablespace_id=v.ts#
>> )
>> -- summing only the rank-1 rows per day replaces the inner MAX aggregation
>> select get_date
>> , sum(CASE WHEN size_gb_rank = 1 THEN size_gb ELSE 0 END) AS tot_size
>> , sum(CASE WHEN usedsize_gb_rank = 1 THEN used_gb ELSE 0 END) AS used_size
>> from group1
>> group by get_date
>> order by get_date;
>>  
>> Thanks,
>>  
>> Stephen
>>  
>> From: Andrew Kerber <andrew.kerber@xxxxxxxxx>
>> To: Kim Berg Hansen <kibeha@xxxxxxxxx> 
>> Cc: "<oracle-l@xxxxxxxxxxxxx>" <Oracle-L@xxxxxxxxxxxxx> 
>> Sent: Thursday, March 12, 2015 7:24 AM
>> Subject: Re: Query help
>> 
>> Hi Kim. Thanks for the input. We do not move data between tablespaces, so I 
>> wasn't even thinking about that. One of the items on my to-learn list is a 
>> decent class on analytic functions in Oracle. I'll probably be able to get 
>> to that once we implement the 30-hour day. 
>> 
>> Sent from my iPhone
>> 
>> 
>> 
>> On Mar 12, 2015, at 8:35 AM, Kim Berg Hansen <kibeha@xxxxxxxxx> wrote:
>> 
>>> Hi, Andrew
>>> 
>>> Well, the simple way to get rid of WITH is to replace it with an inline view:
>>> 
>>>   select get_date, sum(size_gb) tot_size, sum(used_gb) used_size
>>>     from (  select to_char(s.begin_interval_time, 'yyyy-mm-dd') get_date
>>>                  , v.name ts_name
>>>                  , round(max(t.tablespace_size * 8192) / 1024 / 1024 / 1024, 2) size_gb
>>>                  , round(max(tablespace_usedsize * 8192) / 1024 / 1024 / 1024, 2) used_gb
>>>               from v$tablespace v, dba_hist_snapshot s, dba_hist_tbspc_space_usage t
>>>              where t.tablespace_id = v.ts# and t.snap_id = s.snap_id
>>>           group by to_char(s.begin_interval_time, 'yyyy-mm-dd'), v.name) group1
>>> group by get_date
>>> order by get_date
>>> /
>>> 
>>> But I guess that's not what you are really asking. You want to avoid doing 
>>> TWO aggregations.
>>> 
>>> If the inner aggregation had been a SUM and the outer aggregation a SUM of 
>>> that sum, you could probably make do with just the outer aggregation - or 
>>> with some ROLLUP, if you wished to keep the results of both the inner and 
>>> the outer aggregation.
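>>> 
>>> Purely to illustrate that (hypothetical, since your inner aggregate really 
>>> is MAX, not SUM): with SUMs at both levels, a single GROUP BY ROLLUP would 
>>> deliver both, the rows with ts_name null carrying the per-date totals:
>>> 
>>>   select to_char(s.begin_interval_time, 'yyyy-mm-dd') get_date
>>>        , v.name ts_name
>>>        , round(sum(t.tablespace_size * 8192) / 1024 / 1024 / 1024, 2) size_gb
>>>     from v$tablespace v, dba_hist_snapshot s, dba_hist_tbspc_space_usage t
>>>    where t.tablespace_id = v.ts# and t.snap_id = s.snap_id
>>> group by rollup (to_char(s.begin_interval_time, 'yyyy-mm-dd'), v.name)
>>> /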
>>> 
>>> As it is, your inner aggregation uses MAX and the outer aggregation is a 
>>> SUM. I don't see how that could be combined with ROLLUP or similar. Nested 
>>> aggregations I would do just like you have done - with an inline view or a 
>>> WITH clause.
>>> 
>>> It's possible to nest an aggregate inside an analytic function, for 
>>> example:
>>> 
>>>   select to_char(s.begin_interval_time, 'yyyy-mm-dd') get_date
>>>        , v.name ts_name
>>>        , round(sum(max(t.tablespace_size * 8192) / 1024 / 1024 / 1024) over (
>>>              partition by to_char(s.begin_interval_time, 'yyyy-mm-dd')
>>>           ), 2) tot_size
>>>        , round(sum(max(tablespace_usedsize * 8192) / 1024 / 1024 / 1024) over (
>>>              partition by to_char(s.begin_interval_time, 'yyyy-mm-dd')
>>>           ), 2) used_size
>>>     from v$tablespace v, dba_hist_snapshot s, dba_hist_tbspc_space_usage t
>>>    where t.tablespace_id = v.ts# and t.snap_id = s.snap_id
>>> group by to_char(s.begin_interval_time, 'yyyy-mm-dd'), v.name
>>> order by get_date, ts_name
>>> /
>>> 
>>> But that does not give the result you want: even though it calculates the 
>>> correct tot_size and used_size, it keeps one row per date/tablespace 
>>> combination rather than aggregating to date level.
>>> 
>>> I can't offhand think of a way to avoid two aggregations. Two aggregations 
>>> is the way to do what you want, as far as I can think of ;-)
>>> 
>>> 
>>> Do you actually want the maximum tablespace size for each tablespace per 
>>> day? Or would you rather want the latest tablespace size for each 
>>> tablespace per day?
>>> 
>>> I mean, if you move a lot of data from one tablespace to another during a 
>>> day, the max for the first tablespace will be a high value from before the 
>>> move, and the max for the second tablespace will be a high value from after 
>>> the move. The sum for the day will then be artificially high, as the moved 
>>> data counts twice in the sum, won't it?
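>>> 
>>> (Toy numbers, just to illustrate: move 10 GB from tablespace A to B 
>>> mid-day, with A going from 50 GB used down to 40 and B from 20 up to 30. 
>>> The per-tablespace maxima then give 50 + 30 = 80 GB for the day, even 
>>> though the database never held more than 70.)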
>>> 
>>> You could utilize the KEEP functionality of aggregates in the inner 
>>> aggregation to do something like this:
>>> 
>>>   select to_char(get_date, 'yyyy-mm-dd') the_date
>>>        , round(sum(ts_size) * 8192 / 1024 / 1024 / 1024, 2) tot_gb
>>>        , round(sum(ts_used) * 8192 / 1024 / 1024 / 1024, 2) used_gb
>>>     from (  select trunc(s.begin_interval_time) get_date
>>>                  , max(t.tablespace_size) keep (
>>>                       dense_rank last order by s.begin_interval_time
>>>                    ) ts_size
>>>                  , max(tablespace_usedsize) keep (
>>>                       dense_rank last order by s.begin_interval_time
>>>                    ) ts_used
>>>               from v$tablespace v, dba_hist_snapshot s, dba_hist_tbspc_space_usage t
>>>              where t.tablespace_id = v.ts# and t.snap_id = s.snap_id
>>>           group by trunc(s.begin_interval_time), v.name) group1
>>> group by get_date
>>> order by get_date
>>> /
>>> 
>>> Using KEEP with dense_rank last order by s.begin_interval_time means that 
>>> the MAX function only operates on those rows (within the group) that have 
>>> the latest begin_interval_time. So that query will not find the highest 
>>> values for a tablespace on a given day, but the latest ones.
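>>> 
>>> A tiny self-contained illustration (made-up rows, purely to show what KEEP 
>>> changes compared to a plain MAX):
>>> 
>>>   with t as (
>>>     select 1 grp, date '2015-03-11' dt, 10 val from dual union all
>>>     select 1, date '2015-03-12', 7 from dual union all
>>>     select 1, date '2015-03-12', 5 from dual
>>>   )
>>>   select grp
>>>        , max(val) plain_max                                   -- 10
>>>        , max(val) keep (dense_rank last order by dt) last_max --  7
>>>     from t
>>>    group by grp
>>>   /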
>>> 
>>> Whether that is what you want is another question - perhaps both versions 
>>> actually might be useful ;-)
>>> 
>>> 
>>> But avoiding two aggregates I just don't think is possible (at least not 
>>> without some very tricky weird SQL that probably would cause optimizer 
>>> meltdown ;-)
>>> 
>>> 
>>> 
>>> Regards
>>> 
>>> 
>>> Kim Berg Hansen
>>> 
>>> http://dspsd.blogspot.com
>>> kibeha@xxxxxxxxx
>>> @kibeha
>>> 
>>> 
>>> On Thu, Mar 12, 2015 at 1:55 PM, Andrew Kerber <andrew.kerber@xxxxxxxxx> 
>>> wrote:
>>> This is a learning experience for me.  I wrote the query below to pull 
>>> sizing trends at the database level.  Note that it uses a WITH clause.  I 
>>> think it could be written with a single select and a ROLLUP clause, or 
>>> perhaps another function I am not aware of. Any ideas?
>>> 
>>> with group1 as (select
>>> to_char(s.begin_interval_time,'yyyy-mm-dd') get_date,
>>> v.name ts_name,
>>> (round(max((t.tablespace_size*8192))/1024/1024/1024,2)) size_gb,
>>> (round(max((tablespace_usedsize*8192))/1024/1024/1024,2)) used_gb
>>> from v$tablespace v, dba_hist_snapshot s, dba_hist_tbspc_space_usage t
>>> where t.tablespace_id=v.ts#
>>> and t.snap_id=s.snap_id
>>> group by to_char(s.begin_interval_time,'yyyy-mm-dd'), v.name)
>>> select get_date, sum(size_gb) tot_size,sum(used_gb) used_size
>>> from group1
>>> group by get_date
>>> order by get_date
>>> 
>>> Sent from my iPad
> 
