RE: tricky group by questions

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: <Mark.Bobak@xxxxxxxxxxxxxxx>, <ryan_gaffuri@xxxxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Tue, 27 Nov 2007 17:36:36 -0500

While that is indeed a very nice example and is probably exactly on point as
a solution except for the bit about rownum <= 12, if you're building a
production service it probably makes sense to build a permanent "spine"
table against which to perform the outer join to enumerate the times with
zero rows of actual data. In this case the spine table would simply be the
one column table of times by hour certain to bound your actual data. I
recommend you create that table with the data values in order by time, and
if you need more speed you can test whether it is better with an index, as
an index organized table, or as a hash cluster in the context of your
dataset and date range. I predict it would perform best if the "spine" table
is presented to the query as a virtual table with its bounds and a
solicitation of rownum so that it will be projected first and statically.
Then if nested loops is chosen by the optimizer it will be a single pass and
the sort will be a no-op. This of course is a trade-off of a little space
(miniscule, I'd guess) versus CPU at run time, and that is my presumptive
bias unless I know the resources headroom in an actual case is contrary to
that bias.

I predict such a solution will be faster, but I'm pretty doggone sure it
will be more readable. Using all_objects is a nice source of rows usually in
sufficient quantity (yet not enormous) for examples, but it is probably not
a good idea to repetitively query it in production. More importantly, I
wouldn't want someone wondering what all_objects has to do with the purpose
of the script. In examples, we're queued up to know that all_objects is just
a convenient stand-in for a row source, and we don't mind burning the cpu to
fashion whatever values we need for the example, but I would suggest that an
object for the purpose in this case will serve you better.

You also probably need a little attention to your greater-than less-than
values to get exactly the results you intend.

Regards,

mwf

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Bobak, Mark
Sent: Tuesday, November 27, 2007 11:29 AM
To: ryan_gaffuri@xxxxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: tricky group by questions

Ryan,

Here's a nice AskTom example, which I think will do exactly what you want:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:89123115
13313

--
Mark J. Bobak
Senior Database Administrator, System & Product Technologies
ProQuest
789 E. Eisenhower, Parkway, P.O. Box 1346
Ann Arbor MI 48106-1346
+1.734.997.4059  or +1.800.521.0600 x 4059
mark.bobak@xxxxxxxxxxxxxxx
www.proquest.com
www.csa.com

ProQuest...Start here. 

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of ryan_gaffuri@xxxxxxxxxxx
Sent: Tuesday, November 27, 2007 10:31 AM
To: oracle-l@xxxxxxxxxxxxx
Subject: tricky group by questions

I am writing a query that is grouping by 1 hour blocks over a period of time
as follows

I am pretty sure the answer involves using "where not exists", but I can't
get the dates I want to return. 

select to_char(mydate, 'yyyymmdd hh24') , count(*)
from mytab
where mydate < sysdate
and mydate > to_date('20071125 1500', 'yyyymmdd hh24mi')
and mydatedate < to_date('20071125 1600', 'yyyymmdd hh24mi')
group by to_char(mydate, 'yyyymmdd hh24')
order by to_char(mydate, 'yyyymmdd hh24') desc

Now I have one hour periods that do not have any rows. A standard group by
just ignores those periods. I want periods with no data to return and have a
count(*) = 0

so I would have

2007111101 20
2007111102 0
2007111103 10

now it returns as

2007111101 20
2007111103 10 
--
//www.freelists.org/webpage/oracle-l


--
//www.freelists.org/webpage/oracle-l




--
//www.freelists.org/webpage/oracle-l


Other related posts: