RE: RE: cheapest way to access every 15th row in table ordered by timestamp column

I'll not touch a hair on your head. Since it was asked what is cheapest,
your solution would prevail for some possible row length characteristics.

 

If, by the way, there is an index (single column or relatively narrow set of
columns) on the timestamp AND each row is very long (what I believe Niall
means by a large dataset, although he could mean simply a big total size for
the table), THEN it could well be cheapest to force the desired rowids out
of the index in the inner part of the query and only bring back the required
columns in the outer query.

 

If, for example you can only fit 1 row per block but the index stores 500
keys per leaf then you'd have your range scan (it had time in the where
clause, right) and then you'd only have to actually visit the 1/15th of the
rows in the range to fetch the columns you need.

 

Of course if all the columns you need ARE IN the index, you won't need to
visit the table and this would be a bad strategy. But do notice that the
strategy is still a plausible cheapest solution even if you're only bringing
back one column not in the index IF the rows are long.

 

One other thing: If these folks are trying to reduce the vertical space (row
count) in the data they are looking at by sampling every 15th row they are
taking a pretty big risk of a non representative set. Now I don't know what
sort of science they are doing, and they may know their data well enough
that they think they have no worries on this, but I'd sure be a lot more
comfortable with 30 second frame averages and maybe some high-lo bounds and
variance (possibly as columns that only show up in the excel sheet if
something is outside expected limits. If the data is even worth looking at,
it is worth looking at it in a decent way.

 

If you're unlucky they might be annoyed at you for pointing it out. Good
scientists might say something like: "You mean you can do that for us? That
is way better but we were willing to live with a half-baked sample because
we thought that would be hard." Then you win big and the science is better.

 

mwf

 

  _____  

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Niall Litchfield
Sent: Wednesday, January 19, 2011 1:32 PM
To: Stephens, Chris
Cc: ORACLE-L
Subject: Re: RE: cheapest way to access every 15th row in table ordered by
timestamp column

 

For large datasets I'd probably store every 15th value in a second table.
For small enough ones Excel could filter. Yikes, heresy, kill me now, etc

On 19 Jan 2011 18:18, "Stephens, Chris" <Chris.Stephens@xxxxxxx> wrote:

That's a good question that I brought up with the users when this request
showed up.  Apparently they need the 2 second granularity for other
functions but wanted a way to reduce the size of the sample set for whatever
they are doing with this particular excel sheet. 

 

From: Niall Litchfield [mailto:niall.litchfield@xxxxxxxxx] 
Sent: Wednesday, January 19, 2011 12:09 PM


To: Stephens, Chris
Cc: ORACLE-L
Subject: Re: cheapest way to access every 15th row in table ordere...

 

What does every 15th row mean? Would only storing every 15th sample be more
appropriate? 


>
> On 19 Jan 2011 16:22, "Stephens, Chris" <Chris.Stephens@xxxxxxx> wrote:
>
> 11.2.0.2 on Linux 5...


CONFIDENTIALITY NOTICE:
This message is intended for the use of the individual or entity to which
i...

Other related posts: