Re: Wow - has anyone used ROW_NUMBER() to get around using a UNION statement when UNION ALL doesn't work??

  • From: Jeremy Schneider <jeremy.schneider@xxxxxxxxxxxxxx>
  • To: mwf@xxxxxxxx
  • Date: Mon, 08 Aug 2011 17:18:46 -0500

On 8/8/2011 11:27 AM, Mark W. Farnham wrote:
> In the event that you know a subset of the columns of a UNION ALL projection
> such that ordering by the column subset guarantees that duplicates appear
> together, then you can generate that special case of UNION ALL to UNION
> de-duplication at a lower cost. Chris has related one such mechanism, using
> the ROW_NUMBER() analytical function. It may have been unclear that he was
> relying on a subset of the columns of the projection. If he needed all the
> columns for the grouping, his mechanism would be at best a tie (and I write
> that in the sense that the best you can do in any transfer of energy is have
> zero increase in entropy, except you really can only break even in the math
> and in reality you always lose.)

I think that in his original email, Chris said he included all the
columns in the "partition by" clause.


On 8/5/2011 1:35 PM, Taylor, Chris David wrote:
> Now, I put the query containing the UNION ALL in an INNER select, and
> I select all columns from it PLUS the ROW_NUMBER() function
> partitioning by all the columns and applying an order by to the
> function and call this column “ROW_KEY” (not very original I know).


Which is why there are so many questions about this. As Mark has pointed
out, if you're partitioning on all the columns then you're doing exactly
the same sort that's the UNION is doing. Something just isn't quite
right about the results; both methods should be doing nearly the
identical work.

Or to put it another way, if manually sorting all the records and
removing dups addes 0 seconds and 0 TEMP tablespace to the subquery
without the ROW_NUMBER() analytic clause, then either the sort is not
really happening (the fastest way to do anything) or there's something
wrong with the UNION which ought to be running in the same amount of
time and resources. At least this is how it seems to me, and might be
the reason there are so many questions.

-Jeremy

-- 
http://www.ardentperf.com
+1 312-725-9249

Jeremy Schneider
Chicago

--
//www.freelists.org/webpage/oracle-l


Other related posts: