Re: tune between query/join

  • From: Stephane Faroult <sfaroult@xxxxxxxxxxxx>
  • To: ax.mount@xxxxxxxxx
  • Date: Fri, 21 Oct 2011 10:16:40 +0200

Alex,
    Your case is actually a rather common problem - you have the same 
thing when you try to identify a country from an IP address, to check 
that a credit card number falls into an acceptable range, etc.

   The snag is that when you read the condition you tend to think that 
the given values are those that define the range (the lower and upper 
bounds), and that you are looking for the one that falls into the range 
(what happens in a classical range scan). In fact, it's the opposite: 
the value that is given is, in your case, b.amount (the value you get 
from the smallest table), and you search the range it falls into.

If you rewrite the condition as the equivalent

where a.lowest_amount <= b.amount
     and a.highest_amount >= b.amount

you understand the problem better - you have not one bounded range 
condition, but two unrelated and unbounded conditions on two columns. 
What I try to do in such a case is transform one of the two unbounded 
conditions into a range search, which requires some knowledge about the 
data and its order of magnitude. If you suppose (for instance) that the 
intervals all have an identical span, say of 10,000, you can write

where a.lowest_amount between b.amount - 10,000 and b.amount
     and a.highest_amount >= b.amount

Or you can use log() to get the order of magnitude of b.amount and 
define a floor and a ceiling values. Or you can define (for credit cards 
for instance) a "root value" (ie you know that the number to check is 
like "root_value%") and right pad to the expected length with 0s ans 
with 9s to get a lower and an upper value. You get the idea.

Create an index on (lowest_amount, highest_amount) and it will fly.

HTH,

-- 
Stephane Faroult
RoughSea Ltd <http://www.roughsea.com>
Konagora <http://www.konagora.com>
RoughSea Channel on Youtube <http://www.youtube.com/user/roughsealtd>

On 10/21/2011 07:41 AM, amonte wrote:
> Hello everyone
> I am running a query which looks like following (running in 10.2.0.4):
>
> select a.account_number, a.id, b.*
>    from account a, tmp_account b
> where b.amount between a.lowest_amount and a.highest_amount
>
> account has 8000000 rows and tmp_account 150000, the execution plan shows
> merge sort join as expected, this query takes around 5 hours to run, is
> there anyway to improve this?
>
>
> Alex
>
>
> --
> //www.freelists.org/webpage/oracle-l
>
>
>



--
//www.freelists.org/webpage/oracle-l


Other related posts: