RE: Performance aka. Table Joins

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: <breitliw@xxxxxxxxxxxxx>, <genegurevich@xxxxxxxxxxxxxxxxxxxxx>
  • Date: Thu, 25 May 2006 14:08:50 -0400

Cool test, Wolfgang.

Did your indexes get used at all?

Is it possible that you have just demonstrated that hash joins are faster on
character columns than on numbers?

I got 3 full table scans and hash joins when I ran this test on my laptop.
I'm curious what plan(s) you got.

My system must be much slower than yours, I got 16.73 for the character and
16.93 for the numeric version.

Still, character did win.

Regards,

mwf

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]On
Behalf Of Wolfgang Breitling
Sent: Thursday, May 25, 2006 11:51 AM
To: genegurevich@xxxxxxxxxxxxxxxxxxxxx
Cc: oracle-l
Subject: Re: Performance aka. Table Joins

Several posters have made excellent comments about the use of number
datatypes
vs. character datatypes for columns that are, or may be, involved in joins.
First off I want to say that join performance should be (one of) the last
considerations in determining the datatype of a column. The datatype should
first and foremost be appropriate for storing the intended content.

In the other posts the predominant gist was
a) test it
b) numbers ought to have an edge as far as performance is concerned

Now let me give you this testcase:

09:40:05 ora101.scott> @c:\tmp\jointest

Session altered.

Elapsed: 00:00:00.01
  1  select a.n1, b.n2, c.n3, a.c1, b.c1, c.c1
  2  from ch a, ch b, ch c
  3  where a.n1 = b.n2
  4    and b.n3 = c.n1
  5*   and c.n2 = a.n3
timing for: ch
Elapsed: 00:00:02.89
SP2-0325: no timing elements to show
  1  select a.n1, b.n2, c.n3, a.c1, b.c1, c.c1
  2  from nr a, nr b, nr c
  3  where a.n1 = b.n2
  4    and b.n3 = c.n1
  5*   and c.n2 = a.n3
timing for: nr
Elapsed: 00:00:04.64
SP2-0325: no timing elements to show

    HASH_VALUE   ex      ELA    rows SQL_TEXT
-------------- ---- -------- ------- ---------------------------------------
---
     515903459    1    2.617   3,300 select a.n1, b.n2, c.n3, a.c1, b.c1,
c.c1
    2845657776    1    4.350   3,300 select a.n1, b.n2, c.n3, a.c1, b.c1,
c.c1

Here obviously the character join has the clear advantage. :-)
Just to demonstrate that things are rarely so clear cut.

This was taken on a 10.1.0.5 system (Windows XP Pro)

And this is the script used to create the two tables:

create table nr ( n1 number not null
, n2 number not null
, n3 number not null
, c1 varchar2(6)
, filler varchar2(1000));
create index nr_ix1 on nr(n1,n2);
create index nr_ix2 on nr(n2,n3);
create index nr_ix3 on nr(n3,n1);

create table ch ( n1 varchar2(6) not null
, n2 varchar2(6) not null
, n3 varchar2(6) not null
, c1 varchar2(6)
, filler varchar2(1000));
create index ch_ix1 on ch(n1,n2);
create index ch_ix2 on ch(n2,n3);
create index ch_ix3 on ch(n3,n1);

truncate table nr;
exec dbms_random.seed(1);
insert into nr
select trunc(dbms_random.value(0,400))
, trunc(dbms_random.value(0,500))
, trunc(dbms_random.value(0,1000))
, dbms_random.string('u',6)
, dbms_random.string('a',800)
from dba_objects where rownum <= 12000;
commit;

truncate table ch;
insert into ch select
to_char(n1,'fm000000'),to_char(n2,'fm000000'),to_char(n3,'fm000000'),c1,fill
er
from nr;

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS (ownname => user, tabname => 'nr',
estimate_percent => 100
    , block_sample => FALSE, method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY',
cascade => TRUE);
  DBMS_STATS.GATHER_TABLE_STATS (ownname => user, tabname => 'ch',
estimate_percent => 100
    , block_sample => FALSE, method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY',
cascade => TRUE);
END;
/

as can be seen from the creation, both tables contain exactly the same data
except for the datatype of the three join columns.

The numbers occupy a bit less space, but not enough to affect the overall
size:

                                avg
TABLE_NAME      rows    blks    empty   row
------------------------------  ------------    ----------      ------- ----
--
NR      12,000  3,010   0       819
CH      12,000  3,010   0       829

The average row size is 10 bytes less for the number table. The difference
shows
more in the number of leaf blocks of indexes.

Quoting genegurevich@xxxxxxxxxxxxxxxxxxxxx:

> Hi all:
>
> What I'm trynig to figure out is whether there is a difference in
> performance in table joins by a numeric fields
> versusu the table joins by a varchar field. I don't remember reading
> anything on that matter, but my developers
> think that it is better to join via numeric fields. Does anyone have any
> insight into that? Or a article I can read?
>
--
regards

Wolfgang Breitling
Oracle 7,8,8i,9i OCP DBA
Centrex Consulting Corporation
www.centrexcc.com

--
//www.freelists.org/webpage/oracle-l


--
//www.freelists.org/webpage/oracle-l


Other related posts: