Large table query with cpu throttling

  • From: "Peter Teoh" <htmldeveloper@xxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Mon, 18 Feb 2008 14:01:34 +0800

My requirements:

1.   I have a large table, in the range of 1 billion records.

2.   But the machine is a relatively slow one.   Therefore I would not
like to clog the machine when querying this large table.

My question:

A.   Supposed I need to extract two fields of ALL the record - but the
full table scan can afford to take its time.   Therefore, how do I do
a full table scan, with cpu throttling (eg, sleeping or slowing down
once in a while) so as to let other jobs continue?   (any SQL hints
available?)

Ideally I thought it should not involved any indexes, as that will
incur additional lookups, plus possibly some sorting etc, because the
target is a FTS anyway - is my analysis correct?

B.   Supposed there are no solution to the above, so alternatively is
to do slicing.   My method is find the min and max of the rowid, and
then do a direct rowid access:

select * from table where rowid = 'xxxx'.

where the xxxx correspond to the rowid automatically generated.  This
statement does not generate any temporary sorting, or depending on any
prior indexes.

To do this, I did a simple experiment:

select rowid from a_large_table;

and the character range of the rowid seemed to be (in this order):
[A-Za-z0-9+/] and that's all, and a total of 17 characters.   This is
verified in Ora10gR2, and Ora9iR2 (both Linux).   Are my analysis
correct?   Any other documentation have u seen with these info?

Therefore my plan is to construct the SQL with the rowid following the
above format, slowly increasing, so as directly retrieve the data with minimal
performance hits on the server.   Is this the best way of slicing the
table so as to do a full table scan while still meeting the
requirements above?

C.   Currently, I found that select rowid from a_large_table where
rownum < 100 and rowid > 'xxxx' return almost immediately, whereas a
"select count(*)" can take about 20mins or more to return just a
number.

This therefore is the 3rd way to get data - 100 records at a time.
Similar in design to (2) above.   But because the last row's rowid is
always the largest of all the 100 rows, therefore, i will use that
value as the input to the next query, to get the next 100 rows, and so
on.   So this way seem not to incur any performance hits at all, but
it has the assumption that "select" statement ALWAYS RETURN THE ROWID
in an incremental manner, if just rownum and rowid are used in the
where clause.    Is this assumption reasonable?

Thanks.
--
//www.freelists.org/webpage/oracle-l


Other related posts: