Fastest Way to Unload Data - Straight SQL query vs. PL/SQL and dbms_output vs. PL/SQL and utl_file

  • From: Mark Strickland <strickland.mark@xxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Wed, 29 Sep 2010 09:49:18 -0700

Oracle 10gR2 on Windows 2008R2 with Cygwin and Bash Scripts, 4 CPUs and 4-GB
RAM

I've written a horrible set of ETL bash scripts for a customer project and
the final script does the "transforms".  It's a set of 14 queries that each
join 10-12 tables and write the output in pipe-delimited format.  Lines are
up to 1000 bytes wide.  I've set DOP on the tables to 16 in order to get
more PGA for hash joins.  The queries themselves seem to run reasonably
fast (I've tuned them as well as I can, I think) but the output is the
problem.  I have to concatenate '|' between the fields because the next
step in the process requires that.  Straight SQL output can write about
775-KB per second to disk.  I got a PL/SQL procedure from Laurent
Schneider's blog that uses dbms_output and, once it starts writing, it
writes about 3.5-MB per second, a vast improvement.  However, between the
time when the query heavy lifting "appears" to have completed (buffer gets
stop increasing, TEMP tablespace writes drop to nothing, USERS tablespace
reads drop to nothing) and when the output starts writing, several hours
pass.  I'm assuming that it is buffering all the output (about 8-GB) before
it starts writing.  I see a lot of writes to the Windows pagefile, but
little or no writes to the TEMP tablespace.  There certainly isn't enough
memory to buffer 8-GB.  I will also try the utl_file approach.  So, I'm
looking for at least anecdotal evidence for the best approach here.  I'm
under a very tight deadline and am doing as much of my own testing as I can
get done, but I'm hoping that others in the list will know right off how to
approach this the best way.  Thanks in advance.
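For the utl_file approach, one pattern worth testing is to fetch with BULK COLLECT and write each batch as it arrives, so rows stream to disk instead of accumulating in a session buffer the way dbms_output does.  The sketch below is only an illustration of that pattern: the directory object UNLOAD_DIR, the file name, and the query are hypothetical placeholders, not anything from the actual transform scripts.

```sql
-- Sketch only: assumes a directory object UNLOAD_DIR exists
-- (CREATE DIRECTORY) and substitutes a trivial query for the
-- real 10-12 table join.  Adjust the column list and LIMIT.
DECLARE
  TYPE t_lines IS TABLE OF VARCHAR2(4000);
  l_lines t_lines;
  l_file  utl_file.file_type;
  CURSOR c IS
    SELECT col1 || '|' || col2 || '|' || col3 AS line
      FROM some_table;  -- placeholder for the real transform query
BEGIN
  -- max_linesize 32767 covers the ~1000-byte rows with room to spare
  l_file := utl_file.fopen('UNLOAD_DIR', 'extract.txt', 'w', 32767);
  OPEN c;
  LOOP
    FETCH c BULK COLLECT INTO l_lines LIMIT 1000;  -- array fetch
    EXIT WHEN l_lines.COUNT = 0;
    FOR i IN 1 .. l_lines.COUNT LOOP
      utl_file.put_line(l_file, l_lines(i));
    END LOOP;
  END LOOP;
  CLOSE c;
  utl_file.fclose(l_file);
END;
/
```

Note that utl_file writes to the database server's filesystem, not the client's, so the directory object has to point somewhere the later pipeline steps can reach.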

Mark Strickland
Seattle, WA
