Re: Fastest Way to Unload Data - Straight SQL query vs. PL/SQL and dbms_output vs. PL/SQL and utl_file

  • From: Mark Strickland <strickland.mark@xxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Wed, 29 Sep 2010 13:29:10 -0700

Thanks to all who replied.  I've tested 1) writing to a GTT and then selecting
from that, 2) PL/SQL with dbms_output, and 3) PL/SQL with utl_file.  I'm not
able to get the output any faster.  Another component of the slowness is
that I have to set NLS_LANG=AMERICAN_AMERICA.AL32UTF8 because the data
contains Unicode characters.  That, along with concatenating pipes between the
columns, is slowing down the output.  I failed to mention that third-party
tools, C programs, and Perl scripts are out of the question because my stuff
will be turned over to the customer and they must be able to support it.  They
won't be able to support Perl or compiled 3GL programs, and they won't pay for
a third-party tool.  So, I'm back to 775-KB per second.
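For the record, the utl_file variant I tested was along these lines -- a
minimal sketch only; the DATA_DIR directory object, file name, table, and
column list here are placeholders, not the actual job:

```sql
-- Sketch of a pipe-delimited unload via utl_file.  Assumes a directory
-- object DATA_DIR exists and the schema has write access to it.
DECLARE
  l_file  utl_file.file_type;
BEGIN
  -- open with the maximum line size (rows are up to ~1000 bytes anyway)
  l_file := utl_file.fopen('DATA_DIR', 'unload.pip', 'w', 32767);

  FOR r IN (SELECT col1, col2, col3 FROM some_table) LOOP
    -- concatenate the pipe delimiters and write one row per line
    utl_file.put_line(l_file, r.col1 || '|' || r.col2 || '|' || r.col3);
  END LOOP;

  utl_file.fclose(l_file);
EXCEPTION
  WHEN OTHERS THEN
    IF utl_file.is_open(l_file) THEN
      utl_file.fclose(l_file);
    END IF;
    RAISE;
END;
/
```

Unlike dbms_output, this writes to disk as the cursor is fetched, so there is
no multi-gigabyte session buffer -- but in my tests it still didn't beat the
throughput numbers above.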

Mark



On Wed, Sep 29, 2010 at 9:49 AM, Mark Strickland
<strickland.mark@xxxxxxxxx> wrote:

> Oracle 10gR2 on Windows 2008R2 with Cygwin and Bash Scripts, 4 CPUs and
> 4-GB RAM
>
> I've written a horrible set of ETL bash scripts for a customer project and
> the final script does the "transforms".  It's a set of 14 queries that each
> join 10-12 tables and write the output in pipe-delimited format.  Lines are
> up to 1000 bytes wide.  I've set DOP on the tables to 16 in order to get
> more PGA for hash joins.  The queries themselves seem to run reasonably
> fast (I've tuned them as well as I can, I think), but the output is the
> problem.  I have to concatenate '|' between the fields because the next
> step in the process requires that.  Straight SQL output can write about
> 775-KB per second to disk.  I got a PL/SQL procedure from Laurent
> Schneider's blog that uses dbms_output and, once it starts writing, it
> writes about 3.5-MB per second, a vast improvement.  However, between the
> time when the query heavy lifting "appears" to have completed (buffer gets
> stop increasing, TEMP tablespace writes drop to nothing, USERS tablespace
> reads drop to nothing) and when the output starts writing, several hours
> pass.  I'm assuming that it is buffering all the output (about 8-GB) before
> it starts writing.  I see a lot of writes to the Windows pagefile, but
> little or no writes to the TEMP tablespace.  There certainly isn't enough
> memory to buffer 8-GB.  I will also try the utl_file approach.  So, I'm
> looking for at least anecdotal evidence for the best approach here.  I'm
> under a very tight deadline and am doing as much of my own testing as I can
> get done, but I'm hoping that others in the list will know right off how to
> approach this the best way.  Thanks in advance.
>
> Mark Strickland
> Seattle, WA
>
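P.S. The several-hour gap described above is consistent with how dbms_output
works: put_line only appends to an in-memory session buffer, and the client
(SQL*Plus in this case) fetches that buffer, via dbms_output.get_lines, only
after the PL/SQL call returns.  So the entire ~8-GB result has to sit in
session memory (hence the pagefile writes) before the first byte is spooled:

```sql
-- 10gR2 SQL*Plus allows an unlimited dbms_output buffer:
SET SERVEROUTPUT ON SIZE UNLIMITED

BEGIN
  -- each line goes into the session buffer, not to disk
  dbms_output.put_line('field1|field2|field3');
END;
/
-- only after the block returns does SQL*Plus call dbms_output.get_lines
-- and write the buffered text to the spool file
```

That's why the write phase looks fast once it starts: the work was already
done, and the client is just draining the buffer.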
