RE: Fastest Way to Unload Data - Straight SQL query vs. PL/SQL and dbms_output vs. PL/SQL and utl_file

  • From: "Goulet, Richard" <Richard.Goulet@xxxxxxxxxxx>
  • To: <strickland.mark@xxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Wed, 29 Sep 2010 16:33:01 -0400

Mark,
 
    Try setting termout off.  Since there is only one output statement
going to your txt file, it will run faster.
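
    A minimal SQL*Plus spool setup along those lines (only termout comes
from this thread; the other settings, file name, and table/column names
are illustrative assumptions):

```sql
-- Hypothetical unload.sql: spool a pipe-delimited extract with minimal
-- SQL*Plus overhead.  Table and column names are placeholders.
set termout off      -- suppress screen echo; the spool file still gets every row
set feedback off     -- no "n rows selected" trailer in the file
set heading off      -- no column headers
set pagesize 0       -- no page breaks or titles
set linesize 1000    -- wide enough for the longest row
set trimspool on     -- drop trailing blanks so lines stay narrow
set arraysize 1000   -- larger fetch batches mean fewer round trips
spool output.txt
select col1 || '|' || col2 || '|' || col3 from some_table;
spool off
set termout on
```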
 

Dick Goulet 
Senior Oracle DBA/NA Team Lead 
PAREXEL International 

 

________________________________

From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Mark Strickland
Sent: Wednesday, September 29, 2010 4:29 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: Re: Fastest Way to Unload Data - Straight SQL query vs. PL/SQL
and dbms_output vs. PL/SQL and utl_file


Thanks to all who replied.  I've tested 1) writing to a GTT then
selecting from that, 2) PL/SQL with dbms_output, and 3) PL/SQL with
utl_file.  I'm not able to get the output any faster.  Another component
to the slowness is that I have to set NLS_LANG=AMERICAN_AMERICA.AL32UTF8
because the data contains Unicode characters.  That, combined with
concatenating pipes between the columns, slows down the output.  I failed to mention that
third-party tools, C programs and perl scripts are out of the question
because my stuff will be turned over to the customer and they must be
able to support it.  They won't be able to support perl or compiled 3GL
programs and they won't pay for a third-party tool.  So, I'm back to
775-KB per second.  

Mark




On Wed, Sep 29, 2010 at 9:49 AM, Mark Strickland
<strickland.mark@xxxxxxxxx> wrote:


        Oracle 10gR2 on Windows 2008R2 with Cygwin and Bash Scripts, 4
CPUs and 4-GB RAM
        
        I've written a horrible set of ETL bash scripts for a customer
project and the final script does the "transforms".  It's a set of 14
queries that each join 10-12 tables and write the output in
pipe-delimited format.  Lines are up to 1000 bytes wide.  I've set DOP
on the tables to 16 in order to get more PGA for hash joins.  The
queries themselves seem to run reasonably fast (I've tuned them as well
as I can, I think) but the output is the problem.  I have to
concatenate '|' between the fields because the next step in the process
requires that.  Straight SQL output can write about 775-KB per second to
disk.  I got a PL/SQL procedure from Laurent Schneider's blog that uses
dbms_output and, once it starts writing, it writes about 3.5-MB per
second, a vast improvement.  However between the time when the query
heavy lifting "appears" to have completed (buffer gets stop increasing,
TEMP tablespace writes drop to nothing, USERS tablespace reads drop to
nothing) and when the output starts writing, several hours pass.  I'm
assuming that it is buffering all the output (about 8-GB) before it
starts writing.  I see a lot of writes to the Windows pagefile, but
little or no writes to the TEMP tablespace.  There certainly isn't
enough memory to buffer 8-GB.  I will also try the utl_file approach.
So, I'm looking for at least anecdotal evidence for the best approach
here.  I'm under a very tight deadline and am doing as much of my own
testing as I can get done, but I'm hoping that others in the list will
know right off how to approach this the best way.  Thanks in advance.
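
        Since dbms_output queues everything in memory until the PL/SQL
block finishes, a utl_file version can start writing as rows are fetched
and never needs to hold the full 8-GB.  A minimal sketch of that approach
(the DATA_DIR directory object, file name, and table/column names are
placeholders, not from this thread):

```sql
-- Hypothetical unload via utl_file: rows are written as they are
-- fetched, so output starts immediately instead of after hours of
-- buffering.  DATA_DIR must be an existing Oracle directory object
-- the session can write to.
declare
    l_file  utl_file.file_type;
begin
    -- 'w' = new file; 32767 is the maximum line size
    l_file := utl_file.fopen('DATA_DIR', 'output.txt', 'w', 32767);
    for r in (select col1 || '|' || col2 || '|' || col3 as line
                from some_table) loop
        utl_file.put_line(l_file, r.line);
    end loop;
    utl_file.fclose(l_file);
exception
    when others then
        if utl_file.is_open(l_file) then
            utl_file.fclose(l_file);
        end if;
        raise;
end;
/
```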
        
        Mark Strickland
        Seattle, WA
        

