Re: Performance problem: Loading data via insert

  • From: Abhishek Gurung <abhishek.gurung@xxxxxxxxxxx>
  • To: Oracle Freelist <oracle-l@xxxxxxxxxxxxx>
  • Date: Fri, 14 Mar 2014 14:37:01 +0530




Hi Mark,
Thanks for replying.
Below are my comments respective to your queries:
1)    Do you have any instrumentation in your application that informs you as 
each step completes?a.     for
 example, some form of output that you’re about to execute a create 
table statement before you do that and that it has completed when it has
 before you move on to the next DDLb.    especially
 wondering if maybe sometimes you’re adding a not null column to an 
existing table containing many rows on the “real” system, while the 
existing tables are empty or small on the test system.
Abhishek: Yes, and we have already omitted the time taken to parse XML, 
creating and altering tables and creating insert statements. This all process 
took at most half hour. Only the insert statement is taking about 7-8 hrs.

2)    Where does the file exist with respect to your database server?a.     for
 example if the generated file is remote from the database server, you 
may be unintentionally measuring throughput and latency of a network 
connection.b.    is
 the network connection different when considering the difference 
between your “fast” experience on your test server and the “slow” 
experience for production?3)    Where is your application program executing?a.  
   some program running on the database server?b.    some program running on a 
remote client?                                                    i.     
directly connected to the database server?                                      
            ii.     connecting through some application server or web services 
layer?Abhishek: Everything resides on database server. File, application is 
basically a java program imported in to Oracle database. We have created 
procedure on those Java programs.

4)    Are you generating a list of one row inserts?a.     if
 so, you would probably be better off generating something in a format 
that could either be read as an external table or by loaderb.    if
 so, then you probably don’t want an instrumentation statement before 
and after each insert. I’d start with something like before and after 
each of your DDL statements and then before and after the first insert 
to each different table and then after perhaps 100 rows and after then 
end of a given table. Then at least you’ll know if the delay is with 
respect to some individual DDL or some particular table.Abhishek: Yes. Let me 
see what we can do on this line. Thanks for the suggestion.

5)    If
 you’re using sqlplus, did you set arraysize to some reasonable value in
 your job stream, or are you relying on a glogin file or default that 
might be different on the two servers?Abhishek: I need to check that.

From: mwf@xxxxxxxx
To: abhishek.gurung@xxxxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: Performance problem: Loading data via insert
Date: Fri, 14 Mar 2014 04:42:33 -0400

Okay, so I think you have written some application. When some XML file arrives, 
your application parses the XML file and converts the file into some set of 
insert statements.As the application parses the XML file, it apparently also 
checks for table and column references and there is some process by which you 
discern whether the table and column references already exist.If a table does 
not exist you generate a create table statement and if a column reference does 
not exist you generate an alter table statement to add the column 
reference.Then you use something to execute the sql file (does that mean you 
run sqlplus and @somefile ?) Now, in the previous paragraph there a lot of 
references to “some” that are undefined. They are probably obvious to you, but 
I’d be just guessing (such as the example guess of sqlplus @ for “we execute 
this sql file.”) so I hope you don’t think I’m silly for asking. Speculative 
wondering:1)    Do you have any instrumentation in your application that 
informs you as each step completes?a.     for example, some form of output that 
you’re about to execute a create table statement before you do that and that it 
has completed when it has before you move on to the next DDLb.    especially 
wondering if maybe sometimes you’re adding a not null column to an existing 
table containing many rows on the “real” system, while the existing tables are 
empty or small on the test system.2)    Where does the file exist with respect 
to your database server?a.     for example if the generated file is remote from 
the database server, you may be unintentionally measuring throughput and 
latency of a network connection.b.    is the network connection different when 
considering the difference between your “fast” experience on your test server 
and the “slow” experience for production?3)    Where is your application 
program executing?a.     some program running on the database server?b.    some 
program running on a remote client?                                             
       i.     directly connected to the database server?                        
                          ii.     connecting through some application server or 
web services layer?4)    Are you generating a list of one row inserts?a.     if 
so, you would probably be better off generating something in a format that 
could either be read as an external table or by loaderb.    if so, then you 
probably don’t want an instrumentation statement before and after each insert. 
I’d start with something like before and after each of your DDL statements and 
then before and after the first insert to each different table and then after 
perhaps 100 rows and after then end of a given table. Then at least you’ll know 
if the delay is with respect to some individual DDL or some particular table.5) 
   If you’re using sqlplus, did you set arraysize to some reasonable value in 
your job stream, or are you relying on a glogin file or default that might be 
different on the two servers?  Now it is indeed possible there is something 
“broken” about your database, but that should show up in things like the alert 
log and/or performance monitoring and/or lock reports on the 
database.Instrumenting your application, even just for the elapsed time of 
major steps, should give you a pretty good idea where to look next. Good luck, 
mwf  From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] 
On Behalf Of Abhishek Gurung
Sent: Friday, March 14, 2014 3:16 AM
To: Oracle Freelist
Subject: RE: Performance problem: Loading data via insert Hi Veerabasai​ah, 
Thanks for replying.What we do is data comes in XML files we parse them convert 
them into insert statement and load those data intotables. Each of these file 
contains data for multiple tables. before inserting data we also Create and 
alter tables in database,if we find any new tables or columns in XML file. Then 
we execute this sql file containing insert statements to loaddata into 
different tables.The loading process for all the databases instances start 
almost at same time(as soon as they receive the xml file).We have even try to 
load the files while no other database loading was going on but still the 
loading is taking same time. So we conclude that there is something wrong with 
this database instance but unable to identify it. RegardsAbhishek Date: Fri, 14 
Mar 2014 10:01:25 +1300
Subject: Re: Performance problem: Loading data via insert
From: veeracb@xxxxxxxxx
To: sameer.choudhari@xxxxxxxxx
CC: carlos.sierra.usa@xxxxxxxxx; abhishek.gurung@xxxxxxxxxxx; 
oracle-l@freelists.orgDo you always load the data or cleanup some old data 
before inserting ?
What else is running at the same time the load is running, is something 
blocking this load ? As Carlos suggested, it would be better if you get more 
data around the problem and try identify where your time is going. On Fri, Mar 
14, 2014 at 9:42 AM, Sameer Choudhari <sameer.choudhari@xxxxxxxxx> wrote:Try 
tuning your insert statements with hints like PARALLEL NOLOGGING . 

On Friday, 14 March 2014, Carlos Sierra <carlos.sierra.usa@xxxxxxxxx> 
wrote:Abhishek, As always, start with the evidence:What facts have you 
collected? What is AWR telling you? What are the SQL Traces telling? Which 
other diagnostics tools have you used for your case?
Cheers,

Carlos Sierra
blog: carlos-sierra.net
twitter: @csierra_usa

Life's Good! On Mar 13, 2014, at 6:52 AM, Abhishek Gurung 
<abhishek.gurung@xxxxxxxxxxx> wrote: Hi

We are facing a performance problem but unable to identify where is the problem.

We have a production database server where about 8 database instances of oracle 
is running.
Daily we insert in each of these database instances about 1-5 GB of data in 
different tables.
All database instances are working fine where we are able to load around 2-3 GB 
of data within 1hour except 
1 where it is taking about 8 hrs to load only 4 GB of data.

When I try to load the same 4GB of data in another database where only one 
instance is running and with
very less configuration as compared to the Production server we were able to 
load it in 1hr 45 minutes.

Can anyone suggest how can we proceed to identify the problem?

Regards
Abhishek  -- 
Sent from Google Nexus

-- Veerabasaiah C B"Only put off until tomorrow what you are willing to die 
having left undone. - Picasso"                                                  
                              

Other related posts: