Optimisation of mass INSERTs

From: Anton Shepelev <anton.txt@xxxxxxxxx>
To: oracle-l@xxxxxxxxxxxxx
Date: Sat, 7 Nov 2020 00:45:10 +0300

Hello, all.

I am working on a program that should effect transactional
upload of massive amounts of data into Oracle tables using
managed ODP.NET driver. Upon initial experiments, I have
settled on transferring data in a series of "batches", each
of which is a parametrised INSERT ALL statement of the form:

INSERT ALL
   INTO TESTTAB( TEXTDATA1, TEXTDATA2, DECIMALDATA, DOUBLE)
   VALUES( :val_0_0, :val_0_1, :val_0_2, :val_0_3, :val_0_4 )

   INTO TESTTAB( TEXTDATA1, TEXTDATA2, DECIMALDATA, DOUBLE)
   VALUES( :val_1_0, :val_1_1, :val_1_2, :val_1_3, :val_1_4 )

   INTO TESTTAB( TEXTDATA1, TEXTDATA2, DECIMALDATA, DOUBLE)
   VALUES( :val_2_0, :val_2_1, :val_2_2, :val_2_3, :val_2_4 )

   INTO TESTTAB( TEXTDATA1, TEXTDATA2, DECIMALDATA, DOUBLE)
   VALUES( :val_3_0, :val_3_1, :val_3_2, :val_3_3, :val_3_4 )
SELECT * FROM DUAL

with a customisable number of rows. My tests show that it is
at least 1.5 times more efficient than a batch of equivalent
INSERT statements inside a BEGIN..END block. The performance
of this method on the client's site manifests two unusual
tendencies, which I can't explain. I suppose them connected
somehow with the length of the network route between client
and server, which takes 36 ms to ping, but am still at a
loss as to the exact explanation.

The first anomaly is the effect of batch size. For each
target table and dataset, there appears to be a single
optimal number of rows in a batch, so that both smaller and
larger batches work considerably slower. Performace
measurements for two tables with various batch sizes are
shown below, the respective optimums put in brackets (view
in a fixed-width font):

                Table A             Table B
           +----------------+  +----------------+
           |Rows: 100 000   |  |Rows: 6000      |
           |Cols: 20        |  |Cols: 60        |
           +----------------+  +----------------+
           | batch  time, s |  | batch  time, s |
           |   16     252   |  |    8      54   |
           |   64      65   |  |   16      36   |
           |  256      30   |  |  [32]     18   |
           | [384]     26   |  |   64      19   |
           |  512      31   |  |  128      43   |
           | 1024      90   |  |  256     318   |
           +----------------+  +----------------+

I can understand why smaller batches are inefficent: it must
be due to the overhead of frequent call-and-response between
client and server, but why performace starts to degrade
again as batch size increases past the optimum?

The second anomaly occurs when the network becomes unstable.
During these intervals, which occur once or twice a week,
`ping' shows that a considerable portion of packets are lost
and the percentage of lost packets grows with packet size.
At these times, the general pattern with a single optimum
remains, but the optimal batch size is much lower than when
the network is feeling well. Again, I have no idea what may
be causing this.

I conjecture that it is a case of leaky abstractions[1]:
since the ODP.NET driver cannot entirely abstract the
ADO.NET interface from the gritty details of network data
transfer, batch size affects both the size and the number of
network packets required to send one batch, which in turn
has a tremendous effect on throughput. But what can I do
about it? If my conjecture be right, I have to optimise the
serialised size of the batch, rather than the number of rows
in it, and I cannot reliably to that because the
serialisation algorithm is not officially documented.

I have considered prepared statments, which should save
about 60% of traffic, but Oracle have chosen not to
implement the .Prepare() method provided by ADO.NET. The
official documentation describes the implementation as "no-
op":

   ODP.NET: OracleCommand Public Methods:
   https://docs.oracle.com/database/121/ODPNT/OracleCommandClass.htm#g1099930

It is noteworthy that JDBC, on the contrary, does support
parametrised statements:

   JDBC: Using Prepared Statements
   https://docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html

In spite of what some experts say about their obsolescense
due to improved server-side caching of compiled statements,
prepared statements remain superior because they save
network traffic by letting the client send the statement id
instead of the entire SQL text. I have not yet tried
preparing them manually via PL/SQL as descrbied here:

   EXECUTE and PREPARE:
   https://docs.oracle.com/cd/B10501_01/appdev.920/a97269/pc_13dyn.htm#865

but in any case it can only improve performace by a constant
factor, but cannot remove the need of optimising batch size
for each table, data sturcutre, and network condition.

I have also tried a dynamic optimiser that measured
performance in real-time and adjusted batch size
accordingly, but it was not very efficient because I do not
know how to predict the serialised batch size, in terms of
bytes actually sent over the network.

While reading performace-related documention, I came upon
the TCP.NODELAY setting:

   TCP.NODELAY
   https://docs.oracle.com/cd/E18283_01/network.112/e10835/sqlnet.htm#CIHCAJGA

Is that article correct in that it is `yes' by default? It
should seem strange, because all the commentor about this
setting discuss enabling it as very non-standard, e.g. Don
Burleson writes:

,----[http://www.dba-oracle.com/art_builder_tns.htm:]
| Oracle Net, by default, waits until the buffer is filled
| before transmitting data. Therefore, requests aren't
| always sent immediately to their destinations. This is
| most common when large amounts of data are streamed from
| one end to another, and Oracle Net does not transmit the
| packet until the buffer is full.
`---------------------------------

Who is correct, and how can I test the actual default value?
Do you think TCP.NODELAY could optimise massive data
transfers? I am asking this because I can't easily test at
client's site because of bureaucracy, while local testing is
nearly useless in our fast local network.

Antoher article, about TCP/IP optimisation in general,
mentions the TCP_ACK setting in connexion with TCP_NODELAY:

   Best Practices for TCP Optimization in 2019:

https://www.extrahop.com/company/blog/2016/tcp-nodelay-nagle-quickack-best-practices/

Can it, in your opinion, affect massive data transfers, and
if it can, then how to set it for Oracle connections?

For your reference, I already asked this question on DBA
StackExchange, but it drew no replies, and does not seem to
be going to:

   Anomalous performance of mass INSERT in Oracle:

https://dba.stackexchange.com/questions/278185/anomalous-performance-of-mass-insert-in-oracle

Can you please help me diagnose and explain these anomalies,
and possibly solve the problem of unstable peformance of
mass insertions in Oracle? What monitoring tools can I use
to determine the bottleneck?
____________________
1. https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/

--
()  ascii ribbon campaign -- against html e-mail
/\  http://preview.tinyurl.com/qcy6mjc [archived]
--
//www.freelists.org/webpage/oracle-l

Follow-Ups:
- Re: Optimisation of mass INSERTs
  - From: Jonathan Lewis
- RE: Optimisation of mass INSERTs
  - From: Mark W. Farnham

Optimisation of mass INSERTs

Other related posts: