Re: Stuck session help

  • From: Tim Johnston <tjohnston@xxxxxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Wed, 07 Apr 2004 19:11:03 -0400

Hi Dan...

  Ok...  I ran into an issue on 8.1.7 last year that sounded similar to 
what you encountered...  The end result was a bug in the JDBC 
driver...  Even if you're on a newer release, I'd give this situation a 
look if you are using JDBC...  If you're not using JDBC, this likely 
won't apply...  Now, on to the problem...

Ok...  The problem is that JDBC sends a null packet to the server...  
The server expects that the client will send more information (hence the 
"more data from client" wait) while the application is waiting on a 
return from the server...  Hence the hang...  Both sides are waiting on 
the other one...  It's somewhat unpredictable since the packet has to be 
"just right" in size for this to happen...
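
To make the mutual wait concrete, here is a toy model in Java. It is only a sketch: the two queues stand in for the network connection, the class and method names are made up for illustration, and the timeouts exist purely so the demo terminates (in the real hang neither side gives up).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class NullPacketHang {
    /** Returns true if the client received a reply, false if both sides stalled. */
    static boolean exchange(byte[] payload, long timeoutMs) {
        BlockingQueue<byte[]> toServer = new LinkedBlockingQueue<>();
        BlockingQueue<byte[]> toClient = new LinkedBlockingQueue<>();

        Thread server = new Thread(() -> {
            try {
                byte[] packet = toServer.poll(timeoutMs, TimeUnit.MILLISECONDS);
                // A zero-length ("null") packet makes the server go back and
                // wait for more data from the client instead of replying.
                while (packet != null && packet.length == 0) {
                    packet = toServer.poll(timeoutMs, TimeUnit.MILLISECONDS);
                }
                if (packet != null) {
                    toClient.put(new byte[] {1}); // reply to a real request
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        server.start();

        try {
            // The client has finished writing, so it now waits for the reply.
            toServer.put(payload);
            byte[] reply = toClient.poll(3 * timeoutMs, TimeUnit.MILLISECONDS);
            server.join();
            return reply != null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("normal packet answered: " + exchange(new byte[] {42}, 200)); // true
        System.out.println("null packet answered:   " + exchange(new byte[0], 200));     // false
    }
}
```

With a normal packet the server replies right away; with the empty one the server sits waiting for more data while the client sits waiting for a reply, which is the hang described above.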

Bug number 1956339 has a nice summary of the situation...  This bug 
discusses a special situation with a Japanese character set but it can 
happen in other situations...  I included the internal discussion of bug 
1956339 below...  The interesting bit is:


"This null packet received by the server makes the server to go back and 
read the next packet (NOTE: Null packet had is used for getting server 
info, this is not what JDBC is intending to do). Since JDBC is also done 
with the write it waits for read from the server and server waits for 
read from JDBC leading to a hang (deadlock)."


Cool, huh?  Another bug that discusses something similar is 1483556...  
It was a 7.3.4 bug listed as fixed in 9 and backported to 8.1.7.1...  
It talks about how to track it down with a SQL*Net trace...


"How to determine rediscovery:

The key here is that the thin driver was sending null SQL*Net packets to 
the server. This can be seen in the SQL*Net trace file by looking for 
lines similar to the following:

nsprecv: 10 bytes from transport
nsprecv: tlen=10, plen=10, type=6
nsprecv:packet dump
nsprecv:00 0A 00 00 06 00 00 00  |........|
nsprecv:00 00 00 00 00 00 00 00  |........|
nsprecv: normal exit
nsrdr: got NSPTDA packet
nsrdr: NSPTDA flags: 0x0
nsrdr: normal exit
nsdo: got "null" packet

Executing batch operations against a 7.3.4 server that have a large 
amount of bind variables, and include some NULL values, could hang when 
using the thin driver."
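
If the trace file is large, the check can be automated. The following is a minimal sketch, assuming the trace is plain text and that your SQL*Net release writes the same `nsdo: got "null" packet` line as the sample quoted above (other releases may word it differently). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class NullPacketTraceScan {
    // Marker copied from the trace sample in the bug text; assumed stable.
    private static final String MARKER = "nsdo: got \"null\" packet";

    /** Returns the 1-based numbers of the lines that report a null packet. */
    static List<Integer> findNullPackets(List<String> traceLines) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < traceLines.size(); i++) {
            // contains() rather than equals(): real trace lines may carry
            // timestamps or other prefixes in front of the message.
            if (traceLines.get(i).contains(MARKER)) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java NullPacketTraceScan <trace-file>");
            return;
        }
        // ISO-8859-1 maps every byte to a character, so odd bytes in the
        // packet dumps cannot cause a decoding error.
        List<String> lines =
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        for (int lineNumber : findNullPackets(lines)) {
            System.out.println("null packet reported at line " + lineNumber);
        }
    }
}
```

A hit is only a pointer: look at the lines just before it for the 10-byte, type=6 packet shown in the sample.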


FYI...  In case you're curious, here is the full text of the internal 
description of bug 1956339:


Internal Description:

In a few places we have 'outBuffer.write(tmpBuffer, 0, bytes);' where
tmpBuffer is a byte array and 0 is the offset and bytes is number of bytes
to be written. When the number of bytes to be written is 0 we need not
write anything to the lower layer to avoid roundtrips. Thus avoiding null
packets to be written.

Consider the scenario when the number of bytes in the buffer (DataPacket)
is 2047 and marshallSB2 is called,
+ 2 bytes are allocated
+ The two bytes are passed to value2Buffer for splitting the value into a
  byte array, and for any conversion if needed (i.e LSB)
  In the UNIVERSAL case, it adds the number of data bytes to the buffer,
  before the actual value.
+ here if the value which is sent to value2Buffer is UNIVERSAL an extra
  byte is written into the buffer making the full (isBufferFull = true)
+ After this when write is called with the number of bytes to be written
  as 0, the null packet is sent across.

This null packet received by the server makes the server to go back and
read the next packet (NOTE: Null packet had is used for getting server
info, this is not what JDBC is intending to do). Since JDBC is also done
with the write it waits for read from the server and server waits for read
from JDBC leading to a hang (deadlock).

As you might have already realised the problem lies both in JDBC and
JavaNet which are as follows,

JavaNet:=
Update the status of isBufferFilled, even if the number of bytes to be
written is 0 (in putDataInBuffer). Since putDataInBuffer is not invoked
everytime the write is called thus leading to stale isBufferFilled (which
is wrong).

JDBC:=
Do not make a roundtrip to JavaNet when the number of bytes to be written
is 0.

The JDBC fix does fix the bug but I will be filing a new bug against
JavaNet to make sure that the bug in their code is also fixed.

The filed JavaNet bug number is bug 2069280.

Resolution:
Do not write null packets or do not invoke write on JavaNet when you have
0 bytes to write.
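
The resolution boils down to a guard in front of the lower layer. Here is a rough sketch of that idea in Java. It is not the driver's actual code: `PacketSink` is a hypothetical stand-in for JavaNet, and every call that reaches it is treated as one packet on the wire.

```java
import java.util.ArrayList;
import java.util.List;

final class ZeroLengthGuard {
    /** Hypothetical stand-in for the network layer: one call = one packet. */
    interface PacketSink {
        void write(byte[] buf, int offset, int length);
    }

    /** Wraps a sink so that zero-length writes never reach it. */
    static PacketSink guarded(PacketSink lower) {
        return (buf, offset, length) -> {
            if (length > 0) {
                lower.write(buf, offset, length);
            }
        };
    }

    /** Issues one write per requested length; returns the sizes actually sent. */
    static List<Integer> packetSizes(boolean guard, int... lengths) {
        List<Integer> sent = new ArrayList<>();
        PacketSink recorder = (buf, offset, length) -> sent.add(length);
        PacketSink sink = guard ? guarded(recorder) : recorder;
        byte[] tmpBuffer = new byte[16];
        for (int bytes : lengths) {
            // Same call shape as the one quoted in the bug text.
            sink.write(tmpBuffer, 0, bytes);
        }
        return sent;
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + packetSizes(false, 2, 0, 1)); // [2, 0, 1]
        System.out.println("guarded:   " + packetSizes(true, 2, 0, 1));  // [2, 1]
    }
}
```

The 0 in the unguarded output is the null packet; with the guard in place it never leaves the client.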


HTH
Tim

Daniel Fink wrote:

>We've got an application that is having problems with the
>processing becoming 'stuck'. I'm now stuck trying to figure out
>why. The process is a series of inserts into several tables.
>Quite a few of the inserts succeed, eventually one of them
>sticks.
>
>It sticks before the execute phase of the insert. It is not
>waiting on a lock (but blocking others). There is plenty of free
>space in the datafile.
>
>Any clues? Ideas?
