Is it reproducible ? If I had to guess I'd look at network routing - e.g. did the session start trying to use the wrong network address on each fetch, timeout, and fall back to the correct address ? Regards Jonathan Lewis http://jonathanlewis.wordpress.com @jloracle ________________________________ From: Rich [richa03@xxxxxxxxx] Sent: 24 September 2014 21:23 To: Jonathan Lewis Cc: Oracle-L Freelists Subject: Re: 10046 L12 - Very long ela for 'SQL*Net message from client'? Thanks, Matthew and Jonathan. The autorace and 10046 are from separate sessions - I let the AT run overnight and killed the 10046 part way through. The client (for both) is me on the DB machine at a SQL Plus prompt with a dedicated connection. I'm starting to suspect the OS...but where? On Wed, Sep 24, 2014 at 1:03 PM, Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx<mailto:jonathan@xxxxxxxxxxxxxxxxxx>> wrote: Something's not consistent here. You have an array size of 15, a wait of 16 seconds between fetches, and 207,906 rows fetched - which should total about 2.5 days to complete ! How many of the waits were about 16 seconds, did they all appear in a single batch. When you ran autotrace was the client elapsed time consistent with the server trace time ? What did the tkprof output give as the "time in the trace file". Regards Jonathan Lewis http://jonathanlewis.wordpress.com @jloracle ________________________________ From: oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx> [oracle-l-bounce@xxxxxxxxxxxxx<mailto:oracle-l-bounce@xxxxxxxxxxxxx>] on behalf of Rich [richa03@xxxxxxxxx<mailto:richa03@xxxxxxxxx>] Sent: 24 September 2014 17:38 To: Oracle-L Freelists Subject: Fwd: 10046 L12 - Very long ela for 'SQL*Net message from client'? Hi List, This is 11.1.0.7 on Oracle Linux 5 U3 (Enterprise Linux Server release 5.3 (Carthage) (kernel 2.6.18-128.el5)) rpm -qf /etc/redhat-release enterprise-release-5-0.0.17 (I know the version is “very” old, but that’s where we are right now.) I seem to remember a post by Tanel (or was it Kyle?) a while back about this/these types of waits (SQL*Net), however, my Google fu is not working for me right now. Snip from 10046 level 12 trace from a dedicated SQL*Plus session on the DB server: … *** 2014-09-24 08:03:20.151 WAIT #2: nam='SQL*Net message from client' ela= 16701919 driver id=1650815232 #bytes=1 p3=0 obj#=1237502 tim=1411571000151138 WAIT #2: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=1237502 tim=1411571000151302 FETCH #2:c=1000,e=0,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=133634354,tim=1411571000143712 *** 2014-09-24 08:03:36.853 WAIT #2: nam='SQL*Net message from client' ela= 16702053 driver id=1650815232 #bytes=1 p3=0 obj#=1237502 tim=1411571016853884 WAIT #2: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=1237502 tim=1411571016854041 FETCH #2:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=133634354,tim=1411571016838765 *** 2014-09-24 08:03:53.541 WAIT #2: nam='SQL*Net message from client' ela= 16686602 driver id=1650815232 #bytes=1 p3=0 obj#=1237502 tim=1411571033541263 WAIT #2: nam='SQL*Net message to client' ela= 2 driver id=1650815232 #bytes=1 p3=0 obj#=1237502 tim=1411571033541424 FETCH #2:c=1000,e=0,p=0,cr=0,cu=0,mis=0,r=15,dep=0,og=1,plh=133634354,tim=1411571033533874 … From client is consistently ~16sec! Autotrace of the statement looks like: … 207963 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 133634354 ----------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5759 | 2991K| | 16679 (1)| 00:03:21 | |* 1 | TABLE ACCESS FULL | PFM_AUDIT_LOG_HDR | 1 | 22 | | 309 (1)| 00:00:04 | | 2 | SORT ORDER BY | | 5759 | 2991K| 9224K| 16679 (1)| 00:03:21 | |* 3 | HASH JOIN | | 5759 | 2991K| | 16026 (1)| 00:03:13 | |* 4 | INDEX FAST FULL SCAN | PFM_COMPANY_AI1 | 331 | 11916 | | 60 (2)| 00:00:01 | |* 5 | TABLE ACCESS BY INDEX ROWID| PFM_POS | 17428 | 8441K| | 15965 (1)| 00:03:12 | |* 6 | INDEX RANGE SCAN | PFM_POS_AI1 | 18366 | | | 8013 (1)| 00:01:37 | ----------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("MODEL_TYPE_ID"='POS' AND TO_NUMBER("DOCUMENT_OID")=:B1) 3 - access("POS"."COMPANY_ENSOID"="VER_ENSOID") 4 - filter(UPPER("NAME") LIKE 'SOME STRING' ESCAPE AND "VER_LATEST_VEROID" IS NOT NULL) 5 - filter("POS"."TRANS_SUBTYPE_ID"='4') 6 - access("POS"."TRANS_END_DATE">=TO_DATE(' 2011-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "POS"."TRANS_END_DATE"<=TO_DATE(' 2014-03-31 23:59:59', 'syyyy-mm-dd hh24:mi:ss')) filter(UPPER(TO_CHAR("POS"."HIDDEN")) LIKE '0' ESCAPE AND "POS"."VER_LATEST_VEROID" IS NOT NULL) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 264266641 consistent gets – I know it might be better to just FTS the POS table rather than use the index, however, I’m trying to understand why the long ‘SQL*Net message from client’ waits 0 physical reads 0 redo size 45412064 bytes sent via SQL*Net to client – this is “only” ~43MB 157869 bytes received via SQL*Net from client - ~150KB 13866 SQL*Net roundtrips to/from client – might be able to reduce this with arraysize (currently 15) and/or SDU/TDU/IO Buffers (currently default AFAIK), but, still why the long ‘SQL*Net message from client’ waits 1 sorts (memory) 0 sorts (disk) 207963 rows processed The POC table is very wide (533 columns with type spread of 237 number, 17 date and 213 varchar2 – no LOB) - AVG_ROW_LEN is 496 from DBA_TABLES (last analyzed 15SEP2014). The PFM_AUDIT_LOG_HDR table is very small (1126 blocks from DBA_TABLES – last analyzed 22SEP2014). We could do a SQL Net trace, however, on this version, I think it’d be have to be for all traffic which is not doable in the current environment… Also, there’s no strace on production (where I am right now). Might be able to try to repro in DEV, however, it might be difficult to get time from them… Any help would be greatly appreciated. Thanks, Rich