Re: seconds_in_wait

  • From: Jay Hostetter <hostetter.jay@xxxxxxxxx>
  • To: "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
  • Date: Tue, 22 Sep 2015 13:18:55 -0400

Thank you for the responses. I have this issue in two databases. I
bounced one, and now it is fine. I have not yet bounced the other, and my
query continues to catch the issue. Now for testing in this database, per
Stefan's suggestion:

SQL> run
1 select s.indx, w.kslwtstime, w.kslwtltime,
decode(w.kslwtinwait,0,round((w.kslwtstime+w.kslwtltime)/1000000),
2 round(w.kslwtstime/1000000)) as SEC, e.kslednam
3 from x$ksuse s, x$ksled e, x$kslwt w
4* where bitand(s.ksspaflg,1)!=0 and bitand(s.ksuseflg,1)!=0 and
s.indx=w.kslwtsid and w.kslwtevt=e.indx and s.indx =47

INDX KSLWTSTIME KSLWTLTIME SEC KSLEDNAM
---------- ---------- ---------- ----------
----------------------------------------------------------------
47 121597780 0 122 PL/SQL lock timer


So it looks like the wait time is not unreasonable for this session that is
calling dbms_lock.sleep.

SQL> oradebug setmypid
oradebug dumpvar sga ksudbrmseccnt
ub4 ksudbrmseccnt_ [06000ABA0, 06000ABA4) = 56017B56

in decimal: 1442937686
epoch time: 4:01:26 pm UTC Tuesday 9/22/15

And the time seems correct.

strace on vktm shows no gettimeofday calls.

I'll continue to poke around.

Thank you,
Jay

On Mon, Sep 21, 2015 at 12:22 PM, Stefan Koehler <contact@xxxxxxxx> wrote:

Hi Jay,
well at first you may need to know how seconds_in_wait is calculated.

The column seconds_in_wait basically contains the start time of the wait
event <X> in a memory location and when you query the view gv$session it is
calculated as "SGA_TIME - SECONDS_IN_WAIT". The (current) SGA time is
maintained by process VKTM in Oracle 11g and by LGWR up to Oracle 10g. So
basically things can go wrong by storing the starting time or the SGA
time, but i would expect the SGA time in this case.

Demo case

==================================================================================================================================
TEST@T11DB:15> exec DBMS_LOCK.SLEEP(500);

SYS@T11DB:177> select s.indx, w.kslwtstime, w.kslwtltime,
decode(w.kslwtinwait,0,round((w.kslwtstime+w.kslwtltime)/1000000),
round(w.kslwtstime/1000000)) as SEC, e.kslednam
from x$ksuse s, x$ksled e, x$kslwt w
where bitand(s.ksspaflg,1)!=0 and bitand(s.ksuseflg,1)!=0 and
s.indx=w.kslwtsid and w.kslwtevt=e.indx and s.indx = 15;
INDX KSLWTSTIME KSLWTLTIME SEC KSLEDNAM
---------- ---------- ---------- ----------
----------------------------------------------------------------
15 17367730 0 17 PL/SQL lock timer

SYS@T11DB:99> oradebug dumpvar sga ksudbrmseccnt
ub4 ksudbrmseccnt_ [06000ABA0, 06000ABA4) = 56002B2B

Hex 56002B2B = Dec 1442851627 = Epoch GMT: Mon, 21 Sep 2015 16:07:07 GMT

T11DB [oracle@OEL trace]$ ps -ef | grep 2111
oracle 2111 1 2 17:17 ? 00:01:31 ora_vktm_T11DB

[root@OEL ~]# strace -p 2111
Process 2111 attached - interrupt to quit
gettimeofday({1442852011, 432577}, NULL) = 0
gettimeofday({1442852011, 433533}, NULL) = 0
nanosleep({0, 1000000}, 0x7fff9be4e0d0) = 0
gettimeofday({1442852011, 436632}, NULL) = 0
gettimeofday({1442852011, 437370}, NULL) = 0


Dec 1442852011 = Epoch GMT: Mon, 21 Sep 2015 16:13:31 GMT

==================================================================================================================================

Please dump the SGA time or strace vktm and crosscheck the epoch date, if
you hit this issue again. Maybe your OS time (syscall gettimeofday()) flips
and returns a wrong value.

Best Regards
Stefan Koehler

Freelance Oracle performance consultant and researcher
Homepage: http://www.soocs.de
Twitter: @OracleSK

Jay Hostetter <hostetter.jay@xxxxxxxxx> hat am 21. September 2015 um
14:44 geschrieben:

(reposting with the correct email address )

I have a query that monitors for sessions that are "stuck" on non-idle
wait events.

select username, sid, event, seconds_in_wait
from gv$session
where wait_class != 'Idle'
and seconds_in_wait > 300

I've discovered that over time, this query will catch the the LGWR
session with a seconds_in_wait value that is extremely high for "log file
parallel write" (e.g.1442837632). There is no way that the log writer
is waiting this long. I've been trying to search for any known bugs related
to wait events. I just wanted to check with the list to see if anyone
has ever run into this issue or has any insight. The query seems to be
fine,
In this particular case, the database has been up for only 45 days.

Oracle EE 11.2.0.3.15 on SUSE Linux 11.

Thank you,
Jay

Other related posts: