Where is AR_AUTOREM_INTERIM stored, and what is the partitioning definition?
IF that statement is also the one that gets stuck with TS contention on temp,
then the plan of running just that statement by hand (and then rolling it back)
could be interesting.
IF BATCH_ID is not the partitioning column AND every unique BATCH_ID appears in
just one partition, then a setup step gathering statistics on a column group of
the partitioning column and BATCH_ID might give the optimizer the chance to run
the delete as a very simple single-partition plan.
IF BATCH_ID IS the partitioning column, it raises the question of why a truncate
partition is not generated (though EBS is so complicated it might not be
provably correct that truncating a partition is less contention-prone and less
work than a delete).
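For what it’s worth, the two shapes that idea takes can be sketched as follows
(the partition name P_BATCH_42 is hypothetical; the real names come from
DBA_TAB_PARTITIONS):

```sql
-- If a given BATCH_ID lives in exactly one known partition, the delete can
-- in principle be confined to that partition explicitly:
DELETE FROM AR_AUTOREM_INTERIM PARTITION (P_BATCH_42)
WHERE  BATCH_ID = :B1;

-- ... and where it is provably equivalent (the whole partition holds only
-- that one batch), a metadata-only truncate does far less work:
ALTER TABLE AR_AUTOREM_INTERIM TRUNCATE PARTITION P_BATCH_42;
```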
I’m wondering what Hang Analyze has to do with the situation you have
described… It’s not really a “hang,” and you’re going to need to get the
user(s) who are cancelling it to call you when they think it has run too long,
instead of cancelling it.
Very likely a temporary tablespace group will help overall even if it does not
solve this particular problem. That’s a “statmux” (statistical multiplexing)
improvement on the contention issue, even though it does not directly isolate
space allocation for a given process.
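For reference, the tablespace-group setup is only a few statements — a minimal
sketch, assuming hypothetical names (temp2, temp_grp) and a placeholder
tempfile path that would need to fit the site:

```sql
-- Create a second temporary tablespace and put it in a group; sessions
-- assigned the group can then spread temp allocations across the members.
CREATE TEMPORARY TABLESPACE temp2
  TEMPFILE '/u01/oradata/temp2_01.dbf' SIZE 10G
  TABLESPACE GROUP temp_grp;

-- Add the existing TEMP to the same group.
ALTER TABLESPACE temp TABLESPACE GROUP temp_grp;

-- A tablespace group can be named wherever a temporary tablespace can,
-- including as the database default.
ALTER DATABASE DEFAULT TEMPORARY TABLESPACE temp_grp;
```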
An analysis of the job logs *MIGHT* show some particular other job(s) that
is (are) always running concurrently with the failures (still user
cancellation, right?). Is the user cancelling a concurrent manager job or an
interactive job? IF this is the concurrent manager AND certain other jobs are
implicated coincident with the cancellations, maybe those jobs should be
configured to be in conflict in the scheduler and wait for each other to start.
Good luck.
mwf
From: Hameed, Amir [mailto:amir.hameed@xxxxxxxxxxxxxxx]
Sent: Saturday, November 18, 2023 5:15 PM
To: mwf@xxxxxxxx; jlewisoracle@xxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: [EXTERNAL] Re: enq: TS - contention (solved, or not?)
Thank you, Jonathan and Mark, for your input and suggestions. I will look into
implementing the Temporary Tablespace Group solution. What I know for now is:
* I opened an SR with Oracle, and after providing all the information
(TFA, AWR, and whatnot), the engineer suggested running Hang Analyze the next
time the issue appeared. There was no other suggestion from the engineer.
* ASH data showed that when job A finished successfully, it ran just the
following statement. The table mentioned is a partitioned table with 128
partitions. This table doesn’t have any indexes.
DELETE FROM AR_AUTOREM_INTERIM WHERE BATCH_ID = :B1
Thanks
From: oracle-l-bounce@xxxxxxxxxxxxx <oracle-l-bounce@xxxxxxxxxxxxx> On Behalf
Of Mark W. Farnham
Sent: Tuesday, November 14, 2023 12:09 PM
To: jlewisoracle@xxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: [EXTERNAL] Re: enq: TS - contention (solved, or not?)
IF they still use multiple logon names, multiple temporary tablespaces with
different names from TEMP is another option (which could be nicely tuned for
site load and job schedule if different users are in conflict.) Losing that
trivial ability was a downside for running all jobs from “APPS.”
IF this is a mixed EBS-and-other-things database, then even if EBS uses a
single logon name for job execution, either the EBS logon name or all the
other users (including folks who report on EBS things that use “TEMP” but are
different from the logon for the jobs) can be moved to a different “TEMP.”
Quite often lots of things use “TEMP” other than EBS, so this may be a useful
thing even if it is only a bifurcation.
IF there is a unified default logon name, then constructing multiple temporary
tablespaces with all the appropriate grants is possible, but quite a hassle.
I definitely like your idea of giving a tablespace group a try. Unless there
are multiple logon names for the jobs, that is the least monkeying around. It
might even be less monkeying around than different default temporary
tablespaces with multiple logon names.
As far as I know, they never implemented the enhancement request for “use
‘THIS’ tablespace as the temp for this session,” and temporary tablespace
groups were the “general” solution that quashed that user-control-oriented
solution to this and similar contention (as well as direction to SSD, when SSD
was usually a small portion of the available media farm).
Altering the default temporary tablespace for a user, submitting a job
immediately (which required keeping an open job slot), and then altering the
default temporary tablespace back (as another job, also submitted to an open
slot, after a pause long enough for the first job to have picked up the altered
temporary tablespace name) was a thing for a while that could make a big
difference for critical jobs. That was about i/o rate rather than contention,
but it should also work for contention. I think I remember that correctly, but
I’m not going to look it up. Oracle was never enthusiastic about going to that
level of engineering because it was anti-marketed as “complicated.”
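That juggling, as best I remember it, boils down to something like the
following (user and tablespace names are hypothetical):

```sql
-- Point the job owner at a dedicated temp tablespace just before
-- submitting the critical job into an open slot...
ALTER USER apps TEMPORARY TABLESPACE temp_batch;

-- ... then, after a pause long enough for that job to have picked up
-- the new setting, switch the default back for everything else:
ALTER USER apps TEMPORARY TABLESPACE temp;
```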
mwf
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On
Behalf Of Jonathan Lewis
Sent: Tuesday, November 14, 2023 10:20 AM
To: oracle-l@xxxxxxxxxxxxx
Subject: Re: [EXTERNAL] Re: enq: TS - contention (solved, or not?)
Mark, it seemed to be the Temp tablespace, so spreading objects is not an
option. Since it seems to be a bigfile tablespace, neither is adding more files
- which might otherwise address the concurrency.
Using a tablespace group, though, might bypass the problem.
Regards
Jonathan Lewis
(From my iPad mini; please excuse typos and auto-correct)
On 14 Nov 2023, at 13:52, Mark W. Farnham <mwf@xxxxxxxx> wrote:
Just curious. Now that JL and others have documented the way to get the answer
to your question, is your problem solved?
Reminding that this is a contention issue (apparently not an i/o throughput
problem), the trivial solution might be to un-dither objects from the
tablespace evenly into multiple tablespaces based on totals of segments
requested (and possibly released via truncate for some processes). It’s
possible that some future patches and upgrades may require that you move the
objects back to the default tablespaces.
IF the occurrence of contention is periodic on a workday, weekly, monthly,
quarterly, or annual basis, simply arranging in the scheduler to run the
biggest allocators and de-allocators at times askew from each other might be
enough.
mwf
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On
Behalf Of Hameed Amir ("amir.hameed")
Sent: Friday, November 10, 2023 1:47 PM
To: Jonathan Lewis
Cc: oracle-l@xxxxxxxxxxxxx
Subject: RE: [EXTERNAL] Re: enq: TS - contention
Hi Jonathan,
Thank you for the explanation. I looked through DBA_HIST views for information
on job A for periods when it finished successfully, and I found the following
statement in all those periods:
DELETE FROM AR_AUTOREM_INTERIM WHERE BATCH_ID = :B1
It is a standard Oracle EBS table. It is partitioned and has no index on it.
Can you please show me how to convert P2 into the tablespace name for the
event below?
Event = enq: TS - contention
P1 = 1414725635
P2 = 196611
P3 = 2
Thank you,
Amir
From: Jonathan Lewis <jlewisoracle@xxxxxxxxx>
Sent: Thursday, November 9, 2023 6:13 PM
To: Hameed, Amir <amir.hameed@xxxxxxxxxxxxxxx>
Cc: oracle-l@xxxxxxxxxxxxx
Subject: [EXTERNAL] Re: enq: TS - contention
I don't know all the options that might appear for the TS enqueue, but it is
requested during the creation and dropping of many (probably all) types of
segments. It's not restricted to temporary segments in the temporary
tablespace.
The p1, p2, p3 values for the event are:
p1 = name || mode (0x5453000n) - n is the mode, values 0 - 6
p2 = con_id || ts# (0xcccctttt) - typically top 4 / bottom 4 hex characters,
but the con_id might not be there
p3 = tablespace relative block address
If you can query v$lock, id1 will be the (con_id || ts#) and id2 will be the
block address.
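Decoding Amir’s values from earlier in the thread with that recipe takes only
a few lines — a sketch, not anything Oracle-supplied:

```python
def decode_ts_enqueue(p1, p2):
    """Split P1 into enqueue name + mode, and P2 into con_id + ts#."""
    # P1 is 0x5453000n: the top two bytes are the ASCII enqueue name ("TS"),
    # the low bits are the requested lock mode (0 - 6).
    name = chr((p1 >> 24) & 0xFF) + chr((p1 >> 16) & 0xFF)
    mode = p1 & 0xFFFF
    # P2 is 0xcccctttt: top 4 hex digits con_id (may be absent), bottom 4 ts#.
    con_id, ts_num = p2 >> 16, p2 & 0xFFFF
    return name, mode, con_id, ts_num

print(decode_ts_enqueue(1414725635, 196611))   # ('TS', 3, 3, 3)
```

So Amir’s wait is a mode-3 TS request against ts# 3 (in con_id 3), and the
tablespace name then comes from v$tablespace:
select name from v$tablespace where ts# = 3;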
For creating a segment Oracle gets the TS lock once in exclusive mode on the
segment header block.
For dropping a segment Oracle gets the TS lock twice in exclusive mode, first
for the segment header block, then for the first "data" block of the segment.
It seems unlikely that the SQL you've shown would lead to any type of segment
creation - possibly, in a RAC system with a large number of PDBs, object types,
and users, some feature of the view definition might cause a SYS-recursive
query to generate a very bad plan with a huge temporary segment - but that
would be a little surprising.
Is there anything you know about procedures A and B that would lead to
excessive numbers of segments being rapidly created and dropped, possibly
hitting some strange boundary condition (e.g. lots of segments being created -
which takes the TS enqueue - but the purge option not being specified on the
drop - which would mean the TS enqueue wouldn't be taken, but a background
process might start dropping segments from the recyclebin when space pressure
got too high)?
Another hypothetical problem could be excessive demand leading to automatic
file extensions and an undetected deadlock.
I think you need to work out what the procedures are doing around the time of
the fail, and see if that gives you any clues. Perhaps you could be guided by
the SQL_IDs in the ASH data in the couple of minutes leading up to the wait.
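A rough starting point for that ASH trawl (the two-minute window and the
grouping are illustrative; swap in dba_hist_active_sess_history if the period
has already aged out of memory):

```sql
-- SQL_IDs and waits sampled in the two minutes before a known failure time.
SELECT sql_id, top_level_sql_id, event, COUNT(*) AS samples
FROM   v$active_session_history
WHERE  sample_time BETWEEN :fail_time - INTERVAL '2' MINUTE AND :fail_time
GROUP  BY sql_id, top_level_sql_id, event
ORDER  BY samples DESC;
```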
Regards
Jonathan Lewis
On Wed, 8 Nov 2023 at 18:02, Hameed Amir <dmarc-noreply@xxxxxxxxxxxxx> wrote:
Hi,
The database version is 19.17.0.0, running on OEL8.
In our Oracle E-Business Suite production database, a batch job (A) runs and
spawns another job (B). Intermittently, job A runs longer and ends up getting
terminated by the user. ASH data shows that the process running job A awaited
the event "enq: TS - contention".
Based on the limited information I have found, this event seems related to the
TEMP tablespace. There is no SQL_ID associated with the session ID waiting on
the event. There is a TOP_LEVEL_SQL_ID, which shows the following SQL statement:
select directory_name from all_directories where directory_path = :1
Has anyone run into this issue? Any feedback will be greatly appreciated.
Thank you,
Amir