Re: Scheduler Jobs are not distributed according to OS-load on RAC nodes

From: Martin Berger <martin.a.berger@xxxxxxxxx>
To: Niall Litchfield <niall.litchfield@xxxxxxxxx>
Date: Tue, 11 Dec 2018 10:26:02 +0100

Hi Niall,

the service has these properties:

Service name: OUR_SERVICE_NAME
Server pool:
Cardinality: 4
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Global: false
Commit Outcome: false
Failover type:
Failover method:
TAF failover retries:
TAF failover delay:
Failover restore: NONE
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
Pluggable database name:
Maximum lag time: ANY
SQL Translation Profile:
Retention: 86400 seconds
Replay Initiation Time: 300 seconds
Drain timeout:
Stop option:
Session State Consistency:
GSM Flags: 0
Service is enabled
Preferred instances: INST1,INST2,INST3,INST4
Available instances:
CSS critical: no
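
To compare these settings across services or over time, the listing above can be turned into key/value pairs. The following is only a minimal sketch: it assumes the text has the "Key: value" shape shown above, and the names SERVICE_CONFIG and parse_service_config are made up for illustration, not part of any Oracle tool.

```python
# Excerpt of the property listing above, copied as plain text.
SERVICE_CONFIG = """\
Service name: OUR_SERVICE_NAME
Cardinality: 4
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
Service is enabled
Preferred instances: INST1,INST2,INST3,INST4
Available instances:
"""


def parse_service_config(text):
    """Return a dict of property name -> value (empty string when unset)."""
    props = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # e.g. "Service is enabled" carries no key/value pair
        key, _, value = line.partition(":")
        props[key.strip()] = value.strip()
    return props


props = parse_service_config(SERVICE_CONFIG)
clb_goal = props["Connection Load Balancing Goal"]
rlb_goal = props["Runtime Load Balancing Goal"]
# Split the comma-separated instance list, dropping empty entries.
preferred = [name for name in props["Preferred instances"].split(",") if name]
print(f"CLB goal={clb_goal}, RLB goal={rlb_goal}, preferred={preferred}")
```

This only reads the values back; it does not say anything about how the scheduler picks an instance.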

It is worth mentioning that the connections from which the jobs are
scheduled come from another DB via a DB link.

thank you,
Martin

On Tue, 11 Dec 2018 at 09:47, <niall.litchfield@xxxxxxxxx> wrote:

Hi Martin

What are the load balancing properties of the service set to?
On Tue, Dec 11, 2018 at 8:45 AM Martin Berger <martin.a.berger@xxxxxxxxx>
wrote:

Hi List,

I have a strange situation with a 4-node RAC - 12.2 (July 2018), Oracle
Linux 6.10:

After some time, one (or several) instances stop executing jobs.

Every hour we schedule a lot of one-time jobs to run a lot of data loads.
The jobs are scheduled by a master which takes care of dependencies - so a
job is only scheduled when all its dependencies are met, and it should run
as soon as resources (job processes) are available. (No dependencies are
defined in the DBMS_SCHEDULER framework.)

The jobs use a JOB_CLASS which has a dedicated SERVICE - this SERVICE is
available on all 4 instances. Stopping and starting the service on the
"idle" instance does not help.

NTP is fine according to cluvfy comp clocksync -n all.
instance_stickiness is TRUE (the default) - but I don't think this will
change anything, as our jobs run one-time only.

Does anyone know how to identify why some instances sometimes refuse to
run scheduled jobs?

What makes this decision, and can it be traced somehow to identify the
numbers on which the decision is based?

Any other suggestions?
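
One way to at least see which instances have stopped picking up jobs is to count job runs per instance. The sketch below is illustrative only: the rows in SAMPLE_RUNS are invented, and the helper names are hypothetical. In practice such (job_name, instance_id) pairs could be fetched from DBA_SCHEDULER_JOB_RUN_DETAILS, which records an INSTANCE_ID for each run. It shows where jobs ran, not why the scheduler chose that instance.

```python
from collections import Counter

# Invented sample rows: (job_name, instance_id). Not real data from this system.
SAMPLE_RUNS = [
    ("LOAD_A", 1),
    ("LOAD_B", 1),
    ("LOAD_C", 2),
    ("LOAD_D", 4),
    ("LOAD_E", 1),
]
# Instance numbers of the 4-node cluster described above.
EXPECTED_INSTANCES = [1, 2, 3, 4]


def runs_per_instance(rows, instances):
    """Count job runs per instance, including instances with zero runs."""
    counts = Counter(instance_id for _, instance_id in rows)
    return {inst: counts.get(inst, 0) for inst in instances}


def idle_instances(rows, instances):
    """Return the instances that executed no job in the given rows."""
    per_instance = runs_per_instance(rows, instances)
    return [inst for inst, n in per_instance.items() if n == 0]


print(runs_per_instance(SAMPLE_RUNS, EXPECTED_INSTANCES))
print(idle_instances(SAMPLE_RUNS, EXPECTED_INSTANCES))
```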

An SR with MOS is open, but without any progress.

Related documents found so far:

DBMS_SCHEDULER job doesn't fail-over across RAC instance (Doc ID 2365434.1)

RAC Node X Is Seeing A Higher Session Load Than The Other Nodes For
Scheduler Jobs (Doc ID 1602581.1)

ENH 28592547 - REAL-TIME LOAD BALANCING FOR JOBS ACROSS RAC INSTANCES

--
Martin Berger                Oracle ♠
martin.a.berger@xxxxxxxxx @martinberx
^∆x      http://berxblog.blogspot.com

--
Niall Litchfield
Oracle DBA
http://www.orawin.info

Follow-Ups:
- Re: Scheduler Jobs are not distributed according to OS-load on RAC nodes
  - From: Mladen Gogala

References:
- Scheduler Jobs are not distributed according to OS-load on RAC nodes
  - From: Martin Berger
- Re: Scheduler Jobs are not distributed according to OS-load on RAC nodes
  - From: niall . litchfield
