Re: Fun With Scan Listener

  • From: David Barbour <david.barbour1@xxxxxxxxx>
  • To: "Mark W. Farnham" <mwf@xxxxxxxx>
  • Date: Tue, 3 Jun 2014 17:53:35 -0500

Thanks guys.  Both the note and the permissions suggestion were helpful.
We added a database to the cluster.  There appears to have been some
confusion during the transition about how RAC, SCAN and VIP configuration
differs from a traditional standalone single-listener and tnsnames
configuration, as well as about command usage.  However, even when that was ironed
out, I still got the error.  It seems this database was configured with
2000 processes.  All the DBs on the RAC are owned by oracle.  Doc 579365.1 lit
up the dim bulb and I checked max user processes.  1024.  When I add up
everything that's connected and everything that needs to run just to start
the DBs & ASM, I come up with waaaaay more than 1024.  I'm greatly
surprised (and completely thankful) that this didn't crash and burn before
now.
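
For anyone who lands here with the same symptom: "OS error: 11" on Linux is EAGAIN, which is what fork() returns when the per-user process limit has been reached. Below is a minimal sketch of the comparison described above, not a definitive tool. It assumes the Grid Infrastructure and database owner is "oracle" and that it is run from a shell belonging to that owner; the function names nproc_headroom and count_user_threads are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: compare a user's process/thread count against the nproc limit.
# Assumption: the Grid Infrastructure and database owner is "oracle".

# Print how many more processes/threads fit under the limit.
# A negative number means the limit is already exceeded.
nproc_headroom() {
    local limit="$1" used="$2"
    if [ "$limit" = "unlimited" ]; then
        echo "unlimited"
    else
        echo $(( limit - used ))
    fi
}

# Count every thread owned by the user; on Linux each thread counts
# toward the per-user "max user processes" limit.
count_user_threads() {
    ps -L -u "$1" -o lwp= 2>/dev/null | wc -l
}

user="${1:-oracle}"
limit="$(ulimit -u)"    # soft limit of *this* shell, so run it as the owner
used="$(count_user_threads "$user")"
echo "user=$user limit=$limit used=$used headroom=$(nproc_headroom "$limit" "$used")"
```

Note that ulimit -u only reports the limit of the current shell. The limit actually in force for an already-running agent can be read from the "Max processes" line of /proc/<pid>/limits for that process, and the two may differ if the stack was started under a different environment.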

I love this list.


On Tue, Jun 3, 2014 at 5:00 PM, Mark W. Farnham <mwf@xxxxxxxx> wrote:

> Well this is sort of a hunt and poke around problem, but I would start
> with checking owner, group and file permissions and who owns the various
> processes that cannot be stopped or started.
>
>
>
> I'd check the setuid on the programs on rchr1p01 (guessing that is the
> node in question), and if maybe someone started something as root and that
> fubared some permissions.
>
>
>
> Nothing is jumping out of the logs, except I wonder about ohome(null), but
> I haven't seen that error message in context, so that might be okay.
>
>
>
> mwf
>
>
>
> *From:* oracle-l-bounce@xxxxxxxxxxxxx [mailto:
> oracle-l-bounce@xxxxxxxxxxxxx] *On Behalf Of *David Barbour
> *Sent:* Tuesday, June 03, 2014 5:27 PM
> *To:* oracle-l mailing list
>
> *Subject:* Fun With Scan Listener
>
>
>
> Oracle 11.2.0.3  RHEL 6.3  5-Node RAC
>
> This has me somewhat (okay -  totally) baffled.  I have a scan listener
> that is showing as follows when I run crsctl status resource -t:
>
>
> --------------------------------------------------------------------------------
> Cluster Resources
>
> --------------------------------------------------------------------------------
> ora.LISTENER_SCAN1.lsnr
>       1        ONLINE  UNKNOWN      rchr1p01
>
> If I check the status via srvctl I get the following:
>
>  $ srvctl status scan_listener
> SCAN Listener LISTENER_SCAN1 is enabled
> SCAN listener LISTENER_SCAN1 is not running
> SCAN Listener LISTENER_SCAN2 is enabled
> SCAN listener LISTENER_SCAN2 is running on node rchr1p02
> SCAN Listener LISTENER_SCAN3 is enabled
> SCAN listener LISTENER_SCAN3 is running on node rchr1p03
>
> So I try to start it:
>
>  $ srvctl start scan_listener -i 1
> PRCR-1079 : Failed to start resource ora.LISTENER_SCAN1.lsnr
> CRS-5013: Agent "/oracle/grid/11203/bin/oraagent.bin" failed to start
> process "/oracle/grid/11203/bin/lsnrctl" for action "clean": details at
> "(:CLSN00008:)" in
> "/oracle/grid/11203/log/rchr1p01/agent/crsd/oraagent_oracle/oraagent_oracle.log"
> CRS-5013: Agent "/oracle/grid/11203/bin/oraagent.bin" failed to start
> process "/oracle/grid/11203/bin/lsnrctl" for action "check": details at
> "(:CLSN00008:)" in
> "/oracle/grid/11203/log/rchr1p01/agent/crsd/oraagent_oracle/oraagent_oracle.log"
> CRS-2680: Clean of 'ora.LISTENER_SCAN1.lsnr' on 'rchr1p01' failed
>
> So I try to stop it:
>
>  $ srvctl stop scan_listener -i 1 -f
> PRCR-1065 : Failed to stop resource ora.LISTENER_SCAN1.lsnr
> CRS-5013: Agent "/oracle/grid/11203/bin/oraagent.bin" failed to start
> process "/oracle/grid/11203/bin/lsnrctl" for action "clean": details at
> "(:CLSN00008:)" in
> "/oracle/grid/11203/log/rchr1p01/agent/crsd/oraagent_oracle/oraagent_oracle.log"
> CRS-5013: Agent "/oracle/grid/11203/bin/oraagent.bin" failed to start
> process "/oracle/grid/11203/bin/lsnrctl" for action "check": details at
> "(:CLSN00008:)" in
> "/oracle/grid/11203/log/rchr1p01/agent/crsd/oraagent_oracle/oraagent_oracle.log"
> CRS-2680: Clean of 'ora.LISTENER_SCAN1.lsnr' on 'rchr1p01' failed
>
> So I give up and check the log.
>
> 2014-06-03 16:19:48.666: [ora.LISTENER_SCAN1.lsnr][2617243392]
> {1:53466:11150} [clean] clsn_agent::clean: Exception
> SclsProcessSpawnException
> 2014-06-03 16:19:48.666: [    AGFW][3623876352] {1:53466:11150} Agent
> sending reply for: RESOURCE_CLEAN[ora.LISTENER_SCAN1.lsnr 1 1] ID 4100:58347
> 2014-06-03 16:19:48.666: [ora.LISTENER_SCAN1.lsnr][2617243392]
> {1:53466:11150} [clean] (:CLSN00106:) clsn_agent::clean }
> 2014-06-03 16:19:48.666: [    AGFW][2617243392] {1:53466:11150} Command:
> clean for resource: ora.LISTENER_SCAN1.lsnr 1 1 completed with status: FAIL
> 2014-06-03 16:19:48.666: [    AGFW][3623876352] {1:53466:11150} Agent
> sending reply for: RESOURCE_CLEAN[ora.LISTENER_SCAN1.lsnr 1 1] ID 4100:58347
> 2014-06-03 16:19:48.666: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] LsnrAgent::check {
> 2014-06-03 16:19:48.666: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] lsnrctl status LISTENER_SCAN1
>
> 2014-06-03 16:19:48.666: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] getOracleHomeAttrib: oracle_home =
> /oracle/grid/11203
> 2014-06-03 16:19:48.666: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] getOracleHomeAttrib: oracle_home =
> /oracle/grid/11203
> 2014-06-03 16:19:48.666: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] Utils::getCrsHome crsHome /oracle/grid/11203
> 2014-06-03 16:19:48.667: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] Utils::execCmd 1
> USR_ORA_ENV:ORACLE_BASE=/opt/oracle oracleHome:/oracle/grid/11203
> CrsHome:/oracle/grid/11203
> 2014-06-03 16:19:48.667: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] Utils::getCrsHome crsHome /oracle/grid/11203
> 2014-06-03 16:19:48.667: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] Adding Environment Variables
> ORACLE_HOME=/oracle/grid/11203
> 2014-06-03 16:19:48.667: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] Adding Environment Variables
> TNS_ADMIN=/oracle/grid/11203/network/admin/
> 2014-06-03 16:19:48.667: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] Adding Environment variable from USR_ORA_ENV
> ORACLE_BASE=/opt/oracle
> 2014-06-03 16:19:48.667: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] Utils:execCmd action = 3 flags = 38 ohome = (null)
> cmdname = lsnrctl.
> 2014-06-03 16:19:48.667: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] getOracleHomeAttrib: oracle_home =
> /oracle/grid/11203
> 2014-06-03 16:19:48.667: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] (:CLSN00008:)Utils:execCmd scls_process_spawn()
> failed 1
> 2014-06-03 16:19:48.667: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] (:CLSN00008:) category: -2, operation: fork, loc:
> spawnproc28, OS error: 11, other: forked failed [-1]
> 2014-06-03 16:19:48.667: [   AGENT][3087005440] {1:53466:11150}
> UserErrorException: Locale is
> 2014-06-03 16:19:48.667: [ora.LISTENER_SCAN1.lsnr][3087005440]
> {1:53466:11150} [check] clsnUtils::error Exception type=2 string=
> CRS-5013: Agent "/oracle/grid/11203/bin/oraagent.bin" failed to start
> process "/oracle/grid/11203/bin/lsnrctl" for action "check": details at
> "(:CLSN00008:)" in
> "/oracle/grid/11203/log/rchr1p01/agent/crsd/oraagent_oracle/oraagent_oracle.log"
>
> I'm willing to go with the part about 'UserErrorException', except I'm not
> aware of what I'm doing wrong.  Looking through MOS docs but hoping someone
> has a suggestion?
>
