Archive

Archive for the ‘Real Application Cluster’ Category

Oracle Clusterware / Grid Infrastructure: How to determine the configured name of your cluster?

Problem description:

You need to know the cluster name that was defined during the installation of Oracle Clusterware / Oracle Grid Infrastructure, for example because you are configuring Oracle Enterprise Manager Database Control for one of your RAC databases via emca:

[oracle@racn01 ~]$ emca -config dbcontrol db -repos create -cluster
 
STARTED EMCA at May 27, 2012 10:42:10 AM
EM Configuration Assistant, Version 11.2.0.3.0 Production
Copyright (c) 2003, 2011, Oracle.  All rights reserved.
 
Enter the following information:
Database unique name: RACDB
Service name: RACDB
Listener port number: 1521
Listener ORACLE_HOME [ /oracle/app/grid/11.2.0/grid ]:
Password for SYS user: 
Password for DBSNMP user: 
Password for SYSMAN user: 
Cluster name:

 
Problem resolution:

Oracle Clusterware’s “cemutlo” utility can be used to determine the name that was defined for your cluster during installation:

[root@racn01 ~]# /oracle/app/grid/11.2.0/grid/bin/cemutlo -n
playground
[root@racn01 ~]#
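
If you only have the Grid Infrastructure environment at hand, olsnodes should report the same cluster name via its -c switch (a quick alternative sketch using the same Grid home as above):

[grid@racn01 ~]$ /oracle/app/grid/11.2.0/grid/bin/olsnodes -c
playground
[grid@racn01 ~]$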

 

AOUG Experts Forum 13.10.2011: Storage Technologies for Oracle Database Systems and Best Practices

October 14th, 2011 Matthias Pölzinger

InITSo was invited to give a talk on Oracle ACFS / Oracle Cloud File System at the Austrian Oracle User Group’s Experts Forum on Storage Technologies for Oracle Database Systems – Best Practices (“AOUG Expertentreff: Storage Technologien Oracle Datenbanksystem – Best Practices”).

If you are interested in this topic, you can download an English or German version of the presentation via the following links:

 

Oracle ASM: $ORACLE_HOME/rdbms/audit keeps increasing in total size and number of files

Problem description:

The rdbms/audit directory of your Grid Infrastructure home keeps growing in both number of files and total size:

[grid@rac01 ~]$ du -hs $ORACLE_HOME/rdbms/audit
1151M	/u01/app/grid/11.2.0/grid/rdbms/audit
[grid@rac01 ~]$ 
[grid@rac01 ~]$ cd $ORACLE_HOME/rdbms/audit
[grid@rac01 audit]$ 
[grid@rac01 audit]$ ls -l | wc -l
1112896
[grid@rac01 audit]$

 
A continuously growing number of files and directory size can cause the file system to run out of free space and may also impact the performance of your ASM instance.

 
Cause:

An audit file is created for every connection as user SYS. In a Real Application Clusters environment with Grid Control in place, this can become a problem (although you might want to retain this information for a limited time for security compliance reasons).

Example of an ASM .aud-file:

[grid@rac01 audit]$ cat +asm1_ora_9981_2.aud
Audit file /u01/app/grid/11.2.0/grid/rdbms/audit/+asm1_ora_9981_2.aud
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
ORACLE_HOME = /u01/app/grid/11.2.0/grid
System name:	Linux
Node name:	rac01.initso.at
Release:	2.6.18-194.el5
Version:	#1 SMP Mon Mar 29 22:10:29 EDT 2010
Machine:	x86_64
Instance name: +ASM1
Redo thread mounted by this instance: 0 <none>
Oracle process number: 32
Unix process pid: 9981, image: oracle@rac01.initso.at (TNS V1-V3)
 
Tue Jul  5 12:12:06 2011 +02:00
LENGTH : '144'
ACTION :[7] 'CONNECT'
DATABASE USER:[1] '/'
PRIVILEGE :[6] 'SYSDBA'
CLIENT USER:[6] 'oracle'
CLIENT TERMINAL:[0] ''
STATUS:[1] '0'
DBID:[0] ''
 
[grid@rac01 audit]$
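
If you are unsure which directory your ASM instance writes these audit files to, you can check the audit_file_dest parameter from a SYSASM session (a quick sketch; if the parameter is not set, ASM typically falls back to $ORACLE_HOME/rdbms/audit, which matches the path above):

[grid@rac01 ~]$ sqlplus / as sysasm

SQL> show parameter audit_file_dest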

 
Workaround:

Old files can be deleted without any impact: each SYS connection creates a new audit file, and old aud-files should not be held open by any Oracle ASM process.

In order to clean up old files, you can use one or both of the following methods:

  • Manual cleanup

    You can clean up aud-files manually by running a find command similar to the following:

    [grid@rac01 audit]$ find /u01/app/grid/11.2.0/grid/rdbms/audit -maxdepth 1 -name '*.aud' -mtime +30 -delete -print
    /u01/app/grid/11.2.0/grid/rdbms/audit/+asm1_ora_24456_2.aud
    ...
    ...
    /u01/app/grid/11.2.0/grid/rdbms/audit/+asm1_ora_9006_1.aud
    [grid@rac01 audit]$

     

  • Automatic cleanup after 30 days

    Create a cronjob similar to the following by using “crontab -e” as the Grid Infrastructure user (e.g. grid):

    [grid@rac01 ~]$ crontab -l
    # Daily cleanup job for Oracle ASM aud-files not modified in the last 30 days
    30 0 * * * /usr/bin/find /u01/app/grid/11.2.0/grid/rdbms/audit -maxdepth 1 -name '*.aud' -mtime +30 -delete >/dev/null 2>&1
    [grid@rac01 ~]$
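
Before enabling the -delete switch in either variant, you can do a dry run of the same find expression to see how many files would be affected (a simple sketch):

[grid@rac01 audit]$ find /u01/app/grid/11.2.0/grid/rdbms/audit -maxdepth 1 -name '*.aud' -mtime +30 | wc -l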

Oracle ACFS: How to check if ACFS is supported with the kernel currently in use or how to gather other ACFS driver-related information

Description:

You want to check if the ACFS driver can be used with the kernel currently in use. In addition, you might want to gather other ACFS driver-related information, such as whether the driver is installed or loaded, or which version is in use.

 
Commands:

The acfsdriverstate command allows you to gather ACFS driver-related information (in particular, whether ACFS is supported with the kernel in use). The binary is part of Oracle Grid Infrastructure and can be executed as the Grid Infrastructure user (e.g. grid) or as root.

The following information can be gathered:

  • If ACFS can be used with the kernel currently in use:
    [grid@rac01 bin]$ $ORACLE_HOME/bin/acfsdriverstate -orahome $ORACLE_HOME supported
    ACFS-9200: Supported
    [grid@rac01 bin]$

     

  • If the ACFS driver is installed:
    [grid@rac01 bin]$ $ORACLE_HOME/bin/acfsdriverstate -orahome $ORACLE_HOME installed
    ACFS-9203: true
    [grid@rac01 bin]$

     

  • If the ACFS driver is loaded:
    [grid@rac01 bin]$ $ORACLE_HOME/bin/acfsdriverstate -orahome $ORACLE_HOME loaded
    ACFS-9204: false
    [grid@rac01 bin]$

     

  • Version of ACFS driver:
    [grid@rac01 bin]$ $ORACLE_HOME/bin/acfsdriverstate -orahome $ORACLE_HOME version
    ACFS-9325:     Driver OS kernel version = 2.6.18-8.el5(x86_64).
    ACFS-9326:     Driver Oracle version = 100804.1.
    [grid@rac01 bin]$
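
If you want to capture all four states in one go (for example while collecting diagnostics before patching), a small shell loop around the same calls works as well (a sketch, assuming ORACLE_HOME points to the Grid Infrastructure home):

[grid@rac01 ~]$ for state in supported installed loaded version; do
>   echo "### acfsdriverstate $state"
>   $ORACLE_HOME/bin/acfsdriverstate -orahome $ORACLE_HOME $state
> done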

     

Oracle Grid Infrastructure: roothas.pl fails with “Oracle Restart stack is not active on this node” / How to forcefully deconfigure all old Grid Infrastructure information

Problem description:

root.sh fails during an Oracle Grid Infrastructure 11.2 installation with the following message:

[root@ora01 ~]# /u01/app/grid/11.2.0/grid/root.sh
Running Oracle 11g root script...
 
The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/grid/11.2.0/grid
 
Enter the full pathname of the local bin directory: [/usr/local/bin]: 
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
 
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/grid/11.2.0/grid/crs/install/crsconfig_params
Improper Oracle Clusterware configuration found on this host
Deconfigure the existing cluster configuration before starting
to configure a new Clusterware 
run '/u01/app/grid/11.2.0/grid/crs/install/roothas.pl -deconfig' 
to configure existing failed configuration and then rerun root.sh
/u01/app/grid/11.2.0/grid/perl/bin/perl -I/u01/app/grid/11.2.0/grid/perl/lib -I/u01/app/grid/11.2.0/grid/crs/install /u01/app/grid/11.2.0/grid/crs/install/roothas.pl execution failed
[root@ora01 ~]#

 
When executing roothas.pl as suggested by root.sh, an “Oracle Restart stack is not active” error is raised. The output asks you to restart the SIHA stack (although you might not have any Oracle Clusterware stack configured at all):

[root@ora01 ~]# /u01/app/grid/11.2.0/grid/crs/install/roothas.pl -deconfig
Using configuration parameter file: /u01/app/grid/11.2.0/grid/crs/install/crsconfig_params
Oracle Restart stack is not active on this node
Restart the SIHA stack (use /u01/app/grid/11.2.0/grid/bin/crsctl start has) and retry
Failed to verify HA resources
[root@ora01 ~]#

 
Cause:

These messages can have various causes, such as a previously failed Grid Infrastructure installation or leftovers of an old Oracle Clusterware installation.

 
Problem resolution:

In order to forcefully clean up old configuration information, execute roothas.pl with the -force option:

[root@ora01 grid]# /u01/app/grid/11.2.0/grid/crs/install/roothas.pl -deconfig -force -verbose
Using configuration parameter file: /u01/app/grid/11.2.0/grid/crs/install/crsconfig_params
Failure in execution (rc=-1, 256, No such file or directory) for command 1 /u01/app/grid/11.2.0/grid/bin/crsctl stop resource ora.cssd -f
Failure in execution (rc=-1, 256, No such file or directory) for command 1 /u01/app/grid/11.2.0/grid/bin/crsctl delete resource ora.cssd -f
Failure in execution (rc=-1, 256, No such file or directory) for command 1 /u01/app/grid/11.2.0/grid/bin/crsctl stop has -f
Failure in execution (rc=-1, 256, No such file or directory) for command 1 /u01/app/grid/11.2.0/grid/bin/crsctl check has
You must kill ohasd processes or reboot the system to properly 
cleanup the processes started by Oracle clusterware
/u01/app/grid/11.2.0/grid/bin/acfsdriverstate: line 51: /lib/acfstoolsdriver.sh: No such file or directory
/u01/app/grid/11.2.0/grid/bin/acfsdriverstate: line 51: exec: /lib/acfstoolsdriver.sh: cannot execute: No such file or directory
Successfully deconfigured Oracle Restart stack
[root@ora01 grid]#

 
This cleaned up all stale configuration information and allowed a successful rerun of root.sh.
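
After the rerun you can verify that the stack came up cleanly with crsctl (use “crsctl check has” instead if the node runs Oracle Restart rather than a full cluster stack); on a healthy cluster node the output should look similar to this sketch:

[root@ora01 ~]# /u01/app/grid/11.2.0/grid/bin/crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[root@ora01 ~]#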

Oracle: “ORA-00245: control file backup operation failed” when trying to perform an actively load-balanced RMAN backup in a Real Application Cluster

Problem description:

You are running an Oracle Real Application Clusters database with version >= 11.2.0.1 and are trying to perform an actively load-balanced backup operation across more than one database instance. The backup of the control file fails with the following error message:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
 
RMAN-03009: failure of backup command on dev_0 channel at 06/24/2011 11:11:24
ORA-00245: control file backup operation failed

 
Complete output:

RMAN> run {
  allocate channel 'dev_0' type 'sbt_tape'
    parms 'SBT_LIBRARY=/opt/omni/lib/libob2oracle8_64bit.so,ENV=(OB2BARTYPE=Oracle8,OB2APPNAME=racdb,OB2BARLIST=VTL-ORA-racdb-rac-scan)'
    connect 'sys@racdb1';
  allocate channel 'dev_1' type 'sbt_tape'
    parms 'SBT_LIBRARY=/opt/omni/lib/libob2oracle8_64bit.so,ENV=(OB2BARTYPE=Oracle8,OB2APPNAME=racdb,OB2BARLIST=VTL-ORA-racdb-rac-scan)'
    connect 'sys@racdb2';
 
  send device type 'sbt_tape' 'OB2BARHOSTNAME=rac-scan.intern.initso.at';
  backup incremental level 0 format 'VTL-ORA-racdb-rac-scan<racdb_%s:%t:%p>.dbf' database;
  backup format 'VTL-ORA-racdb-rac-scan<racdb_%s:%t:%p>.dbf' archivelog all not backed up 2 times;
  backup format 'VTL-ORA-racdb-rac-scan<racdb_%s:%t:%p>.dbf' current controlfile;
 } 
 
using target database control file instead of recovery catalog
allocated channel: dev_0
channel dev_0: SID=51 instance=racdb1 device type=SBT_TAPE
channel dev_0: Data Protector A.06.11/PHSS_41802/PHSS_41803/DPSOL_00435/DPLNX_
 
allocated channel: dev_1
channel dev_1: SID=19 instance=racdb2 device type=SBT_TAPE
channel dev_1: Data Protector A.06.11/PHSS_41802/PHSS_41803/DPSOL_00435/DPLNX_
 
sent command to channel: dev_1
sent command to channel: dev_0
 
Starting backup at 24-JUN-11
channel dev_0: starting incremental level 0 datafile backup set
channel dev_0: specifying datafile(s) in backup set
input datafile file number=00002 name=+DATA01/racdb/datafile/sysaux.335.752504897
input datafile file number=00006 name=+DATA01/racdb/datafile/undotbs2.364.753481801
input datafile file number=00008 name=+DATA01/racdb/datafile/undotbs4.362.753482033
input datafile file number=00005 name=+DATA01/racdb/datafile/owbsys.402.753454499
input datafile file number=00010 name=+DATA01/racdb/datafile/undotbs6.360.753482663
channel dev_0: starting piece 1 at 24-JUN-11
channel dev_1: starting incremental level 0 datafile backup set
channel dev_1: specifying datafile(s) in backup set
input datafile file number=00004 name=+DATA01/racdb/datafile/users.332.752504905
input datafile file number=00001 name=+DATA01/racdb/datafile/system.336.752504891
input datafile file number=00003 name=+DATA01/racdb/datafile/undotbs1.334.752504899
input datafile file number=00007 name=+DATA01/racdb/datafile/undotbs3.363.753481931
input datafile file number=00009 name=+DATA01/racdb/datafile/undotbs5.361.753482293
channel dev_1: starting piece 1 at 24-JUN-11
channel dev_0: finished piece 1 at 24-JUN-11
piece handle=VTL-ORA-racdb-rac-scan<racdb_163:754485037:1>.dbf tag=TAG20110624T111037 comment=API Version 2.0,MMS Version 65.6.11.0
channel dev_0: backup set complete, elapsed time: 00:00:46
channel dev_0: starting incremental level 0 datafile backup set
channel dev_0: specifying datafile(s) in backup set
channel dev_1: finished piece 1 at 24-JUN-11
piece handle=VTL-ORA-racdb-rac-scan<racdb_164:754485038:1>.dbf tag=TAG20110624T111037 comment=API Version 2.0,MMS Version 65.6.11.0
channel dev_1: backup set complete, elapsed time: 00:00:45
channel dev_1: starting incremental level 0 datafile backup set
channel dev_1: specifying datafile(s) in backup set
including current SPFILE in backup set
channel dev_1: starting piece 1 at 24-JUN-11
RMAN-03009: failure of backup command on dev_0 channel at 06/24/2011 11:11:24
ORA-00245: control file backup operation failed
continuing other job steps, job failed will not be re-run
channel dev_1: finished piece 1 at 24-JUN-11
piece handle=VTL-ORA-racdb-rac-scan<racdb_166:754485083:1>.dbf tag=TAG20110624T111037 comment=API Version 2.0,MMS Version 65.6.11.0
channel dev_1: backup set complete, elapsed time: 00:00:04
released channel: dev_0
released channel: dev_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
 
RMAN-03009: failure of backup command on dev_0 channel at 06/24/2011 11:11:24
ORA-00245: control file backup operation failed
 
RMAN>

 
Cause:

Oracle changed the mechanism by which control file backups are performed. Up to 11.2, Oracle requested a control file enqueue before performing the actual backup. As of 11.2.0.1 and 11.2.0.2, Oracle performs the control file backup without requesting a control file enqueue, but instead requires that the snapshot control file is accessible to all instances during the backup. For non-RAC databases this does not change anything, and neither does it for RAC databases if you always perform your backup on the same RAC instance (without allocating any channel on another instance).

But if you distribute your backup load over several instances and your snapshot control file is located in a non-shared Oracle Home, you will end up with “ORA-00245: control file backup operation failed”.
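
You can check where the snapshot control file currently points with RMAN’s SHOW command; by default it lives under the non-shared $ORACLE_HOME/dbs. A sketch of what this looks like (the path shown is only an illustration of the default):

RMAN> SHOW SNAPSHOT CONTROLFILE NAME;

RMAN configuration parameters for database with db_unique_name RACDB are:
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/snapcf_racdb1.f'; # default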

 
Problem resolution:

You have to configure a snapshot control file destination that points to a shared location accessible by all Oracle RAC instances. In our case, an ASM Cluster File System was already available because it was used to store external application data.

Changing the snapshot controlfile destination:

RMAN> CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/acfs_app/snapcf/racdb/snapcf_racdb.f';
 
new RMAN configuration parameters:
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/acfs_app/snapcf/racdb/snapcf_racdb.f';
new RMAN configuration parameters are successfully stored
 
RMAN>
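
If the shared directory does not exist yet on the ACFS mount, create it (writable for the database OS user) before the next backup run, e.g. (a sketch using the path from the configuration above):

[oracle@rac01 ~]$ mkdir -p /u01/acfs_app/snapcf/racdb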

 
After changing the snapshot controlfile location to a directory on ACFS, the backup executed without any issues:

RMAN> run {
  allocate channel 'dev_0' type 'sbt_tape'
    parms 'SBT_LIBRARY=/opt/omni/lib/libob2oracle8_64bit.so,ENV=(OB2BARTYPE=Oracle8,OB2APPNAME=racdb,OB2BARLIST=VTL-ORA-racdb-rac-scan)'
    connect 'sys@racdb1';
  allocate channel 'dev_1' type 'sbt_tape'
    parms 'SBT_LIBRARY=/opt/omni/lib/libob2oracle8_64bit.so,ENV=(OB2BARTYPE=Oracle8,OB2APPNAME=racdb,OB2BARLIST=VTL-ORA-racdb-rac-scan)'
    connect 'sys@racdb2';
 
  send device type 'sbt_tape' 'OB2BARHOSTNAME=rac-scan.intern.initso.at';
  backup incremental level 0 format 'VTL-ORA-racdb-rac-scan<racdb_%s:%t:%p>.dbf' database;
  backup format 'VTL-ORA-racdb-rac-scan<racdb_%s:%t:%p>.dbf' archivelog all not backed up 2 times;
  backup format 'VTL-ORA-racdb-rac-scan<racdb_%s:%t:%p>.dbf' current controlfile;
 } 
using target database control file instead of recovery catalog
allocated channel: dev_0
channel dev_0: SID=19 instance=racdb1 device type=SBT_TAPE
channel dev_0: Data Protector A.06.11/PHSS_41802/PHSS_41803/DPSOL_00435/DPLNX_
 
allocated channel: dev_1
channel dev_1: SID=19 instance=racdb2 device type=SBT_TAPE
channel dev_1: Data Protector A.06.11/PHSS_41802/PHSS_41803/DPSOL_00435/DPLNX_
 
sent command to channel: dev_1
sent command to channel: dev_0
 
Starting backup at 24-JUN-11
channel dev_0: starting incremental level 0 datafile backup set
channel dev_0: specifying datafile(s) in backup set
input datafile file number=00002 name=+DATA01/racdb/datafile/sysaux.335.752504897
input datafile file number=00006 name=+DATA01/racdb/datafile/undotbs2.364.753481801
input datafile file number=00008 name=+DATA01/racdb/datafile/undotbs4.362.753482033
input datafile file number=00005 name=+DATA01/racdb/datafile/owbsys.402.753454499
input datafile file number=00010 name=+DATA01/racdb/datafile/undotbs6.360.753482663
channel dev_0: starting piece 1 at 24-JUN-11
channel dev_1: starting incremental level 0 datafile backup set
channel dev_1: specifying datafile(s) in backup set
input datafile file number=00004 name=+DATA01/racdb/datafile/users.332.752504905
input datafile file number=00001 name=+DATA01/racdb/datafile/system.336.752504891
input datafile file number=00003 name=+DATA01/racdb/datafile/undotbs1.334.752504899
input datafile file number=00007 name=+DATA01/racdb/datafile/undotbs3.363.753481931
input datafile file number=00009 name=+DATA01/racdb/datafile/undotbs5.361.753482293
channel dev_1: starting piece 1 at 24-JUN-11
channel dev_0: finished piece 1 at 24-JUN-11
piece handle=VTL-ORA-racdb-rac-scan<racdb_167:754485274:1>.dbf tag=TAG20110624T111433 comment=API Version 2.0,MMS Version 65.6.11.0
channel dev_0: backup set complete, elapsed time: 00:00:35
channel dev_0: starting incremental level 0 datafile backup set
channel dev_0: specifying datafile(s) in backup set
including current control file in backup set
channel dev_0: starting piece 1 at 24-JUN-11
channel dev_1: finished piece 1 at 24-JUN-11
piece handle=VTL-ORA-racdb-rac-scan<racdb_168:754485274:1>.dbf tag=TAG20110624T111433 comment=API Version 2.0,MMS Version 65.6.11.0
channel dev_1: backup set complete, elapsed time: 00:00:36
channel dev_1: starting incremental level 0 datafile backup set
channel dev_1: specifying datafile(s) in backup set
including current SPFILE in backup set
channel dev_1: starting piece 1 at 24-JUN-11
channel dev_0: finished piece 1 at 24-JUN-11
piece handle=VTL-ORA-racdb-rac-scan<racdb_169:754485309:1>.dbf tag=TAG20110624T111433 comment=API Version 2.0,MMS Version 65.6.11.0
channel dev_0: backup set complete, elapsed time: 00:00:07
channel dev_1: finished piece 1 at 24-JUN-11
piece handle=VTL-ORA-racdb-rac-scan<racdb_170:754485310:1>.dbf tag=TAG20110624T111433 comment=API Version 2.0,MMS Version 65.6.11.0
channel dev_1: backup set complete, elapsed time: 00:00:07
Finished backup at 24-JUN-11
 
Starting backup at 24-JUN-11
current log archived
skipping archived log file +FRA01/racdb/archivelog/2011_06_21/thread_2_seq_32.373.754437613; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_21/thread_2_seq_33.388.754437685; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_2_seq_34.1310.754445543; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_2_seq_35.1319.754445549; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_2_seq_36.1323.754445553; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_2_seq_37.1326.754445557; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_2_seq_38.953.754462269; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_2_seq_39.479.754483331; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_21/thread_3_seq_40.859.754427105; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_3_seq_41.1316.754445547; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_3_seq_42.1328.754445559; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_3_seq_43.1247.754462267; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_3_seq_44.972.754483331; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_21/thread_4_seq_30.885.754434005; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_4_seq_31.1315.754445545; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_4_seq_32.1332.754445559; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_4_seq_33.791.754462267; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_4_seq_34.276.754483333; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_21/thread_5_seq_29.1220.754434005; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_5_seq_30.1313.754445543; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_5_seq_31.1329.754445559; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_5_seq_32.723.754462267; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_5_seq_33.901.754483331; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_21/thread_6_seq_26.386.754437615; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_6_seq_27.1322.754445551; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_6_seq_28.453.754462267; already backed up 2 time(s)
skipping archived log file +FRA01/racdb/archivelog/2011_06_22/thread_6_seq_29.889.754483333; already backed up 2 time(s)
channel dev_0: starting archived log backup set
channel dev_0: specifying archived log(s) in backup set
input archived log thread=2 sequence=40 RECID=212 STAMP=754483597
input archived log thread=3 sequence=45 RECID=208 STAMP=754483595
input archived log thread=5 sequence=34 RECID=209 STAMP=754483595
input archived log thread=6 sequence=30 RECID=210 STAMP=754483596
input archived log thread=4 sequence=35 RECID=211 STAMP=754483596
channel dev_0: starting piece 1 at 24-JUN-11
channel dev_1: starting archived log backup set
channel dev_1: specifying archived log(s) in backup set
input archived log thread=3 sequence=46 RECID=216 STAMP=754485320
input archived log thread=5 sequence=35 RECID=217 STAMP=754485321
input archived log thread=6 sequence=31 RECID=213 STAMP=754485319
channel dev_1: starting piece 1 at 24-JUN-11
channel dev_0: finished piece 1 at 24-JUN-11
piece handle=VTL-ORA-racdb-rac-scan<racdb_171:754485324:1>.dbf tag=TAG20110624T111523 comment=API Version 2.0,MMS Version 65.6.11.0
channel dev_0: backup set complete, elapsed time: 00:00:07
channel dev_0: starting archived log backup set
channel dev_0: specifying archived log(s) in backup set
input archived log thread=4 sequence=36 RECID=214 STAMP=754485319
input archived log thread=2 sequence=41 RECID=215 STAMP=754485320
channel dev_0: starting piece 1 at 24-JUN-11
channel dev_1: finished piece 1 at 24-JUN-11
piece handle=VTL-ORA-racdb-rac-scan<racdb_172:754485324:1>.dbf tag=TAG20110624T111523 comment=API Version 2.0,MMS Version 65.6.11.0
channel dev_1: backup set complete, elapsed time: 00:00:07
channel dev_0: finished piece 1 at 24-JUN-11
piece handle=VTL-ORA-racdb-rac-scan<racdb_173:754485331:1>.dbf tag=TAG20110624T111523 comment=API Version 2.0,MMS Version 65.6.11.0
channel dev_0: backup set complete, elapsed time: 00:00:07
Finished backup at 24-JUN-11
 
 
Starting backup at 24-JUN-11
channel dev_0: starting full datafile backup set
channel dev_0: specifying datafile(s) in backup set
including current control file in backup set
channel dev_0: starting piece 1 at 24-JUN-11
channel dev_0: finished piece 1 at 24-JUN-11
piece handle=VTL-ORA-racdb-rac-scan<racdb_174:754485339:1>.dbf tag=TAG20110624T111539 comment=API Version 2.0,MMS Version 65.6.11.0
channel dev_0: backup set complete, elapsed time: 00:00:07
Finished backup at 24-JUN-11
 
released channel: dev_0
 
released channel: dev_1
 
RMAN>

Oracle: “ORA-12545: Connect failed because target host or object does not exist” when trying to connect through SCAN-Listeners

Problem description:

You are trying to connect to a database and are receiving the following error message:

[oracle@ls01 admin]$ sqlplus system@racdb
 
SQL*Plus: Release 10.2.0.5.0 - Production on Sat Jun 18 08:47:47 2011
 
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
 
Enter password: 
ERROR:
ORA-12545: Connect failed because target host or object does not exist
 
 
Enter user-name:

 
Cause:

Normally this error is caused by specifying a wrong hostname in the ADDRESS parameters or by specifying a hostname which cannot be resolved:

12545, 00000, "Connect failed because target host or object does not exist"
// *Cause: The address specified is not valid, or the program being 
// connected to does not exist.
// *Action: Ensure the ADDRESS parameters have been entered correctly; the
// most likely incorrect parameter is the node name.  Ensure that the 
// executable for the server exists (perhaps "oracle" is missing.)
// If the protocol is TCP/IP, edit the TNSNAMES.ORA file to change the
// host name to a numeric IP address and try again.

 
In this case, we have a tnsnames.ora entry that works perfectly fine on the database servers and even uses IP addresses:

RACDB =
  (DESCRIPTION =
    (ADDRESS_LIST =
     (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.200.101)(PORT = 1521))
     (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.200.102)(PORT = 1521))
     (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.200.103)(PORT = 1521))
    )
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVICE_NAME = app199.db.initso.at)
      (SERVER = DEDICATED)
      (FAILOVER_MODE=
        (TYPE=select)
        (METHOD=basic)
        (RETRIES=20)
        (DELAY=15)
      )
    )
  )

 
The tnsnames.ora entry above is used to connect to a remote 11g Release 2 RAC database via the newly introduced SCAN listeners (listing the IP addresses is necessary for clients < 11.2). The entry works fine for clients in the local area network, but if you try to connect from another location, you receive an ORA-12545 (even though no hostname is used in the entry).

This is caused by using the SCAN listeners from a remote site: a SCAN listener redirects the client to a “normal” VIP listener in order to spawn the connection, and this redirect is based on hostnames, not IP addresses. If the client receives the hostname of the VIP listener it should connect to but is unable to resolve it, you will also end up with ORA-12545.
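
You can verify this quickly from the affected client: if the VIP hostname handed back by the SCAN listener cannot be resolved there, you have found the culprit. A sketch using one of the VIP names from the hosts-file example below (a failing lookup typically ends in an NXDOMAIN-style error):

[oracle@ls01 admin]$ nslookup rac01-vip.initso.at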

 
Problem resolution:

Enter all IP addresses and hostnames relevant to your Oracle 11g Release 2 RAC into your DNS server or your hosts file.

Example for hosts-file:

192.168.200.201		rac01.initso.at rac01
192.168.200.202		rac02.initso.at rac02
192.168.200.203		rac03.initso.at rac03
 
192.168.200.211		rac01-vip.initso.at rac01-vip
192.168.200.212		rac02-vip.initso.at rac02-vip
192.168.200.213		rac03-vip.initso.at rac03-vip
 
192.168.200.101		rac-scan.initso.at
192.168.200.102		rac-scan.initso.at
192.168.200.103		rac-scan.initso.at

 
After configuring all appropriate hosts-file entries, you should be able to connect without any issue:

[oracle@ls01 admin]$ sqlplus username/password@RACDB
 
SQL*Plus: Release 10.2.0.5.0 - Production on Sat Jun 18 09:01:38 2011
 
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
 
 
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
 
SQL> select instance_number, instance_name from v$instance;
 
INSTANCE_NUMBER INSTANCE_NAME
--------------- ----------------
	      1 rac1
 
SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
[oracle@ls01 admin]$

Oracle: HOWTO delete a service which is not configured by Oracle Clusterware

Problem description:

You are running a Real Application Clusters database and Grid Control reports that one of your database services is down, although all database services managed by Oracle Clusterware are up and running.

Cause:

A service that is no longer in use may still have an entry in dba_services (one of the views Grid Control checks). This can happen, for example, if you change the database domain parameter after installation.

Problem resolution:

Check all database service entries in dba_services:

SQL> SELECT service_id, name, creation_date, enabled FROM dba_services ORDER BY 1;
 
SERVICE_ID NAME                                                             CREATION_DATE   ENA
---------- ---------------------------------------------------------------- --------------- ---
         1 SYS$BACKGROUND                                                   20-MAY-11       NO
         2 SYS$USERS                                                        20-MAY-11       NO
         3 O11GXDB                                                          20-MAY-11       NO
         4 O11G                                                             20-MAY-11       NO
         5 O11G.oracle.initso.at                                            23-MAY-11       NO
         6 O11GFAIL                                                         25-MAY-11       NO
 
6 ROWS selected.
 
SQL>

In my case, O11G was the service Grid Control reported as down: after changing the database domain to “oracle.initso.at”, this service was no longer used.

If a service is no longer used by Oracle Clusterware or database connections, you can remove it with the following command (if the service was configured using srvctl/Oracle Clusterware, please use srvctl to remove the service!):

SQL> EXEC dbms_service.delete_service('O11G');
 
PL/SQL PROCEDURE successfully completed.
 
SQL>

Grid Control will no longer report the service as down, because it is no longer known to the database:

SQL> SELECT service_id, name, creation_date, enabled FROM dba_services ORDER BY 1;
 
SERVICE_ID NAME                                                             CREATION_DATE   ENA
---------- ---------------------------------------------------------------- --------------- ---
         1 SYS$BACKGROUND                                                   20-MAY-11       NO
         2 SYS$USERS                                                        20-MAY-11       NO
         3 O11GXDB                                                          20-MAY-11       NO
         5 O11G.oracle.initso.at                                            23-MAY-11       NO
         6 O11GFAIL                                                         25-MAY-11       NO
 
5 ROWS selected.
 
SQL>
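
For comparison, a service that is registered with Oracle Clusterware would be stopped and removed with srvctl instead of dbms_service, roughly like this (database and service names are placeholders):

[oracle@rac01 ~]$ srvctl stop service -d O11G -s MY_SERVICE
[oracle@rac01 ~]$ srvctl remove service -d O11G -s MY_SERVICE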

Oracle RAC on Linux: PRVF-5449 and PRVF-5431 when executing addNode.sh

Problem description:

Executing addNode.sh in 11.2 results in PRVF-5449 and PRVF-5431 if Voting Disks are located on Oracle ASM Disks:

Checking Oracle Cluster Voting Disk configuration...
 
ERROR:
PRVF-5449 : Check of Voting Disk location "ORCL:GRID01(ORCL:GRID01)" failed on the following nodes:
Check failed on nodes:
        racn02
 
        racn02:No such file or directory
 
ERROR:
PRVF-5449 : Check of Voting Disk location "ORCL:GRID02(ORCL:GRID02)" failed on the following nodes:
Check failed on nodes:
        racn02
 
        racn02:No such file or directory
 
ERROR:
PRVF-5449 : Check of Voting Disk location "ORCL:GRID03(ORCL:GRID03)" failed on the following nodes:
Check failed on nodes:
        racn02
 
        racn02:No such file or directory
 
PRVF-5431 : Oracle Cluster Voting Disk configuration check failed
Time zone consistency check passed
 
 
 
[grid@racn01 bin]$

This happens even though the Oracle ASM disks are available on the node to be added:

[root@racn02 ~]# service oracleasm listdisks | grep GRID
GRID01
GRID02
GRID03
[root@racn02 ~]#

Cause:

addNode.sh checks the Oracle ASM disks incorrectly and cancels the node addition when the voting disks are located on ASM disks.

Problem resolution:

Manually check whether the Oracle ASM disks are available on the nodes to be added:

[root@racn02 ~]# service oracleasm listdisks | grep GRID
GRID01
GRID02
GRID03
[root@racn02 ~]#
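
In addition, you can cross-check the voting disk configuration itself from an existing cluster node; crsctl lists every configured voting disk together with the ASM disk it resides on (a quick sketch, run as the Grid Infrastructure user or root):

[grid@racn01 ~]$ crsctl query css votedisk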

If the voting disk location check was the only one that failed, set the environment variable IGNORE_PREADDNODE_CHECKS and rerun addNode.sh. Otherwise, resolve the remaining errors before continuing.

Example usage of IGNORE_PREADDNODE_CHECKS:

[grid@racn01 bin]$ export IGNORE_PREADDNODE_CHECKS=Y
[grid@racn01 bin]$ ./addNode.sh "CLUSTER_NEW_NODES={racn02}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={racn02-vip}"
...
...

Oracle Grid Infrastructure on AIX: “gipchaLowerProcessNode: no valid interfaces found”

Problem description:

While upgrading or installing Oracle Grid Infrastructure 11.2.0.2, root.sh/rootupgrade.sh fails on the second node, although it executed without any issues on the first node:

CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node aix01, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Timed out waiting for the CRS stack to start.
/u01/app/grid/11.2.0/grid/perl/bin/perl -I/u01/app/grid/11.2.0/grid/perl/lib -I/u01/app/grid/11.2.0/grid/crs/install /u01/app/grid/11.2.0/grid/crs/install/rootcrs.pl execution failed
root@aix02 /tmp #

The Clusterware log file $GRID_HOME/log/$HOSTNAME/alert$HOSTNAME.log contains an error message stating that CRSD has failed:

2011-03-13 16:08:57.430
[ohasd(5898406)]CRS-2765:Resource 'ora.crsd' has failed on server 'aix02'.

When analyzing the $GRID_HOME/log/$HOSTNAME/crsd/crsd.log file, you find “gipchaLowerProcessNode: no valid interfaces found” error messages:

2011-03-13 16:08:59.735: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2781285081 ms, node 112350770 { host 'aix01'
, haName '22b2-2ce9-c25b-03c6', srcLuid 5652a9fe-0e07d4a4, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [0 : 0], c
reateTime 2781285080, flags 0x4 }
2011-03-13 16:09:04.741: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2781290086 ms, node 112350770 { host 'aix01'
, haName '22b2-2ce9-c25b-03c6', srcLuid 5652a9fe-0e07d4a4, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [5 : 5], c
reateTime 2781285080, flags 0x4 }
2011-03-13 16:09:09.750: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2781295095 ms, node 112350770 { host 'aix01'
, haName '22b2-2ce9-c25b-03c6', srcLuid 5652a9fe-0e07d4a4, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [10 : 10],
 createTime 2781285080, flags 0x4 }
2011-03-13 16:09:14.751: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2781300097 ms, node 112350770 { host 'aix01'
, haName '22b2-2ce9-c25b-03c6', srcLuid 5652a9fe-0e07d4a4, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [15 : 15],
 createTime 2781285080, flags 0x4 }
2011-03-13 16:09:19.760: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2781305106 ms, node 112350770 { host 'aix01'
, haName '22b2-2ce9-c25b-03c6', srcLuid 5652a9fe-0e07d4a4, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [20 : 20],
 createTime 2781285080, flags 0x4 }
2011-03-13 16:09:24.770: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2781310115 ms, node 112350770 { host 'aix01'
, haName '22b2-2ce9-c25b-03c6', srcLuid 5652a9fe-0e07d4a4, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [25 : 25],
 createTime 2781285080, flags 0x4 }

Cause:

  • AIX:
    On AIX this problem is caused by an incorrect “udp_sendspace” setting. You can check your current value by executing the following command:

    no -o udp_sendspace

    Example:

    root@aix02 / # no -o udp_sendspace
    udp_sendspace = 65536
    root@aix02 / #

    Additionally, you should check the other network-related parameters which are required for Oracle Grid Infrastructure 11.2.0.2:

    AIX network parameter    Recommended/required value
    ipqmaxlen                512
    rfc1323                  1
    sb_max                   4194304
    tcp_recvspace            65536
    tcp_sendspace            65536
    udp_recvspace            655360
    udp_sendspace            65536

  • Other UNIX-systems:
    On other UNIX systems, you might want to check the netmask of your interconnect interfaces (255.255.255.0).
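
A quick way to review all of the AIX parameters listed above in one go is to filter the output of "no -a" (a simple sketch):

root@aix02 / # no -a | egrep 'ipqmaxlen|rfc1323|sb_max|tcp_recvspace|tcp_sendspace|udp_recvspace|udp_sendspace'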

Problem resolution on AIX:

  1. Check and set the correct values for the required network related parameters on all nodes:

    no -o ipqmaxlen=512
    no -o rfc1323=1
    no -o sb_max=4194304
    no -o tcp_recvspace=65536
    no -o tcp_sendspace=65536
    no -o udp_recvspace=655360
    no -o udp_sendspace=65536
  2. In order to make these changes permanent, check http://download.oracle.com/docs/cd/E11882_01/install.112/e17210/preaix.htm#CWAIX219

    Example if you are not running in compatibility mode:

    no -o -r ipqmaxlen=512
    no -o -p rfc1323=1
    no -o -p sb_max=4194304
    no -o -p tcp_recvspace=65536
    no -o -p tcp_sendspace=65536
    no -o -p udp_recvspace=655360
    no -o -p udp_sendspace=65536
  3. Stop CRS on all nodes and verify with ps that no Clusterware daemons are left running:

    crsctl stop crs -f
    ps -ef |grep d.bin
  4. Restart CRS on the first node:

    crsctl start crs
  5. On the node where root.sh/rootupgrade.sh failed, rerun the script:

    Example for reexecuting root.sh:

    root@aix02 /tmp # /u01/app/grid/11.2.0/grid/root.sh
    Running Oracle 11g root script...
     
    The following environment variables are set as:
        ORACLE_OWNER= grid
        ORACLE_HOME=  /u01/app/grid/11.2.0/grid
     
    Enter the full pathname of the local bin directory: [/usr/local/bin]:
    The contents of "dbhome" have not changed. No need to overwrite.
    The contents of "oraenv" have not changed. No need to overwrite.
    The contents of "coraenv" have not changed. No need to overwrite.
     
    Entries will be added to the /etc/oratab file as needed by
    Database Configuration Assistant when a database is created
    Finished running generic part of root script.
    Now product-specific root actions will be performed.
    Using configuration parameter file: /u01/app/grid/11.2.0/grid/crs/install/crsconfig_params
    Adding daemon to inittab
    Configure Oracle Grid Infrastructure for a Cluster ... succeeded
    root@aix02 /tmp #