Archive for the ‘Oracle on AIX’ Category

Oracle Grid Infrastructure on AIX: “gipchaLowerProcessNode: no valid interfaces found”

Problem description:

While upgrading to or installing Oracle Grid Infrastructure 11.2.0.2, root.sh/rootupgrade.sh fails on the second node, although it ran without any issues on the first node:

CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node aix01, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Timed out waiting for the CRS stack to start.
/u01/app/grid/11.2.0/grid/perl/bin/perl -I/u01/app/grid/11.2.0/grid/perl/lib -I/u01/app/grid/11.2.0/grid/crs/install /u01/app/grid/11.2.0/grid/crs/install/rootcrs.pl execution failed
root@aix02 /tmp #

The Clusterware alert log $GRID_HOME/log/$HOSTNAME/alert$HOSTNAME.log contains an error message stating that CRSD has failed:

2011-03-13 16:08:57.430
[ohasd(5898406)]CRS-2765:Resource 'ora.crsd' has failed on server 'aix02'.

The $GRID_HOME/log/$HOSTNAME/crsd/crsd.log file contains repeated “gipchaLowerProcessNode: no valid interfaces found” error messages:

2011-03-13 16:08:59.735: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2781285081 ms, node 112350770 { host 'aix01', haName '22b2-2ce9-c25b-03c6', srcLuid 5652a9fe-0e07d4a4, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [0 : 0], createTime 2781285080, flags 0x4 }
2011-03-13 16:09:04.741: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2781290086 ms, node 112350770 { host 'aix01', haName '22b2-2ce9-c25b-03c6', srcLuid 5652a9fe-0e07d4a4, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [5 : 5], createTime 2781285080, flags 0x4 }
2011-03-13 16:09:09.750: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2781295095 ms, node 112350770 { host 'aix01', haName '22b2-2ce9-c25b-03c6', srcLuid 5652a9fe-0e07d4a4, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [10 : 10], createTime 2781285080, flags 0x4 }
2011-03-13 16:09:14.751: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2781300097 ms, node 112350770 { host 'aix01', haName '22b2-2ce9-c25b-03c6', srcLuid 5652a9fe-0e07d4a4, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [15 : 15], createTime 2781285080, flags 0x4 }
2011-03-13 16:09:19.760: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2781305106 ms, node 112350770 { host 'aix01', haName '22b2-2ce9-c25b-03c6', srcLuid 5652a9fe-0e07d4a4, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [20 : 20], createTime 2781285080, flags 0x4 }
2011-03-13 16:09:24.770: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2781310115 ms, node 112350770 { host 'aix01', haName '22b2-2ce9-c25b-03c6', srcLuid 5652a9fe-0e07d4a4, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [25 : 25], createTime 2781285080, flags 0x4 }

Cause:

  • AIX:
    On AIX this problem is caused by an incorrect “udp_sendspace” setting. You can check your current value by executing the following command:

    no -o udp_sendspace

    Example:

    root@aix02 / # no -o udp_sendspace
    udp_sendspace = 65536
    root@aix02 / #

    Additionally you should check the other network related parameters which are required for Oracle Grid Infrastructure 11.2.0.2:

    AIX network parameter    recommended/required value
    ipqmaxlen                512
    rfc1323                  1
    sb_max                   4194304
    tcp_recvspace            65536
    tcp_sendspace            65536
    udp_recvspace            655360
    udp_sendspace            65536

  • Other UNIX systems:
    On other UNIX systems you might want to check the netmask of the interconnect interfaces (255.255.255.0).
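
The AIX parameter check above can be scripted. The following is a minimal sketch, assuming that `no -o NAME` prints `NAME = VALUE` as in the example output; the function names check_tunable, current_value and check_all are hypothetical helpers, not part of AIX or Oracle. The expected values are copied from the table above.

```shell
#!/bin/sh
# Recommended values, copied from the table above
# (Grid Infrastructure 11.2.0.2 on AIX).
RECOMMENDED="ipqmaxlen=512
rfc1323=1
sb_max=4194304
tcp_recvspace=65536
tcp_sendspace=65536
udp_recvspace=655360
udp_sendspace=65536"

# check_tunable NAME EXPECTED ACTUAL
# Prints one report line; returns 1 if the values differ.
check_tunable() {
    if [ "$3" = "$2" ]; then
        echo "OK       $1 = $3"
    else
        echo "MISMATCH $1 = $3 (recommended: $2)"
        return 1
    fi
}

# current_value NAME
# Reads the live value; assumes the "NAME = VALUE" output format of `no -o`.
current_value() {
    no -o "$1" 2>/dev/null | awk '{ print $3 }'
}

# check_all
# Walks the list and reports every parameter; returns 1 on any mismatch.
check_all() {
    rc=0
    for pair in $RECOMMENDED; do
        name=${pair%%=*}
        want=${pair#*=}
        check_tunable "$name" "$want" "$(current_value "$name")" || rc=1
    done
    return $rc
}
```

Running `check_all` as root on every node before root.sh should show seven OK lines; any MISMATCH line names the parameter to correct.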

Problem resolution on AIX:

  1. Check and set the correct values for the required network related parameters on all nodes:

    no -o ipqmaxlen=512
    no -o rfc1323=1
    no -o sb_max=4194304
    no -o tcp_recvspace=65536
    no -o tcp_sendspace=65536
    no -o udp_recvspace=655360
    no -o udp_sendspace=65536
  2. To make these changes permanent, check out http://download.oracle.com/docs/cd/E11882_01/install.112/e17210/preaix.htm#CWAIX219

    Example if you are not running in compatibility mode:

    no -r -o ipqmaxlen=512
    no -p -o rfc1323=1
    no -p -o sb_max=4194304
    no -p -o tcp_recvspace=65536
    no -p -o tcp_sendspace=65536
    no -p -o udp_recvspace=655360
    no -p -o udp_sendspace=65536
  3. Stop CRS on all nodes:

    crsctl stop crs -f
    ps -ef |grep d.bin
  4. Restart CRS on the first node:

    crsctl start crs
  5. On the node where root.sh/rootupgrade.sh failed, rerun the script.

    Example for re-executing root.sh:

    root@aix02 /tmp # /u01/app/grid/11.2.0/grid/root.sh
    Running Oracle 11g root script...
     
    The following environment variables are set as:
        ORACLE_OWNER= grid
        ORACLE_HOME=  /u01/app/grid/11.2.0/grid
     
    Enter the full pathname of the local bin directory: [/usr/local/bin]:
    The contents of "dbhome" have not changed. No need to overwrite.
    The contents of "oraenv" have not changed. No need to overwrite.
    The contents of "coraenv" have not changed. No need to overwrite.
     
    Entries will be added to the /etc/oratab file as needed by
    Database Configuration Assistant when a database is created
    Finished running generic part of root script.
    Now product-specific root actions will be performed.
    Using configuration parameter file: /u01/app/grid/11.2.0/grid/crs/install/crsconfig_params
    Adding daemon to inittab
    Configure Oracle Grid Infrastructure for a Cluster ... succeeded
    root@aix02 /tmp #
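
Step 3 lists leftover Clusterware daemons with `ps -ef |grep d.bin` after stopping CRS. A minimal sketch of turning that manual look into a check is shown below; count_crs_daemons and wait_for_crs_down are hypothetical helper names, and the retry count and 5-second interval are arbitrary assumptions.

```shell
#!/bin/sh
# count_crs_daemons
# Reads `ps -ef` output on stdin and counts processes whose command
# contains "d.bin" (for example ocssd.bin, crsd.bin, ohasd.bin),
# ignoring the grep process itself.
count_crs_daemons() {
    grep 'd\.bin' | grep -v grep | wc -l | tr -d ' '
}

# wait_for_crs_down [TRIES]
# Polls every 5 seconds until no daemon is left; returns 1 if some
# are still running after TRIES attempts (default 12, an assumption).
wait_for_crs_down() {
    tries=${1:-12}
    while [ "$tries" -gt 0 ]; do
        [ "$(ps -ef | count_crs_daemons)" -eq 0 ] && return 0
        sleep 5
        tries=$((tries - 1))
    done
    return 1
}
```

For example, `crsctl stop crs -f; wait_for_crs_down || echo "daemons still running"` on each node before continuing with step 4.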

Oracle Grid Infrastructure on AIX: “0403-006 Execute permission denied.” during installation of binaries.

Problem description:

While installing Grid Infrastructure on AIX, you receive an error when Oracle Universal Installer starts to link the binaries. The OUI log file contains the error message “0403-006 Execute permission denied.”:

INFO: Exception thrown from action: make
Exception Name: MakefileException
Exception String: Error in invoking target 'sdo_on no_opts asm_on rac_on' of makefile '/u01/app/grid/11.2.0/grid/rdbms/lib/ins_rdbms.mk'. See '/u01/app/oraInventory/logs/installActions2011-03-12_11-32-45PM.log' for details.
Exception Severity: 1
INFO: Linking sdo Options
INFO: Linking sdo Options
INFO: The output of this make operation is also available at: '/u01/app/grid/11.2.0/grid/install/make.log'
INFO:
 
INFO: Start output from spawned process:
INFO: ----------------------------------
INFO:
 
INFO:   /bin/ar -X64 cr /u01/app/grid/11.2.0/grid/rdbms/lib/libknlopt.a /u01/app/grid/11.2.0/grid/rdbms/lib/kxmwsd.o
 
INFO: /bin/sh: /u01/app/grid/11.2.0/grid/bin/echodo: 0403-006 Execute permission denied.
 
INFO: make: 1254-004 The error code from the last command is 126.

Cause:
In this case the error message is misleading: the problem was not caused by a permission issue. Another process had been using space on the /tmp file system until it was completely full, so Oracle Universal Installer could not allocate any further temporary space there and raised the error message above. Even if you set the environment variables TMPDIR and TEMPDIR, OUI still uses /tmp for some tasks, and 1GB of free space is not enough.

Problem resolution:
Increase the free space on /tmp (2GB of total space should be fine) and press Retry in the Oracle Universal Installer, or clean up and restart your installation if you have already cancelled it.

root@aix01 / # df -g /tmp
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/hd3           2.00      1.99    1%      151     1% /tmp
root@aix01 / #
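
The free-space check can also be done before starting the installer. This is a minimal sketch using the POSIX `df -kP` output format instead of the AIX-specific `df -g`; free_kb and has_free_space are hypothetical helper names, and the minimum to require is left as a parameter because the only figures given here are that 1GB free was not enough and 2GB total was fine.

```shell
#!/bin/sh
# free_kb DIR
# Prints the free space, in KB, of the file system that holds DIR.
# With `df -kP`, line 2 is the data row and column 4 is "Available".
free_kb() {
    df -kP "$1" | awk 'NR == 2 { print $4 }'
}

# has_free_space DIR MIN_KB
# Succeeds if at least MIN_KB kilobytes are free on DIR's file system.
has_free_space() {
    [ "$(free_kb "$1")" -ge "$2" ]
}
```

For example, `has_free_space /tmp 1572864 || echo "/tmp is too full for OUI"` would warn below 1.5GB free; that threshold is an assumption chosen between the two figures above, not an Oracle requirement.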