In sweet memories of my ever loving brother "kutty thambi " ARUN KUMAR

Monday, January 4, 2010

Remove the Node from the Cluster in VMware

Now that the instance has been removed (along with the ASM instance, if applicable), we need to remove the node from the cluster. This is a manual procedure: some scripts are run on the node being deleted (if it is still available) to remove the CRS install, and others are run on any of the existing nodes (i.e. linux1).

Before proceeding to the steps for removing the node, we need to determine the node name and the CRS-assigned node number for each node stored in the Oracle Cluster Registry. The following command can be run from any of the existing nodes (linux1 for this example).

$ $ORA_CRS_HOME/bin/olsnodes -n
linux1  1
linux2  2
linux3  3
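
The node number is needed again later for rootdeletenode.sh, so it can help to pull it out of the olsnodes listing rather than read it by eye. The following is a minimal sketch, assuming the two-column "name number" layout shown above; node_number is a hypothetical helper name, not an Oracle utility.

```shell
# Print the CRS-assigned node number for the node name given as $1.
# Reads `olsnodes -n` style output ("name  number" per line) on stdin.
# Exits non-zero if the node name does not appear in the listing.
# node_number is a hypothetical helper, not part of the CRS tool set.
# Usage: $ORA_CRS_HOME/bin/olsnodes -n | node_number linux3
node_number() {
    awk -v node="$1" '
        $1 == node { print $2; found = 1 }
        END        { exit (found ? 0 : 1) }
    '
}
```

Reading from stdin keeps the helper independent of where olsnodes is installed, so it can also be tried against a saved copy of the listing.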

Now that we have the node name and node number, we can start the steps to remove the node from the cluster. Here are the steps that should be executed from a pre-existing (available) node in the cluster (i.e. linux1):

   1. Run the NETCA utility to remove the network configuration:

      $ DISPLAY=:0.0; export DISPLAY
      $ netca &

      Perform the following steps within the NETCA:

         1. Choose "Cluster Configuration" and click [Next].
         2. Only select the node you are removing and click [Next].
         3. Choose "Listener Configuration" and click [Next].
         4. Choose "Delete" and delete any listeners configured on the node you are removing. Acknowledge the dialog box to delete the listener configuration.

            NOTE: For some reason, I needed to log in to linux3 and manually kill the listener process.

   2. Run the crs_stat command to verify that all database resources are running on nodes that are going to be kept:

      $ $ORA_CRS_HOME/bin/crs_stat

      For example, verify that the node to be removed is not running any database resources. Look for a record of the form:

      NAME=ora.<db_name>.db
      TYPE=application
      TARGET=ONLINE
      STATE=ONLINE on <node_name>

      Assuming the name of the clustered database is orcl, this is the record that was returned from the crs_stat command on my system:

      NAME=ora.orcl.db
      TYPE=application
      TARGET=ONLINE
      STATE=ONLINE on linux1

      I am safe here since the resource is running on linux1 and not linux3 - the node I want to remove.

      If, however, the database resource was running on linux3, we would need to relocate it to a node that we are going to keep (i.e. linux1) using the following:

      $ $ORA_CRS_HOME/bin/crs_relocate ora.<db_name>.db
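
The check in step 2 can be sketched as a small filter over the crs_stat output. This is only a sketch, assuming the NAME=/STATE= record layout shown above with no trailing whitespace; resources_on_node is a hypothetical helper name, not part of the CRS tool set.

```shell
# List the names of resources whose STATE is "ONLINE on <node>".
# $1 is the node name; crs_stat output is read from stdin.
# resources_on_node is a hypothetical helper, not an Oracle utility.
# Usage: $ORA_CRS_HOME/bin/crs_stat | resources_on_node linux3
resources_on_node() {
    awk -F= -v node="$1" '
        $1 == "NAME"                                { name = $2 }
        $1 == "STATE" && $2 == ("ONLINE on " node)  { print name }
    '
}
```

If the filter prints a database resource for the node being removed, that resource is the one to relocate before continuing.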

   3. From a pre-existing node (i.e. linux1), as the root UNIX user account, remove the nodeapps from the node you are removing:

      $ su
      Password: xxxxx

      # srvctl stop nodeapps -n linux3
      CRS-0210: Could not find resource ora.linux3.LISTENER_LINUX3.lsnr.

      # srvctl remove nodeapps -n linux3
      Please confirm that you intend to remove the node-level applications on node linux3 (y/[n]) y
      #

   4. The next step is to update the node list using the updateNodeList option to the OUI as the oracle user. This procedure will remove the node to be deleted from the list of node locations maintained by the OUI by listing only those remaining nodes. The only file that I know of that gets modified is $ORACLE_BASE/oraInventory/ContentsXML/inventory.xml. Here is the command I used for removing linux3 from the list. Notice that the DISPLAY variable needs to be set even though the GUI does not run.

      $ DISPLAY=:0.0; export DISPLAY

      $ $ORACLE_HOME/oui/bin/runInstaller -ignoreSysPrereqs -updateNodeList \
      ORACLE_HOME=/u01/app/oracle/product/10.1.0/db_1 \
      CLUSTER_NODES=linux1,linux2

      Note that the command above will produce the following error which can safely be ignored:

      PRKC-1002 : All the submitted commands did not execute successfully
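
The CLUSTER_NODES value in step 4 is simply every node except the one being removed. As a rough sketch, it could be derived from an olsnodes -n listing instead of being typed by hand; remaining_nodes is a hypothetical helper name, and the two-column input layout is assumed from the listing shown earlier.

```shell
# Build a comma-separated list of every node except the one named in $1.
# Reads `olsnodes -n` style output ("name  number" per line) on stdin.
# remaining_nodes is a hypothetical helper, not an Oracle utility.
# Usage: $ORA_CRS_HOME/bin/olsnodes -n | remaining_nodes linux3
remaining_nodes() {
    awk -v drop="$1" '
        $1 != "" && $1 != drop { list = list sep $1; sep = "," }
        END                    { print list }
    '
}
```

The result would then be passed to runInstaller as CLUSTER_NODES=<list>. It is worth echoing the value and checking it before running the installer.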

   5. If the node to be removed is still available and running the CRS stack, the DBA will need to stop the CRS stack and remove the ocr.loc file. These tasks should be performed as the root user account and on the node that is to be removed from the cluster. The nosharedvar option assumes the ocr.loc file is not on a shared file system (which is the case in my example). If the file does exist on a shared file system, then specify sharedvar. From the node to be removed (i.e. linux3) and as the root user, run the following:

      $ su
      Password: xxxx

      # cd $ORA_CRS_HOME/install
      # ./rootdelete.sh remote nosharedvar
      Running Oracle10 root.sh script...
      \nThe following environment variables are set as:
          ORACLE_OWNER= oracle
          ORACLE_HOME=  /u01/app/oracle/product/10.1.0/crs
      Finished running generic part of root.sh script.
      Now product-specific root actions will be performed.
      Shutting down Oracle Cluster Ready Services (CRS):
      /etc/init.d/init.crsd: line 188: 29017 Aborted                 $ORA_CRS_HOME/bin/crsd -2

      Shutting down CRS daemon.
      Shutting down EVM daemon.
      Shutting down CSS daemon.
      Shutdown request successfully issued.
      Checking to see if Oracle CRS stack is down...
      Oracle CRS stack is not running.
      Oracle CRS stack is down now.
      Removing script for Oracle Cluster Ready services
      Removing OCR location file '/etc/oracle/ocr.loc'
      Cleaning up SCR settings in '/etc/oracle/scls_scr/linux3'

   6. Next, using the node name and CRS-assigned node number for the node to be deleted, run the rootdeletenode.sh command as follows. Keep in mind that this command should be run from a pre-existing / available node (i.e. linux1) in the cluster as the root UNIX user account:

      $ su
      Password: xxxx

      # cd $ORA_CRS_HOME/install
      # ./rootdeletenode.sh linux3,3
      Running Oracle10 root.sh script...
      \nThe following environment variables are set as:
          ORACLE_OWNER= oracle
          ORACLE_HOME=  /u01/app/oracle/product/10.1.0/crs
      Finished running generic part of root.sh script.
      Now product-specific root actions will be performed.
      clscfg: EXISTING configuration version 2 detected.
      clscfg: version 2 is 10G Release 1.
      Successfully deleted 13 values from OCR.
      Key SYSTEM.css.interfaces.nodelinux3 marked for deletion is not there. Ignoring.
      Successfully deleted 5 keys from OCR.
      Node deletion operation successful.
      'linux3,3' deleted successfully

      To verify that the node was successfully removed, use the following as either the oracle or root user:

      $ $ORA_CRS_HOME/bin/olsnodes -n
      linux1  1
      linux2  2
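
For a scripted run, the verification in step 6 can be turned into a pass/fail check. This is a minimal sketch under the same assumption about the olsnodes -n layout; node_removed is a hypothetical helper name.

```shell
# Succeed (exit 0) only if the node named in $1 no longer appears in
# `olsnodes -n` style output read from stdin; exit 1 if it is still listed.
# node_removed is a hypothetical helper, not an Oracle utility.
# Usage: $ORA_CRS_HOME/bin/olsnodes -n | node_removed linux3 && echo "linux3 is gone"
node_removed() {
    awk -v node="$1" '
        $1 == node { found = 1 }
        END        { exit (found ? 1 : 0) }
    '
}
```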

   7. Now, switch back to the oracle UNIX user account on the same pre-existing node (linux1) and run the runInstaller command to update the OUI node list again, this time for the CRS installation ($ORA_CRS_HOME). As before, this removes the node to be deleted from the list of node locations maintained by the OUI by listing only the remaining nodes. The only file that I know of that gets modified is $ORACLE_BASE/oraInventory/ContentsXML/inventory.xml. Here is the command I used for removing linux3 from the list. Notice that the DISPLAY variable needs to be set even though the GUI does not run.

      $ DISPLAY=:0.0; export DISPLAY

      $ $ORACLE_HOME/oui/bin/runInstaller -ignoreSysPrereqs -updateNodeList \
      ORACLE_HOME=/u01/app/oracle/product/10.1.0/crs \
      CLUSTER_NODES=linux1,linux2

      Note that each of the commands above will produce the following error which can safely be ignored:

      PRKC-1002 : All the submitted commands did not execute successfully

      The OUI now contains the valid nodes that are part of the cluster!

   8. Now that the node has been removed from the cluster, the DBA should manually remove all Oracle10g RAC installation files from the deleted node. Obviously, this applies only if the removed node is still accessible and only if the files are not on a shared file system that is still being accessed by other nodes in the cluster!

      From the deleted node (linux3) I performed the following tasks as the root UNIX user account:
         1. Remove ORACLE_HOME and ORA_CRS_HOME:

            # rm -rf /u01/app/oracle/product/10.1.0/db_1
            # rm -rf /u01/app/oracle/product/10.1.0/crs

         2. Remove all init scripts and soft links (for Linux). For a list of init scripts and soft links for other UNIX platforms, see Metalink Note: 269320.1

            # rm -f /etc/init.d/init.cssd
            # rm -f /etc/init.d/init.crs
            # rm -f /etc/init.d/init.crsd
            # rm -f /etc/init.d/init.evmd
            # rm -f /etc/rc2.d/K96init.crs
            # rm -f /etc/rc2.d/S96init.crs
            # rm -f /etc/rc3.d/K96init.crs
            # rm -f /etc/rc3.d/S96init.crs
            # rm -f /etc/rc5.d/K96init.crs
            # rm -f /etc/rc5.d/S96init.crs
            # rm -Rf /etc/oracle/scls_scr

         3. Remove all remaining files:

            # rm -rf /etc/oracle
            # rm -f /etc/oratab
            # rm -f /etc/oraInst.loc
            # rm -rf /etc/ORCLcluster
            # rm -rf /u01/app/oracle/oraInventory
            # rm -rf /u01/app/oracle/product
            # rm -rf /u01/app/oracle/admin
            # rm -f /usr/local/bin/coraenv
            # rm -f /usr/local/bin/dbhome
            # rm -f /usr/local/bin/oraenv

         4. Remove all CRS/EVM entries from the file /etc/inittab:

            h1:35:respawn:/etc/init.d/init.evmd run >/dev/null 2>&1

            h2:35:respawn:/etc/init.d/init.cssd fatal >/dev/null 2>&1

            h3:35:respawn:/etc/init.d/init.crsd run >/dev/null 2>&1
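
The /etc/inittab cleanup can also be done with a filter instead of a hand edit. The following is a sketch only, assuming the entries look like the three lines shown above (an h-numbered id, a respawn action, and one of the init.evmd, init.cssd or init.crsd scripts); strip_crs_inittab is a hypothetical helper name.

```shell
# Copy inittab content from stdin to stdout, dropping the CRS/EVM/CSS
# respawn entries and leaving every other line untouched.
# strip_crs_inittab is a hypothetical helper, not an Oracle utility.
# Usage: strip_crs_inittab < /etc/inittab > /tmp/inittab.new
strip_crs_inittab() {
    grep -Ev '^h[0-9]+:[^:]*:respawn:/etc/init\.d/init\.(evmd|cssd|crsd)( |$)'
}
```

Writing the result to a separate file, as in the usage comment, leaves the original in place so the two can be compared with diff before the new copy replaces /etc/inittab. On SysV init systems, init q then makes init re-read the file.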


regards,
rajeshkumar g
