Using srvctl to Manage your 10g RAC Database
Oracle recommends that RAC databases be managed with srvctl, an Oracle-supplied tool that was first introduced with 9i RAC. The 10g version of srvctl is slightly different from the 9i implementation. In this article, we will look at how -- and why -- to manage your 10g databases with srvctl.
RAC Architecture Overview
Let's begin with a brief overview of RAC architecture.
- A cluster is a set of 2 or more machines (nodes) that share or coordinate resources to perform the same task.
- A RAC database is 2 or more instances running on a set of clustered nodes, with all instances accessing a shared set of database files.
- Depending on the O/S platform, a RAC database may be deployed on a cluster that uses vendor clusterware plus Oracle's own clusterware (Cluster Ready Services), or on a cluster that solely uses Oracle's own clusterware.
Thus, every RAC sits on a cluster that is running Cluster Ready Services. srvctl is the primary tool DBAs use to configure CRS for their RAC database and processes.
Cluster Ready Services and the OCR
Cluster Ready Services, or CRS, is a new feature for 10g RAC. Essentially, it is Oracle's own clusterware. On most platforms, Oracle supports vendor clusterware; in these cases, CRS interoperates with the vendor clusterware, providing high availability support and service and workload management. On Linux and Windows clusters, CRS serves as the sole clusterware. In all cases, CRS provides a standard cluster interface that is consistent across all platforms.
CRS consists of four processes (crsd, ocssd, evmd, and evmlogger) and two disks: the Oracle Cluster Registry (OCR) and the voting disk.
CRS manages the following resources:
- The ASM instances on each node
- Databases
- The instances on each node
- Oracle Services on each node
- The cluster nodes themselves, including the following processes, or "nodeapps":
  - VIP
  - GSD
  - The listener
  - The ONS daemon
CRS stores information about these resources in the OCR. If the information in the OCR for one of these resources becomes damaged or inconsistent, then CRS is no longer able to manage that resource. Fortunately, CRS backs up the OCR automatically every four hours, and also keeps a daily and a weekly backup.
Interacting with CRS and the OCR: srvctl
srvctl is the tool Oracle recommends that DBAs use to interact with CRS and the cluster registry. Oracle does provide several tools to interface with the cluster registry and CRS more directly, at a lower level, but these tools are deliberately undocumented and intended only for use by Oracle Support. srvctl, in contrast, is well documented and easy to use. Using other tools to modify the OCR or manage CRS without the assistance of Oracle Support runs the risk of damaging the OCR.
Using srvctl
Even if you are experienced with 9i srvctl, it's worth taking a look at this section; 9i and 10g srvctl commands are slightly different.
srvctl must be run from the $ORACLE_HOME of the RAC you are administering. The basic format of a srvctl command is
srvctl <command> <target> [options]
where command is one of
enable|disable|start|stop|relocate|status|add|remove|modify|getenv|setenv|unsetenv|config
and the target, or object, can be a database, instance, service, ASM instance, or the nodeapps.
The srvctl commands are summarized in this table:
Command | Targets | Description |
---|---|---|
srvctl add, srvctl modify, srvctl remove | database, instance, service, nodeapps | srvctl add and srvctl remove add or remove the target's configuration information in the OCR. srvctl modify changes some of the target's configuration information in the OCR without wiping out the rest. |
srvctl relocate | service | Relocates a service from one named instance to another named instance. |
srvctl config | database, service, nodeapps, asm | Lists configuration information for the target from the OCR. |
srvctl disable, srvctl enable | database, instance, service, asm | srvctl disable disables the target, meaning CRS will not consider it for automatic startup, failover, or restart. This is useful to ensure that an object that is down for maintenance is not accidentally restarted automatically. srvctl enable re-enables the specified object. |
srvctl getenv, srvctl setenv, srvctl unsetenv | database, instance, service, nodeapps | srvctl getenv displays the environment variables stored in the OCR for the target. srvctl setenv sets these variables, and srvctl unsetenv unsets them. |
srvctl start, srvctl status, srvctl stop | database, instance, service, nodeapps, asm | Start, stop, or display the status (started or stopped) of the target. |
As you can see, srvctl is a powerful utility with a lot of syntax to remember. Fortunately, there are only really two commands to memorize: srvctl -help displays a basic usage message, and srvctl -h displays full usage information for every possible srvctl command.
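If you script around srvctl, it can help to reject a command/target pair before it ever reaches the cluster. The following is a minimal sketch that simply encodes the summary table above as a lookup. The function name srvctl_pair_ok is hypothetical and not part of srvctl, and the table in this article is a summary, so srvctl itself may accept combinations that this check rejects.

```shell
# Hypothetical helper (not part of srvctl): returns 0 if the command/target
# pair appears in the summary table in this article, 1 otherwise.
srvctl_pair_ok() {
    cmd=${1:-}
    target=${2:-}
    case "$cmd" in
        add|modify|remove|getenv|setenv|unsetenv)
            targets="database instance service nodeapps" ;;
        relocate)
            targets="service" ;;
        config)
            targets="database service nodeapps asm" ;;
        enable|disable)
            targets="database instance service asm" ;;
        start|stop|status)
            targets="database instance service nodeapps asm" ;;
        *)
            return 1 ;;   # command not listed in the table
    esac
    # Match the target as a whole word in the space-separated list.
    case " $targets " in
        *" $target "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: only call srvctl when the pair is listed.
#   srvctl_pair_ok relocate service && srvctl relocate service ...
```

A lookup like this only catches obvious mistakes; srvctl -h remains the authority on what your installed version accepts.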
Examples
Example 1. Bring up the MYSID1 instance of the MYSID database.
[oracle@myserver oracle]$ srvctl start instance -d MYSID -i MYSID1
Example 2. Stop the MYSID database: all its instances and all its services, on all nodes.
[oracle@myserver oracle]$ srvctl stop database -d MYSID
Example 3. Stop the nodeapps on the myserver node. NB: Instances and services also stop.
[oracle@myserver oracle]$ srvctl stop nodeapps -n myserver
Example 4a. Add the MYSID3 instance, which runs on the myserver node, to the MYSID clustered database.
[oracle@myserver oracle]$ srvctl add instance -d MYSID -i MYSID3 -n myserver
Example 4b. Add a new node, the mynewserver node, to a cluster.
[oracle@myserver oracle]$ srvctl add nodeapps -n mynewserver -o $ORACLE_HOME -A 149.181.201.1/255.255.255.0/eth1
(The -A flag precedes an address specification, given here as address/netmask/interface.)
Example 5. To change the VIP (virtual IP) on a RAC node (here, myserver), use the command
[oracle@myserver oracle]$ srvctl modify nodeapps -n myserver -A new_address
Example 6. Find out whether the nodeapps on mynewserver are up.
[oracle@myserver oracle]$ srvctl status nodeapps -n mynewserver
VIP is running on node: mynewserver
GSD is running on node: mynewserver
Listener is not running on node: mynewserver
ONS daemon is running on node: mynewserver
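When a status check like this runs from a monitoring script, you usually only care about the components that are down. The sketch below filters the output for them. It assumes the "is not running on node:" wording shown in the output above, which may differ between srvctl versions; the function name nodeapps_down is hypothetical.

```shell
# Hypothetical helper: reads `srvctl status nodeapps` output on stdin and
# prints the name of each component reported as not running.
# Assumes lines of the form "<component> is not running on node: <node>".
nodeapps_down() {
    sed -n 's/^\(.*\) is not running on node: .*$/\1/p'
}

# Usage:
#   srvctl status nodeapps -n mynewserver | nodeapps_down
```

For the output shown in Example 6, this prints the single line "Listener".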
Example 7. Disable the ASM instance on myserver for maintenance.
[oracle@myserver oracle]$ srvctl disable asm -n myserver
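Disabling a target is normally one step in a longer maintenance sequence: stop it, disable it so CRS will not restart it, do the work, then enable and start it again. The following is a minimal sketch of that sequence for a single database instance, using the placeholder names MYSID and MYSID1 from the earlier examples. The two function names are hypothetical, and the stop-then-disable ordering is an assumption about a typical workflow rather than a documented requirement.

```shell
# Sketch: take one instance out of service for maintenance.
# $1 = database name, $2 = instance name.
instance_maintenance_begin() {
    db=$1
    inst=$2
    # Stop the instance, then disable it so CRS does not restart it.
    srvctl stop instance -d "$db" -i "$inst" &&
    srvctl disable instance -d "$db" -i "$inst"
}

# Sketch: return the instance to service after maintenance.
instance_maintenance_end() {
    db=$1
    inst=$2
    # Re-enable the instance before starting it again.
    srvctl enable instance -d "$db" -i "$inst" &&
    srvctl start instance -d "$db" -i "$inst"
}

# Usage:
#   instance_maintenance_begin MYSID MYSID1
#   ... perform maintenance ...
#   instance_maintenance_end MYSID MYSID1
```

Chaining with && means the second command in each step runs only if the first one succeeds, so a failed stop does not leave a running instance disabled.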
Debugging srvctl
Debugging srvctl in 10g couldn't be easier. Simply set the SRVM_TRACE environment variable.
[oracle@myserver bin]$ export SRVM_TRACE=true
Let's repeat Example 6 with SRVM_TRACE set to true:
[oracle@myserver oracle]$ srvctl status nodeapps -n mynewserver
/u01/app/oracle/product/10.1.0/jdk/jre//bin/java -classpath
/u01/app/oracle/product/10.1.0/jlib/netcfg.jar:/u01/app/oracle/product/10.1.0/jdk/jre//lib/rt.jar:
/u01/app/oracle/product/10.1.0/jdk/jre//lib/i18n.jar:/u01/app/oracle/product/10.1.0/jlib/srvm.jar:
/u01/app/oracle/product/10.1.0/jlib/srvmhas.jar:/u01/app/oracle/product/10.1.0/jlib/srvmasm.jar:
/u01/app/oracle/product/10.1.0/srvm/jlib/srvctl.jar
-DTRACING.ENABLED=true -DTRACING.LEVEL=2 oracle.ops.opsctl.OPSCTLDriver status nodeapps -n
mynewserver
[main] [19:53:31:778] [OPSCTLDriver.setInternalDebugLevel:165] tracing is true at level 2 to file null
[main] [19:53:31:825] [OPSCTLDriver.:94] Security manager is set
[main] [19:53:31:843] [CommandLineParser.parse:157] parsing cmdline args
[main] [19:53:31:844] [CommandLineParser.parse2WordCommandOptions:900] parsing 2-word cmdline
[main] [19:53:31:866] [GetActiveNodes.create:212] Going into GetActiveNodes constructor...
[main] [19:53:31:875] [HASContext.getInstance:191] Module init : 16
[main] [19:53:31:875] [HASContext.getInstance:216] Local Module init : 19
...
[main] [19:53:32:285] [ONS.isRunning:186] Status of ora.ganges.ons on mynewserver is true
ONS daemon is running on node: mynewserver
[oracle@myserver oracle]$
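Because the trace is verbose, you may prefer to switch it on for one command only rather than exporting SRVM_TRACE for the whole session. The wrapper below is a minimal sketch of that idea; the name srvctl_traced and the log path in the usage comment are hypothetical.

```shell
# Sketch: run a single srvctl command with tracing enabled.
# The subshell confines the exported SRVM_TRACE to this one invocation,
# so the calling shell's environment is left unchanged.
srvctl_traced() {
    (
        SRVM_TRACE=true
        export SRVM_TRACE
        srvctl "$@"
    )
}

# Usage (log path is an example only):
#   srvctl_traced status nodeapps -n mynewserver > /tmp/srvctl_trace.log 2>&1
```

Redirecting both standard output and standard error, as in the usage comment, captures the trace regardless of which stream srvctl writes it to.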
Pitfalls
A little impatience when dealing with srvctl can corrupt your OCR, i.e., put it into a state where the information for a given object is inconsistent or partially missing. Specifically, the srvctl remove command provides the -f option, which forces removal of an object from the OCR. Use this option judiciously, as it can easily put the OCR into an inconsistent state.
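One way to make a forced removal a deliberate act is to route srvctl calls through a small guard. The following is a minimal sketch, not an Oracle-supplied tool: the wrapper name srvctl_guarded and the ALLOW_FORCE_REMOVE variable are both hypothetical.

```shell
# Hypothetical wrapper: refuses `srvctl remove ... -f` unless
# ALLOW_FORCE_REMOVE=yes is set; every other command passes straight through.
srvctl_guarded() {
    if [ "${1:-}" = "remove" ]; then
        for arg in "$@"; do
            if [ "$arg" = "-f" ] && [ "${ALLOW_FORCE_REMOVE:-no}" != "yes" ]; then
                echo "refusing forced remove; set ALLOW_FORCE_REMOVE=yes to proceed" >&2
                return 1
            fi
        done
    fi
    srvctl "$@"
}

# Usage:
#   srvctl_guarded remove instance -d MYSID -i MYSID3        # passes through
#   srvctl_guarded remove instance -d MYSID -i MYSID3 -f     # refused by default
```

The guard does not make -f any safer; it only adds a moment of friction before the command is run.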
Restoring the OCR from an inconsistent state is best done with the assistance of Oracle Support, who will guide you in using the undocumented $CRS_HOME/bin/crs_* tools to repair it. The OCR can also be restored from backup.
Error messages
srvctl errors are PRK% errors, which are not documented in the 10gR1 error messages manual. However, those with a Metalink account can find them documented on Metalink.
Conclusion
srvctl is a powerful tool that will allow you to administer your RAC easily and effectively. In addition, it provides a valuable buffer between the DBA and the OCR, making it more difficult to corrupt the OCR.