caa(4)

NAME

caa - Cluster Application Availability (CAA) information

SYNOPSIS

Application resource profile:

    NAME=resource_name
    TYPE=application
    [ACTION_SCRIPT=action_script]
    [ACTIVE_PLACEMENT={0|1}]
    [AUTO_START={0|1}]
    [CHECK_INTERVAL=check_interval]
    [DESCRIPTION=description]
    [FAILOVER_DELAY=failover_delay]
    [FAILURE_INTERVAL=failure_interval FAILURE_THRESHOLD=failure_threshold]
    [HOSTING_MEMBERS=member_list]
    [OPTIONAL_RESOURCES=resource_list]
    [PLACEMENT=placement_policy]
    [REQUIRED_RESOURCES=resource_list]
    [RESTART_ATTEMPTS=restart_attempts]
    [SCRIPT_TIMEOUT=script_timeout]

Network resource profile:

    NAME=resource_name
    TYPE=network
    [DESCRIPTION=description]
    [FAILURE_INTERVAL=failure_interval FAILURE_THRESHOLD=failure_threshold]
    SUBNET=subnet_addr

Tape resource profile:

    NAME=resource_name
    TYPE=tape
    [DESCRIPTION=description]
    DEVICE_NAME=device_name
    [FAILURE_INTERVAL=failure_interval FAILURE_THRESHOLD=failure_threshold]

Media Changer resource profile:

    TYPE=changer
    NAME=resource_name
    [DESCRIPTION=description]
    DEVICE_NAME=device_name
    [FAILURE_INTERVAL=failure_interval FAILURE_THRESHOLD=failure_threshold]

OPERANDS

TYPE={application|network|tape|changer}
    Resource type. Specify application, network, tape, or changer.

NAME=resource_name
    Resource name. Specify resource_name as a string containing any combination of the characters a-z, A-Z, 0-9, period (.), and underscore (_). The resource name may not start with a period (.).

[DESCRIPTION=description]
    Resource description string.

[CHECK_INTERVAL=check_interval]
    Interval (in seconds) at which the check entry point of the application's action script runs. The check interval is the maximum amount of time an application can be unavailable to clients before CAA attempts to restart it. A check interval of 0 (zero) means that CAA never checks the resource.

[FAILURE_THRESHOLD=failure_threshold]
    Number of times CAA may detect a resource failure within the failure interval before it marks the resource unavailable and stops monitoring it. The value must be in the range 0-20. Setting the value to 0 (zero) turns off failure threshold monitoring. If you do not specify a failure threshold, CAA uses a default failure threshold of 0 (zero).

[FAILURE_INTERVAL=failure_interval]
    Time (in seconds) during which the failure threshold is tallied and applied. If you do not specify a failure interval, CAA uses a default failure interval of 0 (zero), which turns off failure threshold monitoring. Specifying a nonzero failure interval is meaningless unless the failure threshold is also nonzero.

[REQUIRED_RESOURCES=resource_list]
    Ordered list of resources, separated by white space, on which the application depends. These resources must be active on any member on which the application is running, or must be application resources that may be started on the cluster member. If you do not specify a required resources list, CAA imposes no required dependencies upon the application resource.

    CAA uses the required resources list, in conjunction with the placement policy and hosting members list, to determine which members are eligible to host the application resource.
    It also uses the required resources list to start required application resources when the caa_start command is run with the -f option.

    A failure of a required resource on the hosting member causes CAA to initiate relocation of the application. CAA either fails the application resource over to another member on which the required resource is available or can be started, or stops the application if no member provides the resource. In the latter case, CAA continues to monitor the required resources and restarts the application when the resource is again available in the cluster.

[OPTIONAL_RESOURCES=resource_list]
    Ordered list of optional resources, separated by white space. CAA uses the optional resources list, in conjunction with the required resources list, placement policy, and hosting members list, to determine the optimal member to host the application resource when more than one member is eligible to host the resource.

    Optional resources must be in the ONLINE state on a cluster member to affect resource placement. The cluster member with the most optional resources is used to run the application. If the hosting members list is not empty, the cluster member in the list with the most optional resources is used. If several cluster members have the same number of optional resources, the member running the resource that appears earliest in the list is used to run the application.

    The number of optional resources per resource is limited to 58. A failure of an optional resource on the hosting member does not initiate application relocation.

[PLACEMENT=placement_policy]
    Policy according to which CAA selects the member on which to start or restart the application resource. CAA uses the placement policy in conjunction with the resource's required resources list.
    You can specify any one of the following as a placement policy:

    balanced
        CAA favors starting or restarting the application resource on a member based on the optional resources listed; see OPTIONAL_RESOURCES for more information. If no optional resources are listed, the member currently running the fewest application resources is chosen, so application resources that use the balanced policy are distributed evenly among all active members.

    favored
        CAA refers to the hosting members list before starting or restarting the application resource. First, a member on the hosting members list is chosen based on optional resources; see OPTIONAL_RESOURCES for more information. If a member cannot be chosen based on optional resources, the first member on the list is the most favored to run the application resource. If that member is unavailable, the second member on the list is the most favored, and so on. If all members on the hosting members list are unavailable, CAA favors placing the application resource on the member currently running the fewest application resources. You must specify a hosting members list when you select a favored placement policy.

    restricted
        Similar to the favored placement policy, except that if all members on the hosting members list are unavailable, CAA does not start or restart the application resource. A restricted placement policy ensures that the resource never runs on a member that is not on the list, even if you attempt to explicitly relocate it to that member. You must specify a hosting members list when you select a restricted placement policy.

    If you do not specify a placement policy, CAA uses a balanced placement policy for the application resource by default.

[HOSTING_MEMBERS=member_list]
    Hosting members list. Specify an ordered list of members, separated by white space, that can host the application resource. If you specify a placement policy of favored or restricted, you must also specify a hosting members list.
    CAA uses the hosting members list in conjunction with the application resource's placement policy. After optional resources are considered, applications are placed on hosts in the order in which the hosts are listed in the hosting members list.

[ACTIVE_PLACEMENT={0|1}]
    When set to 1, CAA reevaluates the placement of an application resource when a cluster member joins the cluster.

[RESTART_ATTEMPTS=restart_attempts]
    Number of times CAA attempts to restart the resource on the current member before attempting to relocate it elsewhere. The default number of restart attempts is 1.

[FAILOVER_DELAY=failover_delay]
    Number of seconds CAA waits before attempting to relocate the application resource due to a host failure. If the original cluster member becomes available to run the application resource within the FAILOVER_DELAY time, the application restarts on that member. The default failover delay is 0 (zero) seconds.

[AUTO_START={0|1}]
    When set to 1, CAA starts the application resource automatically after a cluster reboot, regardless of whether it had been stopped or running before the reboot. When set to 0, CAA starts the application resource automatically only if it had been running before the reboot. The default is 0.

[ACTION_SCRIPT=action_script]
    User-written action script for the application resource. The format of CAA action scripts is similar to that of system init files located in the /sbin/init.d directory. The script file performs user-defined tasks and can invoke other scripts and executable programs. An action script has the following entry points:

    start
        Called by CAA to start or restart the application resource. The start entry point executes all commands necessary to start the application and must return 0 (zero) for success and a nonzero value for failure.

    stop
        Called by CAA to stop a running application resource. It is not called when stopping an UNKNOWN application resource (see caa_stop(8) for details).
        The stop entry point executes all commands necessary to stop the application and must return 0 (zero) for success and a nonzero value for failure. The stop entry point should treat an attempt to stop an application that is not running as a success and return 0 (zero).

    check
        Called by CAA periodically (according to the resource's check interval) to determine the health of the application resource. The check entry point executes all commands necessary to determine whether the application is still running and must return 0 (zero) for success and a nonzero value for failure. If the check entry point returns failure, CAA initiates relocation for the application resource.

    You can specify either a full pathname for the script file, or its filename (in which case CAA looks for the file in the /var/cluster/caa/script directory). If you do not specify an action script, CAA looks for an action script named /var/cluster/caa/script/resource_name.scr.

[SCRIPT_TIMEOUT=script_timeout]
    Maximum time (in seconds) that an action script may take to execute. An error message is returned if the script does not finish executing within the specified time. The timeout applies to all action script entry points (start, stop, and check). If this value is not specified, CAA assumes a default value of 60 seconds.

SUBNET=subnet_addr
    Subnet address of a network resource. Specify the subnet address in xxx.xxx.xxx.xxx format (for example, 16.140.112.0). The subnet address is the bitwise AND of the IP address and the netmask. For example, an IP address of 16.69.225.12 with a netmask of 255.255.255.0 gives a subnet of 16.69.225.0.

DEVICE_NAME=device_name
    Device name of a tape or media changer device. Specify either the full path of the device (for example, /dev/tape/tape1) or just the device name.
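
The bitwise AND described under SUBNET can be worked out with a few lines of shell. The following is only a sketch: the function name subnet_of is hypothetical and is not part of CAA, and the code assumes a shell with POSIX arithmetic expansion (for example, ksh).

```shell
#!/bin/sh
# subnet_of IP NETMASK
# Prints the bitwise AND of two dotted-quad addresses.
# ASSUMPTION: subnet_of is an illustrative helper, not a CAA command.
# The body runs in a subshell so that changing IFS does not leak out.
subnet_of() (
    IFS=.
    # Split both addresses on "." into eight positional parameters:
    # $1-$4 are the IP octets, $5-$8 are the netmask octets.
    set -- $1 $2
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

# The example from the SUBNET description:
subnet_of 16.69.225.12 255.255.255.0    # prints 16.69.225.0
```

The printed value is what you would place in the SUBNET keyword of a network resource profile.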

DESCRIPTION

CAA tracks the state of the members in a cluster and of the resources in a cluster (such as networks and applications). CAA monitors the requirements of application resources in a cluster and ensures that applications run on members that meet their requirements. If the cluster member on which an application is running fails, or if a particular resource that another resource requires fails, CAA relocates the application to another member that has the required resources available.

CAA can start a group of application resources with one call of the caa_start command. CAA starts all required application resources in the order they are listed in the resource profile if they are available to be started.

CAA allows you to enhance overall application performance by balancing application execution among a set of available cluster members.

CAA manages application, network, tape, and changer resources.

You must have root privileges to use most CAA commands. Only the caa_stat command does not require root privileges.

CAA consists of components that work together to make application resources highly available and monitor other resources:

·  A resource manager consisting of the run-time CAA daemons (caad) on all cluster members. The resource manager starts, stops, relocates, and restarts application resources when failure conditions occur.

·  Resource monitors that are used to check on the state of a particular type of resource. Resource monitors are located in the directory /var/cluster/caa/monitors.

·  A user interface that allows you to manage application and network resources in a cluster. The commands available with the command-line interface are listed in the SEE ALSO section of this reference page. The SysMan menu provides a graphical user interface (GUI) for performing system management tasks for the cluster, cluster members, and CAA applications. For more information on using SysMan, see sysman(8) and the online help available for the sysman application.
·  Resources that are managed and monitored by CAA. A resource is defined by its resource profile. A resource profile defines to CAA how an application or network resource should run in a cluster. The caa_profile command creates new resource profiles, either with default values or fully customized according to command-line values. It can also validate, update, or delete profiles.

·  Action scripts associated with resources that are used by CAA to start and stop the application resources.

A resource profile is an ASCII text file that assigns values to attributes that define how a resource should be managed or monitored in a cluster. The attributes described in the SYNOPSIS and OPERANDS sections of this reference page make up a profile. The line for any profile attribute (including continuation lines) cannot exceed 2047 bytes in length.

Create a resource profile by using the caa_profile(8) command, the Cluster Application Availability (CAA) Management branch under TruCluster Specific on the SysMan Menu, or a text editor. The type of resource (application, network, tape, or changer) determines which keywords and operands you can specify in its profile.

Profiles are written to the /var/cluster/caa/profile directory by caa_profile. CAA expects all resource profiles to be in the /var/cluster/caa/profile directory.

When you create a resource profile with a text editor, you can omit optional operands and keywords. There are default values for most keywords.

After a profile is created, the resource must be registered with CAA by using the caa_register command. CAA can begin to monitor and manage a resource only after it has been registered. After a resource has been registered, the information in the profile is contained in the binary CAA registry database.

You can also update a resource profile with a text editor. Any time you edit a profile by hand, you should validate the profile with the caa_profile -validate command to check that the profile is syntactically correct.
Using the caa_register -u command, you can then update the resource while the resource remains online. If you change the profile and do not update the registration, the registry database will not contain the new profile information.

Only certain keyword settings can be updated. You cannot update the NAME or TYPE of any resource. You can update the following keywords:

ACTION_SCRIPT
    Changes to the action script location and contents take effect the next time CAA uses the script.

DESCRIPTION
    Changes to the description take effect immediately.

HOSTING_MEMBERS
    Changes to the hosting members list take effect the next time the placement policy is executed.

REQUIRED_RESOURCES
    Changes to the required resources list take effect the next time the placement policy is executed.

OPTIONAL_RESOURCES
    Changes to the optional resources list take effect the next time the placement policy is executed.

PLACEMENT
    Changes to the placement policy take effect the next time the placement policy is executed.

AUTO_START
    Changes to auto-start take effect after the next cluster reboot.

CHECK_INTERVAL
    Changes to the check interval take effect immediately and reset the check interval timer.

FAILURE_THRESHOLD
    Changes to the failure threshold take effect at the next failure.

FAILURE_INTERVAL
    Changes to the failure interval take effect at the next failure.

RESTART_ATTEMPTS
    Changes to the restart attempts take effect at the next failure.

FAILOVER_DELAY
    Changes to the failover delay take effect at the next failure.

CAA logs its actions extensively, both to the command line and to the EVM event management system. For commands that monitor CAA-related EVM events, see the EXAMPLES section. See the EVM(5) reference page for details on how to use the EVM event management system.
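
Before running caa_profile -validate on a hand-edited profile, a few of the rules stated in OPERANDS can be checked with a short script. The following is a rough sketch and not a substitute for caa_profile -validate. The function names get_attr and precheck_profile are hypothetical, and the code assumes one KEYWORD = value assignment per line, as in the EXAMPLES section; continuation lines are not handled.

```shell
#!/bin/sh
# get_attr FILE KEYWORD
# Prints the value assigned to KEYWORD in a profile file.
# ASSUMPTION: one "KEYWORD = value" assignment per line, no continuation lines.
get_attr() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" |
        sed 's/[[:space:]]*$//'
}

# precheck_profile FILE
# Returns 0 if the profile passes three checks taken from OPERANDS,
# nonzero otherwise. Hypothetical helper, not part of CAA.
precheck_profile() {
    profile=$1
    status=0
    name=$(get_attr "$profile" NAME)
    threshold=$(get_attr "$profile" FAILURE_THRESHOLD)
    placement=$(get_attr "$profile" PLACEMENT)
    members=$(get_attr "$profile" HOSTING_MEMBERS)

    # NAME: only a-z, A-Z, 0-9, '.', '_' and no leading period.
    case "$name" in
        ''|.*|*[!A-Za-z0-9._]*)
            echo "NAME is missing, starts with a period, or has invalid characters" >&2
            status=1 ;;
    esac

    # FAILURE_THRESHOLD: optional, but must be in the range 0-20 if present.
    case "$threshold" in
        '') ;;
        *[!0-9]*)
            echo "FAILURE_THRESHOLD is not a number" >&2
            status=1 ;;
        *)
            [ "$threshold" -le 20 ] || {
                echo "FAILURE_THRESHOLD is outside the range 0-20" >&2
                status=1
            } ;;
    esac

    # favored and restricted placement require a hosting members list.
    case "$placement" in
        favored|restricted)
            [ -n "$members" ] || {
                echo "PLACEMENT=$placement requires HOSTING_MEMBERS" >&2
                status=1
            } ;;
    esac
    return $status
}

# Usage (the profile file name is a placeholder):
#   precheck_profile /var/cluster/caa/profile/<profile_file>
```

If the pre-check passes, continue with caa_profile -validate and caa_register -u as described above.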

USER DEFINED ATTRIBUTES

The format of resource profiles can be extended with user-defined attributes. These user-defined attributes can be accessed within the resource action script as environment variables.

A user-defined attribute must first be defined in the resource type definition file located at /var/cluster/caa/template/[type].tdf. The values that must be defined are as follows:

attribute:
    Defines the attribute for which a user can specify a value. This attribute translates to an environment variable accessible in all application resource action scripts.

type:
    Defines the type of values that are allowed for this attribute. Types include: boolean, string, name_list, name_string, positive_integer, internet_address, file.

switch:
    Defines the switch used with the caa_profile command to specify a profile value.

default:
    Defines the default value for this attribute, if it is not specified in the profile.

required:
    Defines whether the switch must be specified in a profile or not.

A user-defined attribute can be specified on the command line of caa_start, caa_relocate, or caa_stop as well as in a profile. The value specified on the command line overrides any value specified in a profile. For more information see caa_start(8), caa_relocate(8), or caa_stop(8).

There are a number of environment variables that can be accessed within an action script. These include all profile attributes, reason codes, locale information, and any user-defined attributes.

The CAA-defined profile attributes can be accessed as environment variables in any action script by prefixing _CAA_ to the attribute name. For example, the AUTO_START value would be obtained using _CAA_AUTO_START in the script.

Reason codes describe the reason that an action script was executed. The environment variable _CAA_REASON can have one of the following reason code values:

user
    Action script was invoked due to a user-initiated command, such as caa_start, caa_stop, or caa_relocate.
failure
    Action script was executed because of a failure condition. A typical condition that sets this value is a check script failure.

dependency
    Action script was invoked as a dependency of another resource that has a failure condition.

boot
    Action script was invoked as a result of an initial cluster boot (resource was running in a prior system invocation).

autostart
    The resource is being autostarted. If the AUTO_START profile attribute is set to 1, autostart occurs at cluster boot time if the resource was previously offline on the cluster before the last shutdown.

system
    Action script was initiated by the system due to normal maintenance; for example, the check script initiates a relocation.

unknown
    Internally unknown state when the script was invoked. If this value occurs, you should file a bug report detailing the cluster and application states.

The locale of the environment where a CAA command invokes an action script is available to the action script in the _CAA_CLIENT_LOCALE environment variable. This variable contains the following locale information in a string value separated by spaces: LC_ALL, LC_CTYPE, LC_MONETARY, LC_NUMERIC, LC_TIME, LC_MESSAGES. The action script can use this information, if desired, to set the locale in the action script environment. See setlocale(3) and locale(1) for more information on locale.
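
An action script might use the reason code and the client locale described above roughly as follows. This is a sketch only: the function names describe_reason and apply_client_locale and the message strings are hypothetical. The variable names _CAA_REASON and _CAA_CLIENT_LOCALE and the order of the six locale fields come from this reference page.

```shell
#!/bin/sh
# describe_reason CODE
# Maps a _CAA_REASON value to a short message suitable for logging.
# ASSUMPTION: the message strings are illustrative, not CAA output.
describe_reason() {
    case "$1" in
        user)       echo "invoked by a user command" ;;
        failure)    echo "invoked after a failure condition" ;;
        dependency) echo "invoked as a dependency of a failed resource" ;;
        boot)       echo "invoked at initial cluster boot" ;;
        autostart)  echo "resource is being autostarted" ;;
        system)     echo "invoked by the system" ;;
        *)          echo "unknown reason" ;;
    esac
}

# apply_client_locale "LC_ALL LC_CTYPE LC_MONETARY LC_NUMERIC LC_TIME LC_MESSAGES"
# Splits the space-separated string on white space and exports each field
# under the matching locale variable, in the documented order.
apply_client_locale() {
    set -- $1
    LC_ALL=$1
    LC_CTYPE=$2
    LC_MONETARY=$3
    LC_NUMERIC=$4
    LC_TIME=$5
    LC_MESSAGES=$6
    export LC_ALL LC_CTYPE LC_MONETARY LC_NUMERIC LC_TIME LC_MESSAGES
}

# Typical use near the top of an action script:
echo "action script reason: $(describe_reason "${_CAA_REASON:-unknown}")"
if [ -n "${_CAA_CLIENT_LOCALE:-}" ]; then
    apply_client_locale "$_CAA_CLIENT_LOCALE"
fi
```

Splitting with set -- avoids piping into read, which only sets variables in the calling shell when the shell runs the last stage of a pipeline in the current process (as ksh does).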

EXAMPLES

The following is an example of an application resource profile:

    TYPE = application
    NAME = clock
    CHECK_INTERVAL = 60
    FAILURE_THRESHOLD = 0
    FAILURE_INTERVAL = 0
    REQUIRED_RESOURCES =
    OPTIONAL_RESOURCES =
    HOSTING_MEMBERS =
    PLACEMENT = balanced
    RESTART_ATTEMPTS = 1
    FAILOVER_DELAY = 0
    AUTO_START = 0
    ACTION_SCRIPT = clock.scr
    SCRIPT_TIMEOUT = 60
    ACTIVE_PLACEMENT = 0

The following is an example of a network resource profile:

    TYPE = network
    NAME = net1
    CHECK_INTERVAL = 60
    FAILURE_THRESHOLD = 0
    FAILURE_INTERVAL = 0
    SUBNET = 16.140.112.0

The following is an example entry in the type definition file:

    #!==========================
    attribute: USR_DEBUG
    type: boolean
    switch: -o d
    default: 0
    required: no

The following example shows how to break down the locale variables from within an action script:

    echo $_CAA_CLIENT_LOCALE |\
    read LC_ALL LC_CTYPE LC_MONETARY LC_NUMERIC LC_TIME LC_MESSAGES

For examples of action scripts, see the directory /var/cluster/caa/script or /var/cluster/caa/examples.

To monitor CAA events on the console, use the following command:

    # evmwatch | evmshow -f "[name *.caa.*]"

To view events related to CAA that have been sent to the EVM Event Management System, use the following command:

    # evmget | evmshow -f "[name *.caa.*]"
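
The start, stop, and check entry points described under ACTION_SCRIPT can be laid out as in the following skeleton. It is a minimal sketch, not one of the scripts shipped in /var/cluster/caa/examples. The names caa_entry, app_start, app_stop, app_running, and STATE_FILE are hypothetical, and a marker file stands in for the real application so that the return-code rules are easy to see.

```shell
#!/bin/sh
# Skeleton of a CAA action script with start, stop, and check entry points.
# ASSUMPTION: app_start, app_stop, and app_running are stand-ins; replace
# them with the commands that control the real application.
STATE_FILE=${STATE_FILE:-/tmp/caa_demo_app.running}   # hypothetical path

app_start()   { touch "$STATE_FILE"; }
app_stop()    { rm -f "$STATE_FILE"; }
app_running() { [ -f "$STATE_FILE" ]; }

# caa_entry ENTRY_POINT
# Returns 0 (zero) for success and a nonzero value for failure.
caa_entry() {
    case "$1" in
        start)
            app_start
            ;;
        stop)
            # Stopping an application that is not running is a success.
            app_running || return 0
            app_stop
            ;;
        check)
            app_running
            ;;
        *)
            echo "usage: $0 {start|stop|check}" >&2
            return 2
            ;;
    esac
}

# Dispatch only when an entry point was passed on the command line.
if [ $# -gt 0 ]; then
    caa_entry "$1"
    exit $?
fi
```

Keeping the entry points in one function makes it straightforward to confirm by hand that stop returns 0 when the application is already down and that check returns nonzero in that state.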

SEE ALSO

Commands: caa_profile(8), caa_register(8), caa_relocate(8), caa_start(8), caa_stat(1), caa_stop(8), caa_unregister(8)

Daemon: caad(8)

Files: /var/cluster/caa/script, /var/cluster/caa/profile, /var/cluster/caa/registry, /var/cluster/caa/monitors, /var/cluster/caa/examples

TruCluster Server Cluster Administration
