memory_trolling(5)

NAME

memory_trolling - Proactively locates and scrubs correctable memory errors

SYNOPSIS

/etc/sysconfigtab

    vm:
        vm_troll_percent = percent_rate

DESCRIPTION

The operating system handles memory errors with a just-in-time scrubbing model, in which correctable errors are scrubbed when the operating system or an application encounters them. To enhance this capability, a trigger mechanism called the memory troller proactively locates and scrubs correctable memory errors. The memory troller systematically reads each memory location. If it discovers a correctable memory error, it triggers the just-in-time scrubbing mechanism.

Because the memory troller reads all memory available to the operating system, it might also discover uncorrectable memory errors, which would normally lead to an unrecoverable machine check. To avoid this, the operating system recognizes that the machine check resulted from memory trolling, dismisses the error, and continues normal operation. The memory troller then causes the memory page containing the uncorrectable error to be marked as a bad page. If the bad page is free (or when it becomes free), it is mapped out so that it will not be reused.

Enabling, Disabling, and Tuning Memory Trolling

On systems that the memory troller supports, use the vm_troll_percent variable to enable, disable, and tune the trolling rate. This variable is part of the kernel's vm subsystem. The trolling rate is expressed as a percentage of the system's total memory trolled per hour and can be changed at any time. Valid troll rate settings are as follows:

    Default value: 4 percent per hour
        This value is used if you do not specify a value for vm_troll_percent. At this default rate, each 8-kilobyte memory page is trolled approximately once every 24 hours.

    Disable value: 0 (zero)
        A value of zero disables the memory troller.

    Range: 1 - 100 percent
        The troll rate is set to the specified percentage of memory to troll per hour. For example, a 50 percent troll rate reads half the total memory in one hour. After all memory is read, the troller starts a new pass at the beginning of memory.

    Accelerated trolling: 101 percent
        Any value greater than 100 percent invokes one-pass accelerated trolling. All memory is trolled at a rate of approximately 6000 8-kilobyte pages per second, then trolling is disabled. This mode is intended for trolling all memory quickly during off-peak hours. For example, on a GS320 system with 32 processors and 128 gigabytes of memory, one-pass accelerated trolling takes approximately five minutes.

Use the following command to display the current value of vm_troll_percent (the troll rate):

    # /sbin/sysconfig -q vm vm_troll_percent

You can override the default troll rate by adding the following lines to the /etc/sysconfigtab file:

    vm:
        vm_troll_percent=percent_rate

The percent_rate variable is the troll rate as described previously. Use the sysconfigdb command to add entries to the /etc/sysconfigtab file, as described in the sysconfigdb(8) reference page. The new rate takes effect on the next system boot.

You can enable, disable, or change the troll rate at any time using the following command:

    # /sbin/sysconfig -r vm vm_troll_percent=percent_rate

The percent_rate variable is the troll rate as described previously. Only the superuser (root) or a user authorized by division of privileges (dop) can use this command. (Refer to the dop(8) reference page for information on sharing superuser privileges.) See MESSAGES for information on configuration messages.

Controlling the Use of System Resources

Low trolling rates, such as the 4 percent default, have negligible impact on system performance. Processor usage for memory trolling increases as the troll rate is increased. To approximate the performance overhead, use the following procedure:

1.  Log in as root or become superuser.

2.  Choose a time when the system is idle and disable the memory troller using the following command:

        # /sbin/sysconfig -r vm vm_troll_percent=0

3.  To establish a performance baseline, run the following command with the memory troller disabled:

        # vmstat 1
        ...cpu...
        ...us sy id...
        ... 1  1 98...

4.  In the command output, note the system time, labeled sy under the cpu heading.

5.  Adjust the value of vm_troll_percent using the following command:

        # /sbin/sysconfig -r vm vm_troll_percent=percent_rate

6.  Repeat step 3 and note any change in the value of sy under the cpu heading. An increase in system time (sy) of one or less represents negligible performance cost.

7.  Repeat the procedure, adjusting the value of vm_troll_percent until the performance cost is acceptable.

For example, a GS320 system with 32 processors and 128 GB of memory shows approximately 25 percent system time during one-pass accelerated trolling. The same system at the 4 percent default troll rate shows one percent or less system time.
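
The arithmetic behind the troll rate settings and the overhead check can be sketched in shell. This is only an illustration: the function names troll_mode, troll_pass_hours, and sy_increase_negligible are hypothetical helpers, not part of the operating system, and the label "periodic" for the 1 to 100 percent range is an assumed name. The sketch assumes that a rate of N percent per hour covers all memory in 100/N hours, and that the two sy values are read by hand from vmstat output as described in the procedure.

```shell
#!/bin/sh
# Illustrative helpers only; these function names are hypothetical and
# are not provided by the operating system.

# Classify a vm_troll_percent value according to the documented settings.
troll_mode() {
    case $1 in
        ''|*[!0-9]*) echo "invalid"; return 1 ;;
    esac
    if [ "$1" -eq 0 ]; then
        echo "disabled"
    elif [ "$1" -le 100 ]; then
        echo "periodic"
    else
        echo "accelerated"
    fi
}

# Hours needed for one full pass at a rate of 1 to 100 percent per hour.
troll_pass_hours() {
    awk -v rate="$1" 'BEGIN { printf "%.1f\n", 100 / rate }'
}

# Succeed if the sy value measured with the troller running ($2) exceeds
# the baseline sy value ($1) by one or less.
sy_increase_negligible() {
    [ $(( $2 - $1 )) -le 1 ]
}
```

For instance, troll_pass_hours 50 prints 2.0, and troll_pass_hours 4 prints 25.0, which is roughly the daily pass described for the default rate. Calling sy_increase_negligible 1 2 succeeds, whereas sy_increase_negligible 1 25 fails.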

MESSAGES

Configuration Messages

If the memory troller does not support your system, the following error is displayed on your terminal when you attempt to configure the memory troller using /sbin/sysconfig:

    vm_configure: Memory Trolling not supported on this system.

You can disable trolling using the following command:

    # /sbin/sysconfig -r vm vm_troll_percent=0

The following warning message is displayed on your terminal when the preceding command is executed:

    vm_configure: shutting down memory troller. [WARNING: disabling the memory troller is not recommended on this system.]

This message notifies you that permanently disabling memory trolling is not recommended.

Informational Messages

The following messages provide information about events associated with memory troller operation. These messages do not indicate a failure in the memory troller:

·  If the memory troller locates a memory page containing an uncorrectable error and the bad page will be mapped out, the following message is displayed:

        Memory Troller: bad page found (address = 0x################)

·  In addition to the bad page found... message, machine check messages similar to the following are displayed on the system's console when the memory troller encounters a bad page:

        25-Mar-2000 17:24:25 [700] CPU machine check/exception - CPU 0
        25-Mar-2000 17:24:25 [700] CPU machine check/exception - CPU 18

   These messages come from the event notification subsystem. They indicate that the machine checks resulting from the memory troller reading the bad page have been entered into the binary error log.

Error Messages

If any of the following error messages are displayed on the console terminal, a malfunction has occurred in the memory troller and you must contact your technical support organization.

·  VM_CONFIGURE: Memory Trolling is currently disabled on this system

   The memory troller has been disabled due to a fatal error.

·  adjust_troll_quantity: null MAD pointer, disabling troller

   A fatal internal error has occurred; the troller is disabled.

·  adjust_troll_quantity: invalid troll_percent 0 defaulting to 4 percent

   The troller is active, but the troll rate is zero. The troller continues operating, but at the default troll rate. This is a serious error.

·  vm_memory_troller: CPU # vmmt_get_mad() failed, disabling troller

   A fatal internal error has occurred; the troller is disabled.

·  vm_memory_troller: MAD # invalid state [#], shutting down

   A fatal internal error has occurred; the troller is disabled.
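
When reviewing saved console output, it can help to sort troller messages into the three groups used in this section. The following sketch matches only on the fixed message text quoted above. The function name classify_troller_message is hypothetical, and the assumption that each message arrives as a single line of text is for illustration only. Machine check lines are deliberately left unclassified, because they can also have causes other than the memory troller.

```shell
#!/bin/sh
# classify_troller_message is a hypothetical helper for illustration.
# It prints the message group for one line of console or terminal output.
classify_troller_message() {
    case $1 in
        *"Memory Trolling not supported on this system"*)
            echo "configuration" ;;
        *"vm_configure: shutting down memory troller"*)
            echo "configuration" ;;
        *"Memory Troller: bad page found"*)
            echo "informational" ;;
        *"Memory Trolling is currently disabled on this system"*)
            echo "error" ;;
        *"adjust_troll_quantity:"*|*"vm_memory_troller:"*)
            echo "error" ;;
        *)
            # Anything else, including machine check lines, is not
            # classified by this sketch.
            echo "unknown" ;;
    esac
}
```

A line classified as error corresponds to a troller malfunction, for which this reference page directs you to contact your technical support organization.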

EXAMPLES

The following examples demonstrate typical command use and settings for the memory troller:

1.  To schedule one-pass accelerated trolling at off-peak hours, use the following procedure:

    a.  Create a shell script named /usr/local/fast_troll.sh containing the following lines:

            #!/sbin/sh
            /sbin/sysconfig -r vm vm_troll_percent=101

    b.  Using the following commands, set the file owner and permissions of /usr/local/fast_troll.sh:

            # chown root /usr/local/fast_troll.sh
            # chmod 744 /usr/local/fast_troll.sh

    c.  Use the cron facility to schedule execution of the shell script as root user at the desired time. (Refer to the cron(8) reference page for more information.)

2.  The following command demonstrates how you can set trolling at a more aggressive rate of 50 percent per hour:

        # /sbin/sysconfig -r vm vm_troll_percent=50

    Because such dynamic changes are not recorded in the /etc/sysconfigtab file, this setting will not persist across a reboot.

3.  The following method describes how you use a stanza file to change the value of vm_troll_percent to 10 so that the change is updated in the kernel immediately and also persists across a reboot:

    a.  Create a stanza file containing the following lines:

            vm:
                vm_troll_percent=10

        Save this file as /tmp/vm_troller.stanza.

    b.  Use the following command to merge the stanza into the /etc/sysconfigtab file:

            # /sbin/sysconfigdb -a -f /tmp/vm_troller.stanza vm
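
The stanza file used in the last example can also be generated from a script. The following is a minimal sketch, assuming the two-line stanza layout shown in that example. The function name write_troll_stanza is hypothetical, and the validation it performs (digits only) is an added safeguard rather than documented behavior.

```shell
#!/bin/sh
# write_troll_stanza is a hypothetical helper for illustration.
# Usage: write_troll_stanza RATE FILE
write_troll_stanza() {
    case $1 in
        ''|*[!0-9]*)
            echo "troll rate must be a non-negative integer" >&2
            return 1 ;;
    esac
    # Stanza layout: the subsystem name, then an indented attribute line.
    printf 'vm:\n\tvm_troll_percent=%s\n' "$1" > "$2"
}
```

For example, write_troll_stanza 10 /tmp/vm_troller.stanza produces the file that the last example then passes to /sbin/sysconfigdb.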

FILES

/etc/sysconfigtab
    The configuration database file in which you specify the value of vm_troll_percent under the vm attributes. See the sysconfigtab(4) reference page for more information.

/sbin/sysconfig
    The command that you use to dynamically set the value of vm_troll_percent in the vm subsystem of the running kernel. Changes made with this command are not recorded in the /etc/sysconfigtab file. See the sysconfig(8) reference page for more information.

SEE ALSO

Commands: cron(8), dop(8), sysconfig(8), sysconfigdb(8), vmstat(1)

Files: sysconfigtab(4)

Others: sys_attrs_vm(5)
