In this example, we report high CPU usage, as indicated by unusually low CPU idle time.
The LowCPUIdleTime script might send an alert message like the following:
-------------------------------------------------------------------------------
PIKT ALERT
Sun Jan 11 19:42:13 2004
montreal
URGENT:
LowCPUIdleTime
Report unusually low CPU idle time
Unusually low CPU idle time (42.1%): Cpu(s): 27.2% user, 23.5% system,
7.2% nice, 42.1% idle
top - 14:50:34 up 8 days, 1:49, 8 users, load average: 9.28, 3.38, 0.21
Tasks: 146 total, 1 running, 145 sleeping, 0 stopped, 0 zombie
Cpu(s): 27.2% user, 23.5% system, 7.2% nice, 42.1% idle
Mem: 449844k total, 436864k used, 12980k free, 25000k buffers
Swap: 1052248k total, 91416k used, 960832k free, 147196k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8442 root 16 0 876 876 672 R 3.9 0.2 0:00.05 top
1 root 15 0 88 72 52 S 0.0 0.0 0:05.25 init
2 root 15 0 0 0 0 S 0.0 0.0 0:02.20 keventd
3 root 15 0 0 0 0 S 0.0 0.0 0:00.02 kapmd
...
-------------------------------------------------------------------------------
The script follows.
///////////////////////////////////////////////////////////////////////////////
//
// process_alarms.cfg
//
///////////////////////////////////////////////////////////////////////////////
[other alarms omitted...]
///////////////////////////////////////////////////////////////////////////////
LowCPUIdleTime

        init
                status active
                level urgent
                task "Report unusually low CPU idle time"
                input proc "=top -b -d 1 -n 1 | egrep -i '^cpu'"
                dat "([[:digit:]]+\\.[[:digit:]]%)[[:space:]]idle"

        begin   // set the threshold
                set #idlelimit = 50%

        rule    // set the idle time
                set #idletime = #val($1)

#ifdef debug
        rule
                output "\#idlelimit = $text(#idlelimit,3)"
                output "\$inlin is $inlin"
                output "\#val(\$1) is $text(#val($1),3)"
                output "\#idletime is $text(#idletime,3)"
                output =newline
#endifdef

        rule    // for diagnostic purposes
                output log "=cpuidletime_log" $inline

        // if we ever need to check this on a per-machine (or per-
        // hostgroup) basis, we really should set up a new objects file,
        // CPUIdleTime.obj, with fields like so:
        //
        //   //host   //idletime
        //
        // then read the data in using =readvals() and process in the usual
        // manner

        rule    // report unusually low cpu idle time
                if #idletime < #idlelimit
                        output mail "Unusually low CPU idle time ($text(#idletime*100,1)%): $inlin"
                        output mail =newline
                        =outputproc(mail, "=top -b -d 1 -n 1")
                fi
///////////////////////////////////////////////////////////////////////////////
[other alarms omitted...]
///////////////////////////////////////////////////////////////////////////////
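To make the matching and threshold logic easier to follow outside PIKT, here is a minimal Python sketch of what the dat pattern and the final rule appear to do. It is an illustration only, not part of PIKT. The names parse_idle_fraction and low_idle_message are hypothetical, and we assume that #val() turns "42.1%" into the fraction 0.421, which is consistent with the script multiplying #idletime by 100 in the mail message.

```python
import re

# Python translation of the script's dat pattern:
#   ([[:digit:]]+\.[[:digit:]]%)[[:space:]]idle
IDLE_PATTERN = re.compile(r"(\d+\.\d%)\sidle")


def parse_idle_fraction(line):
    """Return the idle figure as a fraction (about 0.421 for '42.1% idle'),
    or None if the line has no idle field."""
    match = IDLE_PATTERN.search(line)
    if match is None:
        return None
    # Assumption: mirrors #val($1) yielding a fraction rather than a percent.
    return float(match.group(1).rstrip("%")) / 100.0


def low_idle_message(line, idle_limit=0.50):
    """Return the alert text if idle time is below the limit, else None.

    idle_limit plays the role of #idlelimit (50% in the script)."""
    idle = parse_idle_fraction(line)
    if idle is None or idle >= idle_limit:
        return None
    return f"Unusually low CPU idle time ({idle * 100:.1f}%): {line}"


if __name__ == "__main__":
    sample = "Cpu(s): 27.2% user, 23.5% system, 7.2% nice, 42.1% idle"
    print(low_idle_message(sample))
```

Note that, like the original pattern, this sketch expects exactly one digit after the decimal point, so a top version that prints idle time in a different format would not match.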
[For more examples, see Samples.]
Page last updated 2005-06-22.
PIKT® is a registered trademark of the University of Chicago.
Copyright © 1998-2005 Robert Osterlund. All rights reserved.