In this sample alerts.cfg file, we specify the what, where, and when of alarm scripts--which alarm scripts to run, where and when to run them, at what priority ("nice" level), and where to send their output (typically e-mail).
In this example, we group alarms mainly by importance: Emergency, Urgent, Critical, Warning, and so on. You may do it this way, or you may group alarms by department (for example, Marketing, Development, Finance), by personnel (for example, AlertsTom, AlertsDick, AlertsHarry), by functionality (for example, Security, Backups, Users, Patches), by timing (for example, Morning, Evening, Overnight, Hourly, Daily), or by any combination of these. Do what makes sense in your situation.
This is a rather elaborate sample alerts.cfg file, with many different alarm scripts and timing subtleties. Especially for smaller organizations, a typical alerts.cfg might be much simpler than this.
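The timing lines in the listing below use compact fields such as `6-18/2` or `0-5,19-23`. As a reading aid, here is a minimal Python sketch that expands one such field into the values it covers. It assumes the first five timing fields follow standard cron semantics (lists, ranges, and `/step`); the function name `expand_field` is hypothetical and not part of PIKT, and the sketch does not handle the optional sixth field or percentage forms such as `20%`.

```python
def expand_field(field, lo, hi):
    """Expand one cron-style field (e.g. '6-18/2') into a sorted list of ints.

    lo and hi give the field's full range, used when the field is '*'
    (for example 0 and 23 for the hour field). Assumes standard cron
    semantics; this is an illustration, not PIKT's own parser.
    """
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(rng)
        values.update(range(start, end + 1, step))
    return sorted(values)


# Fields taken from the sample listing
print(expand_field("10,25,40,55", 0, 59))  # [10, 25, 40, 55]
print(expand_field("6-18/2", 0, 23))       # [6, 8, 10, 12, 14, 16, 18]
print(expand_field("0-5,19-23", 0, 23))    # [0, 1, 2, 3, 4, 5, 19, 20, 21, 22, 23]
print(expand_field("5-50/15", 0, 59))      # [5, 20, 35, 50]
```

Read this way, `timing 10 6-18/2 * * 0,6` would start the alert at ten past every second hour from 06:00 to 18:00 on Sundays and Saturdays, which matches the "sun,sat, day hrs" comment beside it.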
///////////////////////////////////////////////////////////////////////////////
//
// PIKT alerts.cfg -- grouping and scheduling alarm and program scripts
//
///////////////////////////////////////////////////////////////////////////////
//
// (please see the comments prefacing the sample macros.cfg about
// configuration file complexity and parse error debugging)
//
///////////////////////////////////////////////////////////////////////////////
//
// when ordering your alarms, put the most important at the head of the
// list so that they will appear at the top of any emailed alerts
//
///////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////
EMERGENCY // things that require immediate attention
#ifndef test
# if hamburg | misscritsys // & ! moscow
timing 10,25,40,55 6-18 * * 1-5 5 // mon-fri, day hrs
10 6-18 * * 0,6 5 // sun,sat, day hrs
10 0-5,19-23 * * * 5 // each day, nite hrs
# else
timing 10,40 6-18 * * 1-5 5 // mon-fri, day hrs
10 6-18/2 * * 0,6 5 // sun,sat, day hrs
10 0,2,4,20,22 * * * 5 // each day, nite hrs
# endif
#elsedef
timing 10 8-16/2 * * *
#endifdef // test
#if moscow | nismaster
priority 5
#else
priority 0
#endif
mailcmd "=mailx -s 'PIKT Alert on =pikthostname: EMERGENCY' =pikt-emergency"
lpcmd "=lp =piktprinter"
alarms
#if piktmaster
SysDownEmergency
SysBakSubnetDownEmergency
#endif
LoadAverageEmergency
#if nisserver | mailserver
NISBrokenEmergency
#endif
#if solaris
PerUserProcessCountsEmergency
CPUUsageEmergency
SuLogScanEmergency
SwapExhaustedEmergency
#endif
DevSystemNotExistEmergency
DirSystemNotExistEmergency
SystemFileNotExistEmergency
DiskCapEmergency
DiskInodeEmergency
#if moscow
RaidStatusNotOptimalEmergency
#endif
///////////////////////////////////////////////////////////////////////////////
Urgent // things that deserve nearly immediate attention
#ifndef test
# if misscritsys
timing 20,50 6-18 * * 1-5 5 // mon-fri, day hrs
20 6-18/2 * * 0,6 5 // sun,sat, day hrs
20 0,2,4,20,22 * * * 5 // each day, nite hrs
# else
timing 20 6-18 * * 1-5 5 // mon-fri, day hrs
20 6-18/4 * * 0,6 5 // sun,sat, day hrs
20 2,22 * * * 5 // each day, nite hrs
# endif
#elsedef
timing 20 8-16/2 * * *
#endifdef // test
#if moscow | nismaster
priority 10
#else
priority 0
#endif
mailcmd "=mailx -s 'PIKT Alert on =pikthostname: Urgent' =pikt-urgent"
lpcmd "=lp =piktprinter"
alarms
#if piktmaster
SysDownUrgent
#endif
ProcessSystemDeadUrgent
PiktcSvcLogScanUrgent
SysRebootUrgent
LoadAverageUrgent
PasswdFileProblemsUrgent
DirSystemStatChangeUrgent
DiskCapUrgent
UmountDiskUrgent
RootCoreFileExistUrgent
LogUpdatesUrgent
FileUpdatesUrgent
MailQueueLengthyUrgent
MessagesScanUrgent
FileExistWarnUrgent
SystemFileSizeChangeUrgent
#if solaris
RunawayProcUrgent
CPUUsageUrgent
NewSystemStartupFileUrgent
ShadowFileProblemsUrgent
SwapLowUrgent
#endif
#if solaris
ProcZombieTotalCountsUrgent
#endif
#if solaris
# if milan
CronLogScanUrgent
# endif
#endif
#if solaris
LpHungUrgent
#endif
MountDiskUrgent
#if nisclient
NISNoBindingUrgent
#endif
#if nismaster
YPPasswdFileSizeChangeUrgent
#endif
#if nismaster | piktmaster
YPPasswdFileProblemsUrgent
#endif
#if ! uppsala
SyslogScanUrgent
#endif
#if moscow
RelaySpoolStatUrgent
#endif
///////////////////////////////////////////////////////////////////////////////
Critical // things that should be dealt with before too long,
// preferably by day's end; (things reported here
// may not be especially "critical" but are so
// designated to conform with syslog's log levels)
#ifndef test
# if misscritsys
timing 30 6-22/2 * * 1-5 5 // mon-fri, day&nite hrs
30 6-22/4 * * 0,6 5 // sun,sat, day&nite hrs
# else
timing 30 6-18/2 * * 1-5 5 // mon-fri, day hrs
// sun,sat, no hrs (no run)
# endif
#elsedef
timing 30 8-16/2 * * *
#endifdef // test
#if moscow | nismaster
priority 10
#else
priority 0
#endif
mailcmd "=mailx -s 'PIKT Alert on =pikthostname: Critical' =pikt-critical"
alarms
#if piktmaster
PiktDaemonProblemsCritical
#endif
ChecksumDifferenceCritical
AfterHoursUserActivityLogScanCritical
DiskCapCritical
TmpFullCritical
FileAgeWarningHstCritical
#if solaris
ShadowFileProblemsCritical
#endif
#if moscow
ClassAliasProblemsCritical
#endif
#if nismaster
YPPasswdAcctPasswordChangeCritical
#endif
DevSystemStatChangeCritical
SystemFileStatChangeCritical
DirsFileStatCritical
#if warsaw
PiktWebPageChangeCritical
#endif
#if moscow
ForwardingBrokenCritical
#endif
#if madrid
KillIdleUserSessionCritical // supplementing idled
#endif
#if ! murmansk
AuthLogScanCritical
#endif
#if solaris
IostatLogInfo
#endif
///////////////////////////////////////////////////////////////////////////////
Warning // things that need attention, if not today, then
// eventually; after looking at warning alerts, we
// often just delete them at the end of the day,
// clearing the deck for the next day's warnings
#ifndef test
# if moscow
timing 40 3 * * * 10 // give extra time for BakMail to
// finish
# else
timing 40 2 * * * 10 // run after the completion of
// Info, which see
# endif
#elsedef
timing 40 8-16/2 * * *
#endifdef // test
nice 5
mailcmd "=mailx -s 'PIKT Alert on =pikthostname: Warning' =pikt-warning"
alarms
# if piktmaster
SysDownWarning
SysBakSubnetDownWarning
# endif
DumpDatesProblemsWarning
PasswdFileProblemsWarning
DiskCapWarning
DiskInodeWarning
QueuedMailOldWarning
QueuedMailLongWarning
RootMailFileExistWarning
AliasesFileProblemsWarning
// FileMtimeChangeWarning
#if solaris
SuLogScanWarning
BackupDevStatWarning
PasswdShadowCrosscheckWarning
FileCtimeChangeWarning
OrphanedPrintFilesWarning
#endif
#if nisclient & solaris
PasswdFileNISProblemsWarning
ShadowFileNISProblemsWarning
YPServersFileWarning
#endif
#if nismaster
YPAliasesBadAddressWarning
YPGroupFileProblemsWarning
NISHostsProblemsWarning
#endif
#if ! ( murmansk | firenze )
GroupFileProblemsWarning
#endif
#if ! firenze
CrontabSuspiciousWarning
#endif
#if moscow
MailmanAddressesProblemsWarning
MailmanLocalAddressesProblemsWarning
#endif
#if piktmaster
YPPasswdCrosscheckWarning
// RemoteDiskCapWarning
// RemoteDiskInodeWarning
RemoteLpDisabledWarning
#endif
#if ! linux
LpDisabledWarning
#endif
#if mailserver
# if moscow
ArchiveMailFileWarning
# endif
# if ! kiev
MailQuotaWarning
# endif
#endif
#if ! ( moscow | berlin )
DirsUserStatWarning
# if homedirlinksys
RemoveHomeDirLinkWarning
// HomeDirLinksBrokenWarning
HomeDirNonLinksWarning
# endif
#endif
#if warsaw
PiktWebFileCtimeChangeWarning
#endif
#if mus | perf | comp
RemoveOrphanedSASWorkDirsWarning
#endif
#if piktmaster
PiktcFilesChecksumWarning
PiktcFilesDiffWarning
PiktcChecksumWarning
PiktcDiffWarning
#endif
PiktFileProblemsWarning
LogUpdatesWarning
FileUpdatesWarning // put this last
///////////////////////////////////////////////////////////////////////////////
Notice // things that deserve our attention but may or may
// not require action; if we are busy, we can usually
// safely ignore most or all notice alerts (simply
// delete them)
#ifndef test
timing 50 3 * * * 10
// timing 50 3 * * 1-6 // never on a sunday!
#elsedef
timing 50 8-16/2 * * *
#endifdef // test
nice 5
mailcmd "=mailx -s 'PIKT Alert on =pikthostname: Notice' =pikt-notice"
alarms
DiskCapNotice
DeleteCoreFilesNotice
ClearTmpNotice
FileSystemSizeChangeNotice
MessagesScanNotice
LogSystemTruncateNotice
LogPiktTruncateNotice
LogLocalTruncateNotice
#if solaris
RemoveCrashFilesNotice
RemoveOrphanedPrintFilesNotice
#endif
#if ! no_usr_local
FilesSystemBackupNotice
#endif
#if mailserver
MailArcFileProblemsNotice
#endif
#if nismaster
YPPasswdRetireAcctNotice
#endif
#if ! linux
NumberedPacctFileNotice
NumberedSyslogFileNotice
EmptyNetscapeCacheNotice
#endif
#if vienna | mus | ( perf & ! paris7 )
AutoRebootCronNotice
#endif
MailFileProblemsNotice
#if vienna
MailChkErrorsNotice // for now, only gathers data,
// actual checking turned off;
// data required for =sortlist,
// run nightly via Warning on
// moscow
#endif
#if warsaw
DMailWebClearNotice
#endif
///////////////////////////////////////////////////////////////////////////////
Info // for informational purposes only, or for
// occasional housekeeping tasks; we usually just
// glance at info alerts or ignore them
// altogether (simply delete them)
#ifndef test
# if moscow
timing 50 0 * * 1,3,5 10
# else
timing 50 0 * * 1 10 // run to completion before the
// start of Warning
# endif
#elsedef
timing 50 8-16/2 * * *
#endifdef // test
priority 5
mailcmd "=mailx -s 'PIKT Alert on =pikthostname: Info' =pikt-info"
alarms
SystemFileStatChangeInfo
DiskInfo
#if mus | perf | comp
FindMultiMediaFilesInfo
#endif
#if ! piktmaster
CleanPiktDirsInfo
#endif
DiskHogsInfo
///////////////////////////////////////////////////////////////////////////////
/*
Security // these usually deserve nearly immediate attention;
// there's security-related stuff scattered
// throughout other alerts; this is just another
// place to put such stuff; we should probably
// replace 'Security' with a different name, or have
// a family of security alerts (see below)
timing
// 0-55/5 0-5 * * *
20% 0-5 * * * // on average, every 5 minutes
// if some things are added to this that don't
// need to run every five minutes, we can control
// their scheduling by means of the appropriate
// time checks within the individual alarms,
// 'quit'ing and moving on to the next alarm
// if the time is not right
mailcmd "=mailx -s 'PIKT Alert on =pikthostname: Security' =pikt-security"
lpcmd "=lp =piktprinter"
alarms
// for now, just this, but we have security stuff
// scattered throughout the rest of the configuration
AfterHoursUserActivitySecurity
// the following is for future development; for an explanation, please see
// the sample defines.cfg, also the Security Considerations section of the
// PIKT Reference and the distribution README
//
// alarms
//#ifdef attentive
// SecurityAlarm1A // must-do, day-to-day stuff,
// SecurityAlarm1B // and/or low cost to run
// ...
//# ifdef cautious
// SecurityAlarm2A
// SecurityAlarm2B
// ...
//# ifdef worried
// SecurityAlarm3A
// SecurityAlarm3B
// ...
//# ifdef paranoid
// SecurityAlarm4A // resource-intensive and/or
// SecurityAlarm4B // time-consuming stuff
// ...
//# endifdef
//# endifdef
//# endifdef
//#endifdef
//
// use per-machine and per-os '#if' statements to fine tune as necessary;
// the overall idea here is: tend to do more, and do more often,
// resource-intensive things on the more powerful machines, and tend to be
// more paranoid on the mission-critical machines.
*/
///////////////////////////////////////////////////////////////////////////////
#if piktmaster
Admin // for scripts aiding in the administration of
// the PIKT system
timing 10 2 * * *
drift 5
mailcmd "=mailx -s 'PIKT Alert on =pikthostname: Admin' =pikt-admin"
scripts
MakeFilesPiktObjectsAdmin
IncludeFilesBackupAdmin
UpdateIncludeFilesAdmin
IncludeFilesSizeChangeAdmin
InstallDynamicConfigFiles
InstallPiktMasterObjectsAdmin // should be run last
#endif // piktmaster
#if moscow
Admin
timing 45 0 * * * // should be run before Warning alert
// on moscow
mailcmd "=mailx -s 'PIKT Alert on =pikthostname: Admin' =pikt-admin"
scripts
FindAllUsersAdmin
FindMailmanAddressesAdmin
FindAuthUsersAdmin
#endif // moscow
///////////////////////////////////////////////////////////////////////////////
Debug // for PIKT self-monitoring; these deserve
// fairly close attention, especially on the
// piktmaster, where we not only run more often,
// we also cron it
#ifndef test
# if piktmaster
// crond runs Debug at alternating intervals like so:
// 55 1,3,5,7,9,11,13,15,17,19,21,23 * * * /usr/bin/nice -10 /pikt/bin/pikt +M
// "/usr/bin/mailx -s 'PIKT Alert on vienna: Debug' brahms\@hamburg" +A Debug
timing 55 0-22/2 * * *
# else
timing 55 2-22/4 * * *
# endif
#elsedef
timing 55 8-16/2 * * *
#endifdef // test
nicecmd "=nice -10"
mailcmd "=mailx -s 'PIKT Alert on =pikthostname: Debug' =pikt-debug"
alarms
PiktHeartbeatDebug
#if piktmaster
// it's important to run PiktDaemonProblemsCritical
// independently in two separate alerts, Critical
// and (our choice) Debug; if you just run it in one
// alert, and that alert hangs, then you miss this
// vital alarm; we recommend, too, that you run the
// Debug alert via cron; in addition to the above
// schedule, where 'pikt +A Debug' is invoked by
// piktd, we also have cron invoke Debug (from our
// root crontab):
//
// 55 1,3,5,7,9,11,13,15,17,19,21,23 * * * /usr/bin/nice -10 /pikt/bin/pikt +M
// "/usr/bin/mailx -s 'PIKT Alert on vienna: Debug' brahms\@hamburg" +A Debug
//
// so, we run PiktDaemonProblemsCritical independently
// under three different schedules:
//
// 30 * * * * [in the Critical alert,
// invoked by piktd]
// 55 0-22/2 * * * [in the Debug alert,
// invoked by piktd]
// 55 1,3,5,7,9,11,13,15,17,19,21,23 * * *
// [in the Debug alert,
// invoked by cron]
PiktDaemonProblemsCritical
#endif
PiktStatusChkDebug
PiktFileNotExistDebug
FileAgesPiktDebug
PersistentPiktRunDebug
PiktcSvcLogScanDebug
PiktdLogScanDebug
PiktEmergencyLogScanDebug
PiktUrgentLogScanDebug
PiktCriticalLogScanDebug
PiktWarningLogScanDebug
PiktNoticeLogScanDebug
PiktInfoLogScanDebug
PiktSecurityLogScanDebug
#if piktmaster
PiktcLogScanDebug
#endif
#if piktmaster | moscow
PiktAdminLogScanDebug
#endif
PiktDebugLogScanDebug
///////////////////////////////////////////////////////////////////////////////
#if backupclient
Backup // backup stuff
#ifndef test
timing 45 2 * * 3,5 10 // wed,fri
// sun-tue, thu, sat, no run
#elsedef
timing 45 8-16/2 * * *
#endifdef // test
nice 5
mailcmd "=mailx -s 'PIKT Alert on =pikthostname: Backup' =pikt-backup"
alarms
=backup_alarms
#endif // backupclient
///////////////////////////////////////////////////////////////////////////////
/*
Red // deals with extraordinary situations
timing 0-50/10 * * * *
drift 2
priority 5
mailcmd "=mailx -s 'PIKT Alert on =pikthostname: Red' =pikt-emergency"
alarms
DeadManActivityEmergency
*/
///////////////////////////////////////////////////////////////////////////////
#ifdef test
//# if piktmaster
Test // use this for testing newly developed alarm scripts;
// install with 'piktc -iv +D test +A Test +H ...'
// or maybe 'piktc -iv +D test debug verbose -D page doexec +A Test +H ...'
// after testing, remove all traces of
// the Test alert with 'piktc -tv +A Test +H ...'
// timing =piktnever
timing 5-50/15 * * * *
mailcmd "=mailx -s 'PIKT Alert on =pikthostname: Test' =pikt-test"
alarms
//ChecksumDifferenceCritical
//RaidStatusNotOptimalEmergency
CPUUsageUrgent
//# endif // piktmaster
#endifdef // test
///////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////
#if moscow
BakMail // do nightly backup of /var/mail
timing 40 23 * * *
priority 10
execcmd "=date >> =logdir/BakMailRpt.log; =prgdir/bakmail.pl -R 1 2>&1 >> =logdir/BakMailRpt.log; =date >> =logdir/BakMailRpt.log"
#endif
///////////////////////////////////////////////////////////////////////////////
#if nosendmail & ! linux
ProcessMailQueue
# if misscritsys
timing 5-50/15 * * * *
# else
timing 40 * * * *
# endif
drift 3
execcmd "=sendmail -q"
#endif
///////////////////////////////////////////////////////////////////////////////
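The comments in the Debug alert above describe two Debug schedules on the piktmaster: piktd starts Debug at `55 0-22/2 * * *`, and the root crontab starts it at `55 1,3,...,23 * * *`. The short Python sketch below checks how the two interleave. It assumes standard cron semantics for the hour field, and the variable names are illustrative only.

```python
# Hours at which piktd starts the Debug alert: 55 0-22/2 * * *
piktd_hours = set(range(0, 23, 2))

# Hours at which the root crontab entry starts Debug: 55 1,3,...,23 * * *
cron_hours = set(range(1, 24, 2))

# Hours covered by either mechanism, and hours covered by both
combined = sorted(piktd_hours | cron_hours)
overlap = sorted(piktd_hours & cron_hours)

print(combined == list(range(24)))  # True: some Debug run starts at :55 of every hour
print(overlap)                      # []: the two schedules never coincide
```

Under these assumptions, Debug (and with it PiktDaemonProblemsCritical) is started once an hour, with piktd and cron taking alternate hours, so a hang in one launcher still leaves a run every two hours from the other.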
[For more examples, see Samples.]
Page last updated 2005-06-22.
PIKT® is a registered trademark of the University of Chicago.
Copyright © 1998-2005 Robert Osterlund. All rights reserved.