Doing Something Substantial

Now it's time to do something a little more substantial, and somewhat more complicated.

(Note:  For the rest of this Tutorial, we use Solaris as our example operating system.  Make adjustments to your own situation as necessary.)

First, edit the alerts.cfg file, removing a couple of lines and uncommenting others.  After the edits, your alerts.cfg file should look like this:

///////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////

Critical                // things that should be dealt with before too long,
                        // preferably by day's end; (things reported here
                        // may not be especially "critical" but are so
                        // designated to conform with syslog's log levels)

        timing          20 * * * *
        mailcmd         "=mailx -s 'PIKT Alert on =pikthostname:
                                            Critical' =piktadmin"
        alarms
                        LoadAverageCritical
                        ProcCountTotalCritical
                        RootCoreFileExistCritical
                        PasswdFileProblemsCritical
                        MailQueueLengthyCritical
                        MessagesScanCritical
                        // DiskFullCritical

///////////////////////////////////////////////////////////////////////////

Debug                   // for PIKT self-monitoring; these deserve
                        // fairly close attention, especially on the
                        // piktmaster

        timing          55 * * * *
        mailcmd         "=mailx -s 'PIKT Alert on =pikthostname:
                                            Debug' =piktadmin"

        alarms
                        PiktCriticalLogScanDebug

///////////////////////////////////////////////////////////////////////////

#ifdef test

//#  if piktmaster

Test                    // use for testing newly developed alarm scripts;
                        // install with 'piktc -iv +D test +A Test +H ...'
                        // or maybe 'piktc -iv +D test debug verbose
                        //                   -D page doexec +A Test +H ...'
                        // after testing, remove all traces of
                        // the Test alert with 'piktc -tv +A Test +H ...'

        timing          =piktnever
        mailcmd         "=mailx -s 'PIKT Alert on =pikthostname:
                                            Test' =piktadmin"
        alarms
                        DiskFullCritical

//#  endif  // piktmaster

#endifdef  // test

///////////////////////////////////////////////////////////////////////////

The alarm scripts
                        LoadAverageCritical
                        ProcCountTotalCritical
                        RootCoreFileExistCritical
                        PasswdFileProblemsCritical
                        MailQueueLengthyCritical
                        MessagesScanCritical

are a simple set of generic scripts that should work as is on just about any system.  If any are inappropriate for your system or cause you any difficulty, simply comment them out in the same way that the DiskFullCritical script is commented out:
                        // DiskFullCritical

The one Debug alarm script
                        PiktCriticalLogScanDebug

scans the Critical.log file for any signs of trouble.  It, too, is generic and should work as is on any system.

If you did your edits correctly, everything should check okay:

/pikt/bin/piktc -cv +H mysystem

checking mysystem...

Restart piktc_svc (if you killed it at the end of the first chapter of this Tutorial, Getting Started):
/pikt/bin/piktc_svc

Next, install all alerts with:
/pikt/bin/piktc -iv +A all +H mysystem

processing mysystem...
installing file(s)...
Debug.alt installed
Critical.alt installed

This will install the alerts in the /pikt/lib/alerts directory:
ls /pikt/lib/alerts
Critical.alt  Debug.alt

If you inspect either of these .alt files, the scripts therein will look much like the versions in alarms.cfg with the following differences:
  • all // and /* */ comments are stripped
  • all macros are expanded (e.g., =ps -> /usr/bin/ps)
  • there are perhaps some minor layout differences

Before we schedule these alerts for automatic execution, let's run them at the command line:
/pikt/bin/pikt +A Critical
User nobody4 has nonexistent shell 
the size of /etc/passwd has changed by >= 10%, was -2 lines, is now 12
Sep 22 09:36:57 hissystem sshd[8955]: [ID 363151 daemon.notice] log:
  ROOT LOGIN as 'root' from hersystem.uppity.edu
...

You may or may not get any output, depending on your setup (e.g., the contents of your passwd and messages files).  If pikt complains about not finding the messages (or any other) file, edit the macros.cfg file and substitute the appropriate path.
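If you are unsure where the messages file lives on your system, a small shell helper can tell you which candidate exists before you edit macros.cfg.  This is only a sketch: first_existing is a hypothetical helper (not a PIKT command), and the candidate paths in the example are assumptions.  /var/adm/messages is the usual Solaris location; /var/log/messages is common on Linux.

```sh
# Print the first argument that names a readable regular file;
# return 1 if none of them does.
# (first_existing is a hypothetical helper, not part of PIKT.)
first_existing() {
    for candidate in "$@"; do
        if [ -f "$candidate" ] && [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example (candidate paths are assumptions; adjust for your system):
# first_existing /var/adm/messages /var/log/messages /var/log/syslog
```

Whatever path it prints is the one to substitute in macros.cfg.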

Now try running the Debug scripts:

/pikt/bin/pikt +A Debug
Sep 22 14:26:59 vienna pikt[1528]: [ID 1 INFO] [WARNING] in dorules(),
  MailQueueLengthyCritical, no input data
sh: /usr/bin/uptime: not found
Sep 22 14:29:44 vienna pikt[1528]: [ID 1 INFO] [WARNING] in dorules(),
  LoadAverageCritical, no input data
Sep 22 14:29:47 vienna pikt[1528]: [ID 1 INFO] [WARNING] in dorules(),
  MailQueueLengthyCritical, no input data
Sep 22 14:29:48 vienna pikt[1528]: [ID 1 INFO] [WARNING] in dorules(),
  MessagesScanCritical, no input data

Don't worry about the "no input data" warnings.  They may be perfectly normal and expected for your situation.  Look in the configs_samples directory to see how to filter out such alert messages.

If you see one or more error messages like

sh: /usr/bin/uptime: not found

this is a sign that one or more of the command paths defined in macros.cfg are in error.  If so, please fix them now.
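To check a handful of command paths in one go, something like the following sketch may help.  check_cmd_paths is a hypothetical helper (not part of PIKT), and you must supply the paths yourself, as copied from your macros.cfg.  For each path that is not executable, it reports where the same command is found on your current PATH, if anywhere.

```sh
# For each full path given, report whether it is an executable file.
# If it is not, suggest where the same command lives on the current PATH.
# (check_cmd_paths is a hypothetical helper, not part of PIKT.)
check_cmd_paths() {
    rc=0
    for p in "$@"; do
        if [ -f "$p" ] && [ -x "$p" ]; then
            printf 'ok       %s\n' "$p"
        else
            found=$(command -v "$(basename "$p")" 2>/dev/null) || found=
            printf 'MISSING  %s (found instead: %s)\n' "$p" "${found:-nowhere}"
            rc=1
        fi
    done
    return "$rc"
}

# Example (the paths are whatever your macros.cfg defines):
# check_cmd_paths /usr/bin/uptime /usr/bin/ps /usr/bin/mailx
```

The function returns a non-zero status if any path is missing, so it can also be used in a script.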

If you are working through this Tutorial from the beginning, you should still have the file /pikt/etc/piktd.conf.  If that file exists, view its contents now.

cat /pikt/etc/piktd.conf
* * * * * 0 /pikt/bin/pikt +M "/usr/bin/mailx -s 'PIKT Alert on mysystem:
                    Critical' root" +A Critical

Looks much like a normal Unix crontab file, doesn't it?

This file was installed by your earlier '/pikt/bin/piktc -erv +A Critical +H mysystem' command.

Now, instead, run

/pikt/bin/piktc -ev +A all +H mysystem

processing mysystem...
enabling alert(s)...
Debug enabled
Critical enabled

Inspect anew the piktd.conf file:
cat /pikt/etc/piktd.conf
55 * * * * 0 /pikt/bin/pikt +M "/usr/bin/mailx -s 'PIKT Alert on mysystem:
                     Debug' root" +A Debug
20 * * * * 0 /pikt/bin/pikt +M "/usr/bin/mailx -s 'PIKT Alert on mysystem:
                     Critical' root" +A Critical

You have overwritten the previous piktd.conf with the latest specifications in your alerts.cfg file.  There are now two lines, one for every alert listed in alerts.cfg, because you specified 'piktc -ev +A all', whereas earlier you specified '+A Critical' only.  (Why doesn't the Test alert appear? It's because that alert is wrapped within an '#ifdef test ... #endifdef'.  Since test is set to FALSE in defines.cfg, the preprocessor bypasses that alert.  More on this later.)
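If you want a compact view of what is scheduled, the following sketch pulls the five schedule fields and the alert name out of piktd.conf.  list_pikt_schedule is a hypothetical helper, not a PIKT command.  It assumes only what the listings above show: each entry begins with five cron-style fields, and the alert name follows '+A'.  It tolerates an entry that wraps onto a second line, as in the listings.

```sh
# Print "<schedule><TAB><alert name>" for each entry in a piktd.conf file.
# Assumes entries start with five cron-style fields and name the alert
# after +A, as in the listings above.
# (list_pikt_schedule is a hypothetical helper, not part of PIKT.)
list_pikt_schedule() {
    awk '
        # A line whose first field looks like a cron field starts an entry.
        $1 ~ /^[0-9*]/ && NF >= 5 { sched = $1 " " $2 " " $3 " " $4 " " $5 }
        {
            # The alert name is the field immediately after +A.
            for (i = 1; i < NF; i++)
                if ($i == "+A") { print sched "\t" $(i + 1); sched = "" }
        }
    ' "$1"
}

# Example:
# list_pikt_schedule /pikt/etc/piktd.conf
```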

So, 'piktc -e' enables alerts by adding or updating their piktd.conf entries.  ('piktc -d' disables alerts by removing their entries entirely.)  Whenever you want to change your alert schedules, edit alerts.cfg, then re-enable the affected alerts using 'piktc -e'.
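The edit-then-re-enable cycle can be wrapped in a small shell function so that you never enable alerts from a configuration that fails its check.  This is a sketch only.  reenable_alerts and the PIKTC override variable are hypothetical names, and the function assumes that 'piktc -cv' exits with a non-zero status when the check fails; verify that on your installation before relying on it.

```sh
# Check the configuration, then re-enable alerts only if the check passed.
# Assumption: piktc exits non-zero when the -c check fails.
# (reenable_alerts and PIKTC are hypothetical names, not part of PIKT.)
reenable_alerts() {
    host=$1
    alerts=${2:-all}
    piktc=${PIKTC:-/pikt/bin/piktc}
    "$piktc" -cv +H "$host" || return 1
    "$piktc" -ev +A "$alerts" +H "$host"
}

# Example:
# reenable_alerts mysystem Critical
```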

Verify that no piktd is running:

ps -ef | grep pikt | grep -v grep
    root  9771     1  0   Sep 22 ?        0:00 /pikt/bin/piktc_svc

Then (re)start the piktd with:
/pikt/bin/piktc -rv +H mysystem

processing mysystem...
(re)starting daemon (piktd)...
daemon (re)started

Verify that piktd is now running:
ps -ef | grep pikt | grep -v grep
    root  9771     1  0   Sep 22 ?        0:00 /pikt/bin/piktc_svc
    root 10521     1  0 18:46:35 ?        0:00 /pikt/bin/piktd
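If you want to make this check from a script rather than by eye, one approach is to test the ps listing for an exact match on the program path.  The sketch below is an assumption-laden illustration: has_process is a hypothetical helper, not a PIKT command.  It reads a ps listing on standard input and succeeds only if some field equals the given path exactly, so /pikt/bin/piktc_svc is never mistaken for /pikt/bin/piktd.

```sh
# Succeed (exit status 0) if any field of the ps listing on standard input
# is exactly the given program path; fail (exit status 1) otherwise.
# (has_process is a hypothetical helper, not part of PIKT.)
has_process() {
    awk -v prog="$1" '
        { for (i = 1; i <= NF; i++) if ($i == prog) found = 1 }
        END { exit !found }
    '
}

# Example:
# if ps -ef | has_process /pikt/bin/piktd; then
#     echo "piktd is running"
# else
#     echo "piktd is NOT running"
# fi
```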

Your half dozen or so Critical alarm scripts are now on duty, on the lookout for the indicated signs of trouble.  You can check their status with
/pikt/bin/piktc -sv +A all +H mysystem

processing mysystem...
showing alert stata...
Critical active
Debug active

If the process count were to spike, you (actually, the root account, or whatever you have defined as =piktadmin in macros.cfg) would receive email like
                                PIKT ALERT
                         Wed Sep 26 17:20:36 2001
                                 mysystem

CRITICAL:
    ProcCountTotalCritical
        Report perilously high overall system process count

        The process count is 66

(Actually, so that it runs as is on any system, ProcCountTotalCritical simply pipes the output of a bare 'ps' to 'wc -l'.  A bare 'ps' does not count every process on a system.  You would normally use something like 'ps -aux' or 'ps -ef' instead.  That's an exercise left to you, the reader.)
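As a starting point for that exercise, here is one way to count processes from a full ps listing at the shell prompt.  It is a sketch, not the ProcCountTotalCritical script itself: count_ps_rows is a hypothetical helper, and it assumes that the first line of the listing is a header (as it is for 'ps -ef'), which it skips.

```sh
# Count the process rows in a ps listing read from standard input,
# skipping the one-line header that ps prints first.
# (count_ps_rows is a hypothetical helper, not part of PIKT.)
count_ps_rows() {
    awk 'NR > 1 { n++ } END { print n + 0 }'
}

# Example: count every process on the system.
# ps -ef | count_ps_rows
```

Note that piping 'ps -ef' straight to 'wc -l' would count the header line too.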

After a time, your log files should show signs of life:

ls -l /pikt/var/log
total 62
-rw-------   1 root     other        7308 Sep 26 17:59 Critical.log
-rw-------   1 root     other        1289 Sep 26 17:55 Debug.log
-rw-------   1 root     other         924 Sep 26 16:20 MessagesScan.log
-rw-------   1 root     other         516 Sep 26 17:46 piktc.log
-rw-------   1 root     other        1684 Sep 26 17:46 piktc_svc.log
-rw-------   1 root     other         902 Sep 26 17:55 piktd.log

Inspect these if you wish.  With PIKT, many different things are logged.  Get in the habit of referring to the log files if--no, when!--something goes wrong.
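For a quick look at all of the logs at once, a loop like the following may be handy.  It is a minimal sketch: tail_logs is a hypothetical helper (not a PIKT command), and the default directory is simply the /pikt/var/log directory listed above.

```sh
# Show the last few lines of every .log file in a directory.
# Usage: tail_logs [directory] [number-of-lines]
# (tail_logs is a hypothetical helper, not part of PIKT.)
tail_logs() {
    dir=${1:-/pikt/var/log}
    lines=${2:-5}
    for f in "$dir"/*.log; do
        [ -f "$f" ] || continue      # skip if nothing matched the pattern
        printf '==> %s <==\n' "$f"
        tail -n "$lines" "$f"
    done
}

# Example: last 10 lines of each PIKT log.
# tail_logs /pikt/var/log 10
```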

Page best viewed at 1024x768 or greater.   Page last updated 2007-08-06.   This site is PIKT® powered.
PIKT® is a registered trademark of the University of Chicago.   Copyright © 1998-2007 Robert Osterlund. All rights reserved.