Scanning a Log File
Another common PIKT task is to routinely scan log files and report items of concern or interest.
For example, SuLogScanEmergency is a Pikt script to scan the sulog for especially worrisome events, such as a suspicious or unexpected su-to-root.
SuLogScanEmergency

    init
        status active
        level emergency
        task "Scan the sulog for su-to-root by users other
              than sysadmins, or authorized system owners"
        input logfile "=sulog"
        dat $date 2
        dat $time 3
        dat $result 4
        dat $port 5
        dat $users 6

    begin
        // assume no crisis (yet)
        set #crisis = #false()

    rule    // find $user and $newuser
        set #i = #index($users, "-")
        set $user = $substr($users,1,#i-1)
        set $newuser = $substr($users,#i+1)

    rule    // su-to-root success
        if    $newuser eq "root"
#ifndef paranoid
           && $user !~ "^(root|=sysadmins|mahler)$"
#  if db
           && $user !~ "^(=dbadmins)$"
#  endif
           && $user !~ "^(=sysowner)$"
#endifdef  // paranoid
           && $result eq "+"
            set $msg = "SU-TO-ROOT SUCCESS: $inlin"
            output mail $msg
            output log "=sulogscan_log" $msg
            set #crisis = #true()
        endif

    end
#ifdef page
#  if misscritsys
        if #crisis
            =page($hostname() sulog event (see alert email),
                  =pagesysadmins, =allhours(#now()))
        endif
#  endif  // misscritsys
#endifdef  // page

For SuLogScanEmergency, our input is the sulog file (here specified as the =sulog macro, which might resolve to /var/adm/sulog). Because the input was specified as type 'logfile', each run of SuLogScanEmergency is presented with only the new entries. If there are no new log entries, the script effectively has nothing to do. (By contrast, if 'input file' were specified, the entire file would be examined on every run.)
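The difference between 'input logfile' and 'input file' can be pictured with a small Python sketch. This is only an illustration of the idea of presenting new entries on each run; how PIKT itself tracks its position in the log is not described here, and the function name `read_new_entries` and the byte-offset bookkeeping are assumptions made for the example.

```python
import os


def read_new_entries(path, offset):
    """Return (new_lines, new_offset) for data appended to path since offset.

    offset is the byte position reached on the previous run (0 on the first).
    This bookkeeping is hypothetical; it is not PIKT's actual mechanism.
    """
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size < offset:
            # The file shrank (for example, it was rotated): start over.
            offset = 0
        f.seek(offset)
        data = f.read()
    lines = data.decode("utf-8", errors="replace").splitlines()
    return lines, offset + len(data)
```

Calling the function with the offset returned by the previous call yields an empty list when nothing has been appended, which corresponds to the "nothing to do" case. Always passing an offset of 0 would correspond to 'input file', where the whole file is read every time.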
The sulog has six fields, only five of which (fields 2 through 6) are assigned to data variables as shown.
At the outset, we initialize a variable (#crisis), a common thing to do in a Pikt script begin section.
In the first rule (for the first new log entry and all subsequent log entries), we employ several built-in Pikt functions to determine the su before-and-after users.
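To make the field layout and the user split concrete, here is a rough Python model of what the dat lines and the first rule compute, applied to the sample sulog entry that appears in the alert message for this script. It is a sketch of the logic only, not of how PIKT evaluates #index() and $substr(); the function name `parse_sulog_line` and the dictionary keys are chosen for the example.

```python
def parse_sulog_line(line):
    """Split one sulog entry into the fields used by SuLogScanEmergency.

    Field 1 (the literal "SU") is skipped; fields 2 through 6 mirror the
    script's dat assignments. The dict layout is illustrative only.
    """
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("expected at least six fields: %r" % line)
    date, time, result, port, users = fields[1:6]
    # Like the script's #index($users, "-"), split at the first hyphen:
    # everything before it is the original user, everything after it
    # is the user being switched to.
    user, _, newuser = users.partition("-")
    return {
        "date": date,
        "time": time,
        "result": result,
        "port": port,
        "user": user,
        "newuser": newuser,
    }


if __name__ == "__main__":
    print(parse_sulog_line("SU 11/28 18:32 + pts/2 arthing-root"))
```

Note that splitting at the first hyphen, in this sketch as in the rule, is ambiguous for user names that themselves contain a hyphen.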
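In the next rule, for attempted su-to-root, we might additionally consider: Are we in paranoid mode, in which case every su-to-root is of interest, even one by a sysadmin or system owner? And is this a db system, in which case su-to-root by the database administrators is also to be ignored?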
Because these questions involve preprocessor directives, they are answered at the time of script installation. The SuLogScanEmergency script on the db systems would have the =dbadmins test; the other systems would not. Whether or not we consider sysadmin and owner su's would depend on whether we are in paranoid mode at the instant we install the script. The post-preprocessing script on the slave systems (the script as actually run) would have more or fewer lines depending on the system and on how we have specified our PIKT defines when we install the script.
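The combined effect of the paranoid and db directives on the su-to-root test can be summarized in a short Python sketch. One important difference: in PIKT these choices are fixed when the script is installed, whereas here they are ordinary function arguments evaluated at run time. The function name `is_emergency` and the `sysadmins`, `dbadmins`, and `sysowners` arguments (standing in for whatever the =sysadmins, =dbadmins, and =sysowner macros resolve to) are assumptions made for the example.

```python
def is_emergency(user, newuser, result, *, paranoid=False, db_system=False,
                 sysadmins=(), dbadmins=(), sysowners=()):
    """Model the su-to-root success rule of SuLogScanEmergency.

    paranoid and db_system stand in for the PIKT defines of the same
    names; the three user lists are placeholders for macro contents.
    """
    # Only successful su-to-root entries are candidates at all.
    if newuser != "root" or result != "+":
        return False
    # In paranoid mode the exemption tests are left out of the script,
    # so every successful su-to-root is reported.
    if paranoid:
        return True
    exempt = {"root", "mahler", *sysadmins, *sysowners}
    # The =dbadmins test exists only in the script installed on db systems.
    if db_system:
        exempt.update(dbadmins)
    return user not in exempt


if __name__ == "__main__":
    print(is_emergency("arthing", "root", "+"))
    print(is_emergency("mahler", "root", "+"))
    print(is_emergency("mahler", "root", "+", paranoid=True))
```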
Regardless of the situation on any particular machine, we come back to the basic question: for any su-to-root entry not ignored, was this a successful su? If so, send an alert e-mail and log the entry in a special script log file. Additionally, set the #crisis flag to true.
Now,
- if we have activated paging (by perhaps setting page to TRUE in the piktmaster defines.cfg, or maybe by specifying 'piktc -iv +D page ...' in our command to install the script), and
- if this is a so-called (in systems.cfg) "misscritsys" (mission-critical system),
add an end section to the script on that system; otherwise, leave the end section out.
For mission-critical systems with paging turned on, if #crisis is true (because of a successful, unauthorized su-to-root), page the system administrators at any time of the day or night. The =page() macro is defined (in macros.cfg) as:
page(M, R, H)   // send a page message (M) to recipients (pager phone
                // alias) (R) but only during hours (H)
                // sample use:  =page($host is sick/down,
                //                    =pagesysadmins,
                //                    =allhours(#now()))
    if (H)
        =execwait "echo '(M)' | =mailx -s '(M)' (R)"
    fi
=pagesysadmins is defined as a list of e-mail addresses for calling the sysadmins' pagers.
=allhours() is defined as
allhours(H)     // any time of the day or night
    #true()
A related macro, =offhours(), might be defined as
offhours(H)     // between 10 PM and 6 AM
    ((H) >= 22 || (H) < 6)
If we only wanted to page the sysadmins at times when they might actually respond to the page (i.e., they're not off duty, asleep, ...), we might use
    =page($hostname() sulog event (see alert email),
          =pagesysadmins, ! =offhours(#now()))
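The hour tests are easy to check in isolation. The following Python sketch mirrors the =allhours() and =offhours() conditions, assuming that the H argument is an hour of the day from 0 to 23, which is what the comparison against 22 and 6 in =offhours() suggests. The `should_page` helper is hypothetical and simply combines the #crisis flag with the hours condition, as the if (H) guard in =page() does.

```python
def allhours(hour):
    """Any time of the day or night."""
    return True


def offhours(hour):
    """True between 10 PM and 6 AM; hour is assumed to be 0..23."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in the range 0..23")
    # The window wraps past midnight, so the two tests are joined
    # with "or" rather than "and".
    return hour >= 22 or hour < 6


def should_page(crisis, hours_ok):
    """Page only when there is a crisis and the hours condition holds."""
    return bool(crisis and hours_ok)


if __name__ == "__main__":
    for hour in (3, 6, 12, 21, 22):
        print(hour, should_page(True, not offhours(hour)))
```

With the negated off-hours condition, a crisis at hour 3 or hour 22 produces no page, while one at hour 6, 12, or 21 does.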
Whew! That's a tangle of #if's, #ifdef's, macros, macros-within-macros, and ordinary script logic. You could easily get carried away with all of this and wind up with obscure, unreadable code. But if you are careful with your config file layout and use of macros, you have the power to precisely customize your script behavior to the system and situation. Keep it simple if you like, or make it as complex as conditions demand.
Here is a sample alert message for this script:
PIKT ALERT
Fri Nov 28 19:06:50 2003
kiev
EMERGENCY:
SuLogScanEmergency
Scan the sulog for su-to-root by users other than sysadmins,
or authorized system owners
SU-TO-ROOT SUCCESS: SU 11/28 18:32 + pts/2 arthing-root
You might also, if you wish, implement a companion script, say SuLogScanWarning, to report su-to-root and su-to-other failures. There are many possibilities in scanning the sulog. With PIKT and its tremendous preprocessing power, the possibilities in scanning all the many system logs in precise and sophisticated ways are almost endless.