NetSaint Service (NSServ) for Windows NT4/2000

Programmed by Jan Christian Kaldestad (ai97jck@hib.no) and Hallstein Lohne (hallstein@yahoo.com)


History (feel free to release new versions of this software):

12 June 2000 - Release 2 (by Jan Christian Kaldestad and Hallstein Lohne):
- Bug fix: The eventlog page lacked a proper HTML META expiry tag, so the browser loaded the old HTML file from its cache and showed the admin an incorrect, out-of-date eventlog.
- Bug fix: We forgot to remove some testing couts in check_ntservice.
- Added the NT sources we forgot to include: eap and nsservcfg.
- Wrote some makefiles for Linux.
- Minor changes to this HTML documentation.

10 May 2000 - First release (by Jan Christian Kaldestad and Hallstein Lohne).


First of all, these programs might have BUGS. Lots of them. We are NOT supporting this application in any way, as we have finished working on this project. Bugs can be reported to the NetSaint users mailing list IF some other programmer continues fixing bugs. If the service crashes, just restart it. It's not easy to program a stable Win32 application. Do what you want with this code, as long as you follow the GNU General Public License. If you can read Norwegian and would like to receive our full report in Word format, where we explain this project in about 50 pages, write an email to Hallstein.

This release requires PDH.DLL (the Windows Performance Data Helper DLL) from a Microsoft Windows 2000 Professional installation. It is needed for monitoring CPU usage. If you do not find it in this release, you can get it from a Windows 2000 installation (DLL version 5.0.2174.1).

There might be errors in this documentation, as well as spelling errors ;-)

Finally, be sure to visit www.netsaint.org for the latest release of this software.


Some of the code in this release will probably be included in Ethan Galstad's NTStat Service, as that code is very similar.

The NetSaint environment

For all of you who are new to NetSaint, here's a picture which describes the NetSaint environment:

(You may use this .gif as long as you give credit.) As you can see, NetSaint only provides an engine, and the plug-ins do the actual checking work. Read more at the NetSaint homepage.


NT NetSaint Service Features:

Index
Monitor CPU usage - UDP
Eventlog checking - TCP
Service checking - TCP

All plug-ins read the password from the file /netsaint-path/etc/ntpasswd


  • Monitor CPU usage

    check_ntload <IPaddress> <port> <interval1> <wload1> <cload1>
    … <interval2> <wload2> <cload2>


    interval1, interval2 - period in minutes over which the average CPU load is calculated. Max value is 15 minutes.
    wload1, wload2 - CPU load limit (percent) that results in a warning.
    cload1, cload2 - CPU load limit (percent) that results in a critical state.

    Note: This uses the UDP protocol.

    Example: check_ntload 192.168.0.2 1050 1 50 80 10 50 80
    Host     Service  Status    Last Updated                  Attempt  Service Information
    Dilbert  CPUload  CRITICAL  Mon Feb 21 18:43:59 IST 2000  1/3      CPU Load HIGH (1m:86%)(10m: 46%) Process CPUSTRESS using 74%
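
As a rough illustration of how the limits might be applied, here is a minimal C++ sketch. It assumes that a load at or above a limit triggers that state and that the overall result is the worse of the two intervals. The names State, classify_load and worst are hypothetical and are not taken from the check_ntload source.

```cpp
#include <cassert>

// Hypothetical state type; the real plug-in may represent states differently.
enum class State { OK = 0, WARNING = 1, CRITICAL = 2 };

// Classify one averaged CPU load (percent) against its warning and
// critical limits. Assumption: "at or above the limit" triggers the state.
State classify_load(int load, int wload, int cload) {
    assert(wload <= cload);  // a warning limit above the critical limit makes no sense
    if (load >= cload) return State::CRITICAL;
    if (load >= wload) return State::WARNING;
    return State::OK;
}

// Combine the results for interval1 and interval2: the worse one wins.
State worst(State a, State b) {
    return static_cast<int>(a) > static_cast<int>(b) ? a : b;
}
```

With the example above, classify_load(86, 50, 80) gives CRITICAL for the 1-minute average and classify_load(46, 50, 80) gives OK for the 10-minute average, so the combined result is CRITICAL.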

 

  • Monitor Eventlog

    We chose to use a separate configuration file in which you set your parameters. The file is named /netsaint-path/etc/eventlog.conf

    Example:
    url=http://netsaintsrv/netsaint/eventlog.html
    logsize=512
    [DILBERT]
    ip=192.168.0.1
    port=1050
    period=10
    applicationlog=1
    systemlog=1
    securitylog=0
    auditFailure=0
    auditSuccess=0
    error=10/20
    warning=10/20
    information=0

    First line: the URL of the generated HTML file. The file itself lives in /usr/local/netsaint/share, so it can be reached at http://netsaintsrv/netsaint/eventlog.html.

    Logsize - maximum file size (kByte)

    Ip/port - address and port of the NT machine running NSServ (remember to open the firewall port or whatever ;-)  )

    Period - how far back (in minutes) we want to check the eventlog

    Then you choose which log(s) you want to monitor. 1=TRUE, 0=FALSE

    Then you choose how many events an application can generate before NetSaint is given a warning or critical state. The warning limit comes first, then a slash ('/'), then the critical limit. If 0 is given, that event type is not monitored. You can choose from these types: auditFailure, auditSuccess, error, warning, information.
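
To make the warning/critical syntax concrete, here is a minimal C++ sketch of how a value such as 10/20 could be split. It is an illustration only: the Limits type and the parse_limits and monitored functions are hypothetical names, and treating a bare number (such as 0) as both limits is an assumption, not something taken from the check_nteventlog source.

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for one event type's limits.
struct Limits {
    int warning;
    int critical;
};

// Parse the text after '=' in lines such as "error=10/20" or "information=0".
// Assumption: a value without a slash is used for both limits.
// std::stoi throws if a part is not a number.
Limits parse_limits(const std::string& value) {
    Limits out{0, 0};
    std::size_t slash = value.find('/');
    if (slash == std::string::npos) {
        out.warning = out.critical = std::stoi(value);
    } else {
        out.warning = std::stoi(value.substr(0, slash));
        out.critical = std::stoi(value.substr(slash + 1));
    }
    return out;
}

// An event type is monitored only if at least one limit is nonzero.
bool monitored(const Limits& limits) {
    return limits.warning != 0 || limits.critical != 0;
}
```

For the example file above, parse_limits("10/20") yields a warning limit of 10 and a critical limit of 20, while parse_limits("0") yields limits for which monitored() is false.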

    check_nteventlog DILBERT

    The plug-in opens eventlog.conf and looks up the machine name.
    Result:
    Host     Service   Status    Last Updated                  Attempt  Service Information
    Dilbert  Eventlog  CRITICAL  Mon Feb 21 18:43:59 IST 2000  1/3      More info here.

    When you click on the 'here' link, a page will pop up with this kind of info:

    Eventlog messages

    ******** 09.05.2000 15:35 **********
    Computer:	DILBERT
    Period:		600 minutes
    Status:		error
    Source:		BROWSER
    Event id:	8015
    Event type:	information
    Message count:	2
    Message:	The browser has forced an election on network
    		\Device\Nof_RTL80291 because a Windows NT Server
    		(or domain master) browser is started.
    Source:		Print
    Event id:	13
    Event type:	information
    Message count:	6
    Message:	Document 7, \\CATBERT\NETSAINT\C-PROGS\EVEN owned by
    		Administrator was deleted on OKIPAGE 4w plus.
    Source:		Print
    Event id:	10
    Event type:	information
    Message:	Document 10, Microsoft Word - oving4.doc owned by
    		Administrator was printed on OKIPAGE 4w plus via
    		port LPT1:.  Size in bytes: 244615; pages printed: 1
    Source:		Application Popup
    Event id:	26
    Event type:	information
    Message count:	2
    Message:	Application popup: System Process - Out of Virtual Memory
    		 : Your system is running low on virtual memory. Please
    		close some applications. You can then start the System
    		option in the Control Panel and choose the Virtual Memory
    		button to create an additional paging file or to increase
    		the size of your current paging file.
    ... (et cetera) ...
    

 

  • Monitor services

    check_ntservice <IP-address> <port> <showall/showfail> <ServiceName1> [servicename2]
           ...      ["Service Name3"] [Servicename4] ...

    Showall - displays all services, even if they are in running state (OK)
    Showfail - displays only failed services, which are not in running state.
    ServiceName1 etc. - specify the service names as they are shown under Services on your NT computer (be sure to type the names correctly). You can probably monitor a lot of services, but remember that NetSaint will only display about 120 characters, so choose what to monitor with care. It is probably not interesting to monitor "NetSaint Service" (our application), because if it is not running, you will get no answer at all.

    Return codes:
    One or all services...        NetSaint gets this state
    ...are running                OK
    ...do not exist               WARNING
    ...are paused                 WARNING
    ...are not started            WARNING
    ...took too long to respond   CRITICAL
    ...returned an error          CRITICAL

    Example: check_ntservice 192.168.1.51 1050 showall "NetSaint Service" Diskeeper
    Host     Service       Status   Last Updated                  Attempt  Service Information
    Dilbert  CheckService  WARNING  Mon Feb 21 18:43:59 IST 2000  1/3      OK:NetSaint Service NotExist: Diskeeper


NT NSServ User Manual

We wrote a small program called nsservcfg which provides a GUI for installing the NT service. This program edits the nsserv.ini file, which lives in your Windows directory (%SystemRoot%). It can also remove the service. Alternatively, you can install the service using nsserv -install. Format of nsserv.ini:

key=secret
port=6000
dir=E:\NSserv

The password is not encrypted.
key - password
port - port for using the NSServ
dir - the directory where the NSServ application resides. We have not tested very long paths, so it might not work if you use C:\PROGRAM FILES\NSSERV
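
For readers who want to read this file from their own tools, here is a minimal C++ sketch of parsing the three name=value lines. It is not the parser used by nsserv.exe or nsservcfg. The function name parse_ini is hypothetical, and skipping lines without '=' is an assumption.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Parse "name=value" lines (as in nsserv.ini) into a map.
// Assumption: lines without '=' are ignored.
std::map<std::string, std::string> parse_ini(const std::string& text) {
    std::map<std::string, std::string> settings;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        // Files written on Windows end lines with CR LF; drop the CR.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;
        settings[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return settings;
}
```

Fed the example file above, the map holds "secret" under key, "6000" under port and "E:\NSserv" under dir. The port is returned as text and still has to be converted to a number by the caller.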

Be sure that the dir includes all these files:

nsserv.exe - the service application
eap.exe - program for counting processes on the system
pdh.dll - cpuload uses this DLL.
(nsservcfg.exe - not necessary, but a nice GUI)

Files included in this release

NT Stuff folder:
eap.exe - used by NSServ
"Generate Some Events.exe" - used for testing. It simply generates a lot of events.
nsserv.exe - the NetSaint Service application
NSServcfg.exe - the GUI
pdh.dll - used by NSServ
source-folder  - the source for compiling NSServ in Microsoft Visual Studio 6.0

Linux folder:
netsaint-folder  - contains compiled files for i486. Compiled using Red Hat 6.1/6.2

Linux Stuff\netsaint\etc - contains example configs for eventlog and sample ntpassword. Additionally, we have included a myhosts.cfg-file for showing how we configured eventlog.

source-folder   - contains the source for the files.

Linux: Compile using:
 g++ -c source-name.cc util.cc
 g++ -o program_name source-name.o util.o

Why both UDP and TCP?

As the project grew, we went from UDP to TCP. The eventlog delivers too much data for UDP. From then on we used TCP in both the service-check and eventlog plug-ins, while the CPU load check still uses UDP. Be sure to open your ports for both TCP and UDP if you would like to use this package.

Overview: NetSaint Service for NT

This one is for all you programmers ;-)

Messages sent over the network
We send a string of this type:
id=n;key=XXX;parameter1;parameter2;;

Note that we finish off with ';;'
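
The framing described above can be sketched in C++ as follows. This is an illustration under assumptions, not the code used by NSServ or the plug-ins: the function names build_message and split_message are hypothetical, the id value 1 in the usage note is made up, and the sketch assumes that no field contains a ';' character.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Join the fields with ';' and finish off with ";;".
std::string build_message(const std::vector<std::string>& fields) {
    std::string msg;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        // Assumption of this sketch: fields never contain the separator.
        assert(fields[i].find(';') == std::string::npos);
        if (i > 0) msg += ';';
        msg += fields[i];
    }
    return msg + ";;";
}

// Split a message back into its fields.
// Returns an empty vector if the ";;" terminator is missing.
std::vector<std::string> split_message(const std::string& msg) {
    std::vector<std::string> fields;
    if (msg.size() < 2 || msg.compare(msg.size() - 2, 2, ";;") != 0) {
        return fields;
    }
    std::string body = msg.substr(0, msg.size() - 2);
    std::size_t start = 0;
    for (;;) {
        std::size_t pos = body.find(';', start);
        if (pos == std::string::npos) {
            fields.push_back(body.substr(start));
            break;
        }
        fields.push_back(body.substr(start, pos - start));
        start = pos + 1;
    }
    return fields;
}
```

For example, build_message({"id=1", "key=secret", "i1=1", "i2=10"}) produces id=1;key=secret;i1=1;i2=10;; and split_message turns that string back into the same four fields.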

CPU-load

SEND STRING: id=X;key=X;i1=X;i2=X
id - message type
key - password
i1 - interval in minutes over which we want the average calculated (max 15 min)
i2 - interval in minutes...

RECEIVE STRING: id=X;l1=X;l2=X;pname=X;pload=X

id - message type
l1 - average CPU load which corresponds to period i1
l2 - average CPU load which corresponds to period i2
pname - name of the process which currently uses the most CPU
pload - CPU load for this process

This part runs as a thread. Every second it gets the CPU values and the process that uses the most CPU. Note: PDH.DLL gets the processes for us. The problem was that processes started after NSServ was opened were not displayed, because we did not manage to get PDH.DLL to rescan the process list. We had to write eap.exe to fix this.

So, we start eap.exe (which enumerates all processes) and open a named pipe between NSServ and eap.
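
One way to turn per-second samples into the averages l1 and l2 is a fixed-size ring buffer covering the maximum interval of 15 minutes. The C++ sketch below shows that idea only. It is not the NSServ implementation, the names LoadHistory and kMaxSamples are hypothetical, and averaging over whatever history exists when less than the requested interval has been collected is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 15 minutes of history at one sample per second.
const std::size_t kMaxSamples = 15 * 60;

class LoadHistory {
public:
    LoadHistory() : samples_(kMaxSamples, 0), next_(0), count_(0) {}

    // Record one per-second CPU load sample (percent),
    // overwriting the oldest sample once the buffer is full.
    void add(int percent) {
        samples_[next_] = percent;
        next_ = (next_ + 1) % kMaxSamples;
        if (count_ < kMaxSamples) ++count_;
    }

    // Average over the most recent `minutes` minutes. If less history
    // is available, average over what has been collected so far.
    int average(int minutes) const {
        assert(minutes >= 1 && minutes <= 15);
        std::size_t wanted = static_cast<std::size_t>(minutes) * 60;
        std::size_t n = wanted < count_ ? wanted : count_;
        if (n == 0) return 0;
        long sum = 0;
        for (std::size_t i = 1; i <= n; ++i) {
            // Walk backwards from the newest sample.
            sum += samples_[(next_ + kMaxSamples - i) % kMaxSamples];
        }
        return static_cast<int>(sum / static_cast<long>(n));
    }

private:
    std::vector<int> samples_;
    std::size_t next_;   // index where the next sample is written
    std::size_t count_;  // number of valid samples, at most kMaxSamples
};
```

A sampling thread would call add() once per second, and a request with i1 and i2 would be answered with average(i1) and average(i2).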

Event-log checking

SEND STRING: id=X;key=X;per=X;app=X;sys=X;sec=X;wAudF=X;eAudF=X; wAudS=X;eAudS=X;wErr=X;eErr=X;wWarn=X;eWarn=X;wInf=X;eInf=X

id - message type
key - password
per - period in minutes over which we want to check the eventlog
app - Log applications log (0=false, 1=true)
sys - Systemlog...
sec - Securitylog...
wAudF - 'Failure audit' events limit for an application. Results in WARNING (if nonzero, i.e. the value is not '0').
eAudF - 'Failure audit' events limit. Results in CRITICAL (if nonzero).
wAudS - 'Success audit' events limit. Results in WARNING (if nonzero).
eAudS - 'Success audit' events limit. Results in CRITICAL (if nonzero).
wErr - 'Error' events limit. Results in WARNING (if nonzero).
eErr - 'Error' events limit. Results in CRITICAL (if nonzero).
wWarn - 'Warning' events limit. Results in WARNING (if nonzero).
eWarn - 'Warning' events limit. Results in CRITICAL (if nonzero).
wInf - 'Information' events limit. Results in WARNING (if nonzero).
eInf - 'Information' events limit. Results in CRITICAL (if nonzero).

This was probably the hardest part to code, as we had to get the length of each event string and use dynamic memory allocation a lot. We have tried to find all memory leaks, but there may still be some left.

Event type      Description
Information     Operation succeeded.
Warning         Problems which normally are not important.
Error           Problems which you should take care of.
Success audit   Audited access attempts which succeeded.
Failure audit   Audited access attempts which failed.

RECEIVE STRING: nbytes=X;cname=X;stat=X;src=X;id=X;type=X;cnt=X;msg=X;;

If status is OK, we send: Status=OK.

nbytes - how many bytes the message is.
cname - Computer name
stat - Status for Event-log (based on limits from client)
src - Name of source application that has broken a limit
id - id for the event
type - Event type
cnt - Number of events of this Event ID for this application.
msg - Text which describes this event

Services check

SEND STRING: id=X;key=X;service1;service2;service name3; ...  ;serviceN;;

id - message type
key - password
service - service name ...

RECEIVE STRING: service1=status;service2=status; ... ;serviceN=status;;

service - name of the service which is checked.
status - status code (integer) which describes service status.

We have defined 6 status codes:
1 - Service does not exist.
2 - Service is stopped.
3 - Service did not respond to our request. (It "hangs".)
4 - Error while checking service. (Internal error, e.g. we could not connect to the SCM server.)
5 - Service is paused.
6 - Service is started (and running) -> OK.
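
Combining these status codes with the return-code table from the check_ntservice description, a client could derive the NetSaint state roughly as in the C++ sketch below. It is an illustration, not the check_ntservice source. The names State, state_for_status and worst_state are hypothetical, and treating unknown codes as CRITICAL is an assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical state type, ordered from best to worst.
enum class State { OK = 0, WARNING = 1, CRITICAL = 2 };

// Map one status code to a NetSaint state.
State state_for_status(int status) {
    switch (status) {
        case 6:  // started and running
            return State::OK;
        case 1:  // does not exist
        case 2:  // stopped
        case 5:  // paused
            return State::WARNING;
        case 3:  // did not respond
        case 4:  // error while checking
            return State::CRITICAL;
        default:  // assumption: anything unknown counts as an error
            return State::CRITICAL;
    }
}

// Walk a reply such as "service1=6;service2=1;;" and return the worst state.
State worst_state(const std::string& reply) {
    State worst = State::OK;
    std::size_t start = 0;
    while (start < reply.size()) {
        std::size_t end = reply.find(';', start);
        if (end == std::string::npos) end = reply.size();
        std::string field = reply.substr(start, end - start);
        // Use the last '=' so that a '=' inside a service name does no harm.
        std::size_t eq = field.rfind('=');
        if (eq != std::string::npos) {
            State s = state_for_status(std::atoi(field.c_str() + eq + 1));
            if (static_cast<int>(s) > static_cast<int>(worst)) worst = s;
        }
        start = end + 1;
    }
    return worst;
}
```

For a reply of "NetSaint Service=6;Diskeeper=1;;" this gives WARNING, which agrees with the CheckService example shown earlier.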

We would like to say thanks to:

  • Haukeland Hospital, Norway: Kim Johnny Mathisen and Jan - Eirik Olsen for their ideas. Their effort made this project possible. Thank you very much for the computer equipment we borrowed for Hallstein's 486.
  • MSDN (Microsoft Developer Network CDs).
  • Lars Edgar Berg, teacher at Bergen College, for his effort in teaching us WIN32API and for his book, which makes a nice introduction to WIN32API programming.
  • Ethan Galstad for his excellent NetSaint and other cool plug-ins.
  • Microsoft, for pressing hardware prices down (because all offices have to buy new hardware almost every year).
  • And all you other guys on the Internet who continue to deliver bug reports and contribute to the GNU open source scene.

Remember - the rich are getting richer - the poor are getting poorer. Vote for politicians who think that everybody has the same rights and the same potential.

Signing off,
Hallstein Lohne and Jan Christian Kaldestad                Bergen, Norway - May 2000