Nagios

Monitoring Windows Machines


See Also: Quickstart Installation Guide, Monitoring Publicly Available Services

Introduction

This document describes how you can monitor "private" services and attributes of Windows machines, such as memory usage, CPU load, disk usage, service states, and running processes.

Publicly available services that are provided by Windows machines (HTTP, FTP, POP3, etc.) can be monitored easily by following the documentation on monitoring publicly available services.

Note: These instructions assume that you've installed Nagios according to the quickstart guide. The sample configuration entries below reference objects that are defined in the sample commands.cfg and localhost.cfg config files.

Installing the Windows Agent

Before you can begin monitoring private services and attributes of Windows machines, you'll need to install an agent on those machines. I recommend using the NSClient++ addon, which can be found at http://sourceforge.net/projects/nscplus. These instructions will take you through a basic installation of the NSClient++ addon, as well as the configuration of Nagios for monitoring the Windows machine.

1. Download the latest stable version of the NSClient++ addon from http://sourceforge.net/projects/nscplus

2. Unzip the NSClient++ files into a new C:\NSClient++ directory

3. Open a command prompt and change to the C:\NSClient++ directory

4. Register the NSClient++ system service with the following command:

	nsclient++ /install

5. Install the NSClient++ systray with the following command:

	nsclient++ SysTray

6. Open the services manager and make sure the NSClientpp service is allowed to interact with the desktop (see the 'Log On' tab of the services manager). If it isn't already allowed to do so, check the box to allow it.

7. Edit the NSC.INI file (located in the C:\NSClient++ directory) and uncomment the allowed_hosts option. Add the IP address of the Nagios server to this line, or leave it blank to allow all hosts to connect.

8. Start the NSClient++ service with the following command:

	nsclient++ /start

9. If installed properly, a new icon should appear in your system tray. It will be a yellow circle with a black 'M' inside.

10. Success! The Windows server can now be added to the Nagios monitoring configuration...

Nagios Host Configuration

You'll need to create some object definitions in your Nagios configuration files in order to monitor a new Windows machine. These definitions can be placed in their own file or added to an already existing object configuration file.

First, it's best practice to create a new template for each different type of host you'll be monitoring. Let's create a new template for Windows servers.

define host{
	name			windows-server	; The name of this host template
	use			generic-host	; Inherit default values from the generic-host template
	check_period		24x7		; By default, Windows servers are monitored round the clock
	check_interval		5		; Actively check the server every 5 minutes
	retry_interval		1		; Schedule host check retries at 1 minute intervals
	max_check_attempts		10		; Check each server 10 times (max)
	check_command		check-host-alive	; Default command to check if servers are "alive"
	notification_period	24x7		; Send notification out at any time - day or night
	notification_interval	30		; Resend notifications every 30 minutes
	notification_options	d,r		; Only send notifications for specific host states
	contact_groups		admins		; Notifications get sent to the admins by default
	register			0		; DON'T REGISTER THIS - IT'S JUST A TEMPLATE
	}

Notice that the Windows server template definition is inheriting default values from the generic-host template, which is defined in the sample localhost.cfg file.

Next, define a new host for the Windows machine that references the newly created windows-server host template.

define host{
	use		windows-server	; Inherit default values from a template
	host_name		winserver	; The name we're giving to this host
	alias		My Windows Server	; A longer name associated with the host
	address		192.168.1.2	; IP address of the host
	hostgroups	allhosts		; Host groups this server is associated with
	}

Add an optional hostgroup for Windows servers. This is useful if you create additional servers in the future and want to view them together in the CGIs. It can also be useful for object definition tricks that you can use to manage larger configurations later on.

define hostgroup{
	hostgroup_name	windows-servers		; The name of the hostgroup
	alias		Windows Servers	; Long name of the group
	members		winserver		; Comma separated list of hosts that belong to this group
	}

The winserver host will be a member of two hostgroups - allhosts (which is referenced in the host definition and defined in localhost.cfg) and windows-servers (which is defined above).
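
As a rough sketch of how this grows when you add another machine later, the definitions below add a second Windows host and list it in the hostgroup. The host name winserver2, its alias, and the address 192.168.1.3 are made up for illustration. The hostgroup definition shown here would replace the one above rather than being added alongside it.

define host{
	use		windows-server	; Inherit default values from the same template
	host_name		winserver2	; Hypothetical name for the second server
	alias		My Second Windows Server	; Hypothetical alias
	address		192.168.1.3	; Hypothetical IP address
	hostgroups	allhosts		; Same hostgroup as winserver
	}

define hostgroup{
	hostgroup_name	windows-servers		; The name of the hostgroup
	alias		Windows Servers	; Long name of the group
	members		winserver,winserver2	; Both Windows hosts now belong to this group
	}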

Monitoring Services

Now that the NSClient++ addon has been installed on the Windows machine and you've configured a host definition for the machine in Nagios, you can add some service definitions for things you want to monitor. All of the service examples I'll cover use the check_nt plugin to talk to the NSClient++ addon on the Windows machine. The check_nt plugin is included in the Nagios plugins distribution and a command definition for using the plugin has been defined in the commands.cfg file. It looks like this:

define command{
	command_name	check_nt
	command_line	$USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
	}
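
To make the macros in this command definition more concrete, here is an illustrative walk-through of how a check_command value is split up, written as configuration file comments. It uses the CPU load check and the winserver address from this document. $USER1$ is left unexpanded because its value comes from your resource file (it normally points at the plugin directory).

# Service definition line:
#	check_command	check_nt!CPULOAD!-l 5,80,90
#
# Nagios splits the value on "!" and fills in the macros:
#	$ARG1$		= CPULOAD
#	$ARG2$		= -l 5,80,90
#	$HOSTADDRESS$	= 192.168.1.2	(the address of winserver)
#
# Resulting command line:
#	$USER1$/check_nt -H 192.168.1.2 -p 12489 -v CPULOAD -l 5,80,90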

Now let's go over some example service definitions for monitoring different aspects of the Windows machine...

Monitoring NSClient++ Version

The following service definition will allow you to monitor the version of the NSClient++ addon that is running on the Windows server. This is useful when it comes time to upgrade your Windows servers to a newer version of the addon.

define service{
	use			generic-service
	host_name			winserver
	service_description	NSClient++ Version
	check_command		check_nt!CLIENTVERSION
	}
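
If you have several Windows servers, one possible variation is to attach the check to the windows-servers hostgroup defined earlier instead of naming each host. This is only a sketch. It assumes every member of the hostgroup runs NSClient++, and it would replace the per-host definition above rather than sit next to it, since two services on the same host cannot share a description.

define service{
	use			generic-service
	hostgroup_name		windows-servers	; Apply to every host in the group
	service_description	NSClient++ Version
	check_command		check_nt!CLIENTVERSION
	}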

Monitoring Uptime

The following service definition will allow you to monitor the uptime of the Windows server.

define service{
	use			generic-service
	host_name			winserver
	service_description	Uptime
	check_command		check_nt!UPTIME
	}

Monitoring CPU Load

The following service definition will monitor CPU utilization on the Windows server and generate a WARNING alert if the 5-minute CPU load is 80% or greater, or a CRITICAL alert if it is 90% or greater.

define service{
	use			generic-service
	host_name			winserver
	service_description	CPU Load
	check_command		check_nt!CPULOAD!-l 5,80,90
	}
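
The -l argument is a list of minutes, warning threshold, and critical threshold, and the check_nt help text describes passing more than one such triplet in a single request. As a hedged example, the definition below also checks the 15-minute average, with thresholds of 70% and 80% chosen purely for illustration. It uses a different service description so it can coexist with the definition above.

define service{
	use			generic-service
	host_name			winserver
	service_description	CPU Load (5 and 15 min)
	check_command		check_nt!CPULOAD!-l 5,80,90,15,70,80	; 5 min: warn 80, crit 90; 15 min: warn 70, crit 80
	}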

Monitoring Memory Usage

The following service definition will monitor memory usage on the Windows server and generate a WARNING alert if memory usage is 80% or greater, or a CRITICAL alert if it is 90% or greater.

define service{
	use			generic-service
	host_name			winserver
	service_description	Memory Usage
	check_command		check_nt!MEMUSE!-w 80 -c 90
	}

Monitoring Disk Usage

The following service definition will monitor usage of the C:\ drive on the Windows server and generate a WARNING alert if disk usage is 80% or greater, or a CRITICAL alert if it is 90% or greater.

define service{
	use			generic-service
	host_name			winserver
	service_description	C:\ Drive Space
	check_command		check_nt!USEDDISKSPACE!-l c -w 80 -c 90
	}
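
Other drives can be checked the same way by changing the drive letter passed with -l. The sketch below assumes the server also has a D:\ drive, which is a hypothetical, and reuses the same thresholds.

define service{
	use			generic-service
	host_name			winserver
	service_description	D:\ Drive Space
	check_command		check_nt!USEDDISKSPACE!-l d -w 80 -c 90	; Hypothetical second drive
	}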

Monitoring A Windows Service

The following service definition will monitor the W3SVC service state on the Windows machine and generate a CRITICAL alert if the service is stopped.

define service{
	use			generic-service
	host_name			winserver
	service_description	W3SVC
	check_command		check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
	}
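
According to the check_nt help text, SERVICESTATE accepts a comma-separated list of service names after -l, so several services can be covered by one check. The example below is a sketch that adds the Spooler service, which was picked only for illustration. The check should go CRITICAL if any listed service is not running, and -d SHOWALL includes the running services in the output as well.

define service{
	use			generic-service
	host_name			winserver
	service_description	Web and Print Services
	check_command		check_nt!SERVICESTATE!-d SHOWALL -l W3SVC,Spooler	; Spooler is an illustrative choice
	}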

Monitoring A Windows Process

The following service definition will monitor the Explorer.exe process on the Windows machine and generate a CRITICAL alert if the process is not running.

define service{
	use			generic-service
	host_name			winserver
	service_description	Explorer
	check_command		check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
	}