Archive for the ‘nagios’ Category

Using Nagios NRPE To Monitor Windows User Accounts Via WMI…….

Wednesday, February 8th, 2012

This is my last post on the subject of using the Nagios NRPE plugin to monitor stuff on Windows. This time I want to monitor some Windows domain user accounts for lockout status.

We run some of our Windows services using ordinary domain user accounts instead of the built-in local service accounts. This is normally when the service in question needs to read or write to network shares on another server. The built-in service accounts don’t seem to pass any username/password along with the request, so the connection fails. When you run the process under a domain account, the name and password are passed along with the connection request, and as long as you have set your share and NTFS permissions correctly it should work.

We recently had an application stop working, and it turned out that the domain user account that the service ran under had become locked out.

So I decided it would be a good idea to get a heads up about this sort of thing sooner rather than later. The script needs to run on your AD domain controllers (x1 is probably fine as they all replicate user data, but more might allow detection a little quicker).

The code for the script is below.

' account name supplied as argument
strAccount = Wscript.Arguments.Item(0)

' bind to the MYDOMAIN domain
Set objComputer = GetObject("WinNT://MYDOMAIN")

' only enumerate user objects
objComputer.Filter = Array("User")
' for each user compare its name to the account we are looking for
For each objUser in objComputer
    If objUser.Name = strAccount then
        ' IsAccountLocked is non-zero when the account is locked out
        If objUser.IsAccountLocked <> 0 then
            Wscript.echo "Account is locked out"
            Wscript.Quit (2)
        Else
            Wscript.echo "Account is ok"
            Wscript.Quit (0)
        End if
    End if
Next

' no matching account found, return 3 = unknown rather than a silent ok
Wscript.echo "Account not found"
Wscript.Quit (3)

The script uses ADSI (the WinNT provider, rather than WMI proper) to check the IsAccountLocked value of a user object. The user object has quite a lot of properties that you can monitor, covering many of the fields and check boxes you see in the AD user dialogue box.

In this instance, I am only interested in the ‘Account is locked out’ check box. If it’s checked the value will be something other than 0 (1 in this case) and I want an alert, but if the value is still 0 then it’s not checked and the account is ok.
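The decision described here can be sketched in Python, one of the other languages NRPE can run. This is only an illustration of the logic, not the check itself: `lockout_status` and the `locked_flags` dictionary are hypothetical names standing in for the IsAccountLocked values read from the domain, and returning 3 (unknown) when the account name is not found is an assumption based on the Nagios plug-in return codes.

```python
def lockout_status(account, locked_flags):
    """Return (exit_code, message) for one account.

    locked_flags is a hypothetical dict of account name -> lock flag,
    standing in for the IsAccountLocked values read from the domain.
    """
    if account not in locked_flags:
        # Assumption: an unknown account name is reported as UNKNOWN (3).
        return 3, "Account not found"
    if locked_flags[account] != 0:
        # Any non-zero flag means the 'Account is locked out' box is ticked.
        return 2, "Account is locked out"
    return 0, "Account is ok"


# Made-up sample data for illustration only.
flags = {"AppUser": 0, "BackupUser": 1}
print(lockout_status("BackupUser", flags))
```

The first element of the result is what a real check would hand back as its exit code.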

As with the prior checks I wrote about, the .vbs script file needs to be dropped into the ‘libexec’ folder, and a line like the one below needs to be added to the nrpe.cfg config file on the Windows server.

command[check_windows_account]=cscript.exe //T:30 //NoLogo "C:\Program Files (x86)\NRPE_NT\libexec\check_windows_account.vbs" "$ARG1$"

On the Nagios server you need to add a command definition to the commands.cfg file.

# 'check_windows_account' command definition (using nrpe)
define command{
        command_name    check_windows_account
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 60 -p 5666 -c check_windows_account -a $ARG1$
}

And finally a service check has to be created in the services.cfg file and a server to be checked needs to be added (in this example I’m checking the account ‘AppUser’ on the server Windows_Server_1).

define service{
        service_description     Check Windows User Account
        servicegroups           cust-windows
        host_name               windows_server_1
        check_command           check_windows_account!"AppUser"
        use                     generic-service
}

Hopefully these scripts have given you an idea of what you can do with the NRPE plugin. As long as you can write a script to check a known value of something, you can get Nagios to use it as a monitor and fire an alert. And it doesn’t have to be VBScript: PowerShell, Perl and Python can all be used. You can monitor WMI objects or Windows Perfmon counters; the list is vast.

Enjoy ;oD

Using Nagios NRPE To Monitor Windows Processes Via WMI…….

Tuesday, February 7th, 2012

Hot on the heels (well ok, Oct last year to be precise) of Using Nagios NRPE To Monitor Windows Services Via WMI comes Using Nagios NRPE To Monitor Windows Processes Via WMI.

Naturally, after providing work with a Nagios check to tell us when a named Windows service was not in a running state, the next question from their mouths was going to be


“can Nagios tell us when a program that is not a service has stopped running ?”

Well yes it can, and it’s not difficult. Instead of looking at the list of services, we need to read the list of processes running (like the list you can see in Task Manager) and look to see if our one is included in the list.

[Image: Task Manager process list]

The code to do this is below.

strComputer = "."
'process name supplied as argument (e.g. outlook.exe)
strProcess = Wscript.Arguments.Item(0)
'connect using standard moniker
Set objWMIService = GetObject("winmgmts:" & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
'search for our process
Set colProcesses = objWMIService.ExecQuery ("Select * from Win32_Process Where Name = '" & strProcess & "'")
If colProcesses.Count > 0 Then
	'If the process is running return exit code 0 = ok
	Wscript.Echo "SERVICE STATUS: OK"
	Wscript.Quit(0)
Else
	'otherwise return non 0 = error = fire alert hopefully
	Wscript.Echo "SERVICE STATUS: Critical"
	Wscript.Quit(2)
End If

Again, the walk through of this is quite simple. We call the program with x1 parameter/argument, a string containing the name of the process we are looking for (NOTE: this needs to be the full name including the ‘.exe’ or ‘.com’ and if it contains space characters the whole thing should be enclosed in quote marks).

We then bind to WMI and run a query to find our process in the list of running processes. The last part simply counts how many results were returned. If the count is larger than 0, the process must be running and we return a status code 0 for success. If not, we return a status code of 2 and Nagios fires a critical alert (if you prefer a warning rather than a critical then adjust to return a 1).

Note: The string search is not case sensitive; ‘OUTLOOK.EXE’ and ‘outlook.exe’ are treated as the same regardless of how the process actually looks in the list.
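If you write the same check in a language where string comparison is case sensitive by default, you would have to handle this yourself. A minimal Python sketch of the idea follows; `is_running` and the sample process list are made up for illustration, and a real check would read the process list from the system rather than from a hard-coded list.

```python
def is_running(target, running_names):
    """True if target matches any process name, ignoring case."""
    wanted = target.casefold()
    return any(name.casefold() == wanted for name in running_names)


# Hypothetical snapshot of process names, as Task Manager might list them.
snapshot = ["System", "explorer.exe", "OUTLOOK.EXE"]
print(is_running("outlook.exe", snapshot))
print(is_running("outlook", snapshot))
```

The second call shows why the full name including the extension matters: the comparison ignores case but still matches the whole name, not a part of it.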

As with the previous check, the .vbs script file needs to be dropped into the ‘libexec’ folder, and a line like the one below needs to be added to the nrpe.cfg config file on the Windows server.

command[check_windows_process]=cscript.exe //T:30 //NoLogo "C:\Program Files (x86)\NRPE_NT\libexec\check_windows_process.vbs" "$ARG1$"

On the Nagios server you need to add a command definition to the commands.cfg file.

# 'check_windows_process' command definition (using nrpe)
define command{
        command_name    check_windows_process
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 60 -p 5666 -c check_windows_process -a $ARG1$
}

And finally a service check has to be created in the services.cfg file and a server to be checked needs to be added (in this example I’m checking for ‘outlook.exe’ on the server Windows_Server_1).

define service{
        service_description     Check Windows Process
        servicegroups           cust-windows
        host_name               windows_server_1
        check_command           check_windows_process!"outlook.exe"
        use                     generic-service
}

The current script simply alerts if a process is found to be missing from the task list, but it could be modified. You could have it fire a warning if the number of processes is more than 5 but less than 10, and a critical if the number drops below 5, for example.
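As a rough idea of how that threshold logic might look, here is a short Python sketch. The function name `process_count_status`, its parameters and the message text are all hypothetical, and it assumes a count of exactly 5 should be a warning, which the description above leaves open.

```python
def process_count_status(count, warn_below=10, crit_below=5):
    """Return (exit_code, message) for a running-process count.

    Thresholds are illustrative: fewer than crit_below is critical,
    fewer than warn_below is a warning, anything else is OK.
    """
    if count < crit_below:
        return 2, "PROCESS CRITICAL: %d running" % count
    if count < warn_below:
        return 1, "PROCESS WARNING: %d running" % count
    return 0, "PROCESS OK: %d running" % count


# Try a count in each band.
for n in (3, 7, 12):
    print(process_count_status(n))
```

In the VBScript version the same comparison would be made against colProcesses.Count before choosing which value to pass to Wscript.Quit.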

Using Nagios NRPE To Monitor Windows Services Via WMI Part 2…….

Friday, September 30th, 2011

Have realised my first attempt at using NRPE to monitor Windows services via WMI is in fact badly thought out and badly done. This is what happens when companies want everything yesterday and rush things :o(

Having thought about it, the following has come to mind:

The service string to check should not be hard coded into the script. Otherwise we would need x1 script per service to check (i.e. lots !). The service string should be a variable that we can pass to the script as an argument at run time.

And, we can only check one service at a time with this script. Therefore, placing the service name into an array is whaaaayyy overkill. Will simply replace the array with a single string variable.

This in mind, here’s the revised version of the check script

strComputer = "."
'service display name supplied as argument
strService = Wscript.Arguments.Item(0)
'connect using standard moniker
Set objWMIService = GetObject("winmgmts:" & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
'get a collection containing all services
Set objItems = objWMIService.ExecQuery ("Select * from Win32_Service")
'for each service compare its display name to the one we are looking for
For each objService in objItems
	'if we get a service display name match
	If objService.DisplayName = strService Then
		'display the current service along with its current state
		'wscript.echo "service name = " & objService.DisplayName & " currently :: " & objService.State
		If objService.State = "Running" Then
			'If the service is running return exit code 0 = ok
			Wscript.Echo "SERVICE STATUS: OK"
			Wscript.Quit(0)
		Else
			'otherwise return non 0 = error = fire alert hopefully
			Wscript.Echo "SERVICE STATUS: Critical"
			Wscript.Quit(2)
		End if
	End if
Next
'no service with that display name was found, return 3 = unknown
Wscript.Echo "SERVICE STATUS: Unknown"
Wscript.Quit(3)

And the command to add to the nrpe.cfg file will now need a parameter adding to the end like so (note the quote marks “” around the $ARG1$ parameter. This is in case our variable has spaces in it !!).

command[check_windows_service]=cscript.exe //T:30 //NoLogo "C:\Program Files (x86)\NRPE_NT\libexec\check_windows_service.vbs" "$ARG1$"

The commands.cfg file will need a command definition in it like this

# 'check_windows_service' command definition (using NRPE)
define command{
	command_name	check_windows_service
	command_line	$USER1$/check_nrpe -H $HOSTADDRESS$ -t 60 -p 5666 -c check_windows_service -a $ARG1$
}

And finally, in services.cfg, a service check section using the command, like this

define service{
        service_description     Check Windows Awesome Service
        servicegroups           cust-windows
        host_name               windows_server_1
        check_command           check_windows_service!"Some Windows Service"
        use                     generic-service
}

But we can now use the same script to check other services like this

define service{
        service_description     Check Windows Awesome Service
        servicegroups           cust-windows
        host_name               windows_server_1
        check_command           check_windows_service!"Some Windows Service"
        use                     generic-service
}

define service{
        service_description     Check Windows Spooler Service
        servicegroups           cust-windows
        host_name               windows_server_1
        check_command           check_windows_service!"Print Spooler"
        use                     generic-service
}

Second time’s a charm. At least I got to go back and correct my horrible (but technically working) mistake !

Next stop, monitoring for running processes by their executable name in the process list…….

doh !

Using Nagios NRPE To Monitor Windows Services Via WMI…….

Wednesday, September 28th, 2011

If you are setting up Nagios from scratch, install the NSClient++ agent on your Windows servers and get the increased flexibility that it offers. My predecessor at my current work place has only installed the NRPE addon (the same guy who installed the core datacentre router with a duplex mismatch….that made my first week fun), which means I can’t use much of the cool check_nt stuff to monitor services and processes :o(

I needed a way to tell if a service had stopped on a Windows server, but I could only use NRPE. First stop, a script to check the status of a given service.


strComputer = "."
'list services to monitor, comma separated, inside quotes
arrServices = Array("Awesome Service")
For each strService in arrServices
	'connect using standard moniker
	Set objWMIService = GetObject("winmgmts:" & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
	'get a collection containing all services
	Set objItems = objWMIService.ExecQuery ("Select * from Win32_Service")
	'for each service compare its display name to the current one we are looking for
	For each objService in objItems
		'if we get a service display name match
		If objService.DisplayName = strService Then
			'display the current service along with its current state
			'wscript.echo "service name = " & objService.DisplayName & " currently :: " & objService.State
			If objService.State = "Running" Then
				'If the service is running say so
				Wscript.Echo "SERVICE running"
			Else
				'otherwise it must not be running
				Wscript.Echo "SERVICE not running"
			End if
		End if
	Next
Next

This script binds to WMI, searches for a service called Awesome Service and then echoes a statement to say if it’s running or not. Perfect, but Nagios can’t use this quite yet. We need the script to send some data back to the NRPE engine for this to work.

The Nagios plug-in dev guide tells you most of what you need to know, in this case we need to pass return codes back, which is covered here.

So the finished version now looks like this


strComputer = "."
'list services to monitor, comma separated, inside quotes
arrServices = Array("Awesome Service")
For each strService in arrServices
	'connect using standard moniker
	Set objWMIService = GetObject("winmgmts:" & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
	'get a collection containing all services
	Set objItems = objWMIService.ExecQuery ("Select * from Win32_Service")
	'for each service compare its display name to the current one we are looking for
	For each objService in objItems
		'if we get a service display name match
		If objService.DisplayName = strService Then
			'display the current service along with its current state
			'wscript.echo "service name = " & objService.DisplayName & " currently :: " & objService.State
			If objService.State = "Running" Then
				'If the service is running return exit code 0 = ok
				Wscript.Echo "SERVICE STATUS: OK"
				Wscript.Quit(0)
			Else
				'otherwise return non 0 = error = fire alert hopefully
				Wscript.Echo "SERVICE STATUS: Critical"
				Wscript.Quit(2)
			End if
		End if
	Next
Next

So if the service is running, we exit with return code 0 Wscript.Quit(0). But if it’s not, we exit with a non 0 return code. I need an alert to fire an SMS, so I have used Wscript.Quit(2) for critical, but if you only want a warning you can use Wscript.Quit(1).
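The mapping from service state to return code can also be sketched in Python, one of the other languages NRPE can run. Treat this as a sketch under assumptions rather than part of the check: `STATES`, `plugin_result` and `not_running_code` are names invented for illustration, and the 3 = UNKNOWN code is taken from the Nagios plug-in guidelines rather than from the script.

```python
# Nagios plug-in return codes (0-3 per the plug-in development guidelines).
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def plugin_result(service_state, not_running_code=2):
    """Map a Windows service state string to (exit_code, status_line).

    not_running_code is hypothetical: 2 raises a critical alert,
    1 would raise a warning instead.
    """
    if service_state is None:
        # Assumption: a service that was not found is reported as UNKNOWN.
        code = 3
    elif service_state == "Running":
        code = 0
    else:
        code = not_running_code
    return code, "SERVICE STATUS: " + STATES[code]


print(plugin_result("Running"))
print(plugin_result("Stopped", not_running_code=1))
```

A real plug-in would print the status line and then exit with the code, which is what the Wscript.Echo and Wscript.Quit pair does in the VBScript.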

Save the file in the NRPE scripts location (mine are located at C:\Program Files\NRPE_NT\libexec\).

Final piece of the puzzle is to add the actual command to run the script to the NRPE config file. Mine is located at ‘C:\Program Files\NRPE_NT\bin\nrpe.cfg’, but yours may vary.

At the end of the file is a list of demo commands; we just need to add in


command[check_awesome_service]=cscript.exe //T:30 //NoLogo "C:\Program Files\NRPE_NT\libexec\check_awesome_service.vbs"

Now add a command definition to the Nagios commands.cfg


# 'check_awesome_service' command definition (using nrpe)
define command{
        command_name    check_awesome_service
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 60 -p 5666 -c check_awesome_service
        }

And finally, in my Nagios services.cfg file, a service definition that includes the command and the hosts to run this against


define service{
        host_name               windows_server_1
        service_description     Windows Awesome Service
        servicegroups           cust-windows
        check_command           check_awesome_service
        use                     generic-service
}

And that should be it. You need to restart Nagios to include the new commands and service definitions. Then test the monitor by stopping and then starting the service in question.

The next step would be to replace the service name in the .vbs script file with a variable. Then you can reuse the same script to monitor different services by passing the service name from Nagios to NRPE as a variable from the config file. :oD