If you are setting up Nagios from scratch, install the NSClient++ agent on your Windows servers and get the increased flexibility that it offers. My predecessor at my current workplace only installed the NRPE addon (the same guy who installed the core datacentre router with a duplex mismatch... that made my first week fun), which means I can’t use much of the cool check_nt stuff to monitor services and processes :o(
I needed a way to tell if a service had stopped on a Windows server, but I could only use NRPE. First stop, a script to check the status of a given service.
strComputer = "."
'list services to monitor, comma separated, inside quotes
arrServices = Array("Awesome Service")
For Each strService In arrServices
    'connect using standard moniker
    Set objWMIService = GetObject("winmgmts:" & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
    'get a collection containing all services
    Set objItems = objWMIService.ExecQuery("Select * from Win32_Service")
    'for each service compare its display name to the current one we are looking for
    For Each objService In objItems
        'if we get a service display name match
        If objService.DisplayName = strService Then
            'display the current service along with its current state
            'wscript.echo "service name = " & objService.DisplayName & " currently :: " & objService.State
            If objService.State = "Running" Then
                'if the service is running say so
                Wscript.Echo "SERVICE running"
            Else
                'otherwise it must not be running
                Wscript.Echo "SERVICE not running"
            End If
        End If
    Next
Next
This script binds to WMI, searches for a service called Awesome Service and then echoes a statement to say if it’s running or not. Perfect, but Nagios can’t use this quite yet. We need the script to send some data back to the NRPE engine for this to work.
The Nagios plug-in dev guide tells you most of what you need to know. In this case we need to pass return codes back, which the guide covers: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN.
So the finished version now looks like this:
strComputer = "."
'list services to monitor, comma separated, inside quotes
arrServices = Array("Awesome Service")
For Each strService In arrServices
    'connect using standard moniker
    Set objWMIService = GetObject("winmgmts:" & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
    'get a collection containing all services
    Set objItems = objWMIService.ExecQuery("Select * from Win32_Service")
    'for each service compare its display name to the current one we are looking for
    For Each objService In objItems
        'if we get a service display name match
        If objService.DisplayName = strService Then
            'display the current service along with its current state
            'wscript.echo "service name = " & objService.DisplayName & " currently :: " & objService.State
            If objService.State = "Running" Then
                'if the service is running return exit code 0 = ok
                Wscript.Echo "SERVICE STATUS: OK"
                Wscript.Quit(0)
            Else
                'otherwise return non 0 = error = fire alert hopefully
                Wscript.Echo "SERVICE STATUS: Critical"
                Wscript.Quit(2)
            End If
        End If
    Next
Next
'no service matched the display name, so return exit code 3 = unknown
'(without this the script would end quietly with exit code 0)
Wscript.Echo "SERVICE STATUS: Unknown"
Wscript.Quit(3)
So if the service is running, we exit with return code 0 using Wscript.Quit(0). But if it’s not, we exit with a non-zero return code. I need an alert to fire an SMS, so I have used Wscript.Quit(2) for critical, but if you only want a warning you can use Wscript.Quit(1).
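If you want something in between, here is a rough sketch of how the service state could be mapped to a return code in one place. StateToExitCode is a name I made up for illustration, and treating "Start Pending" and "Continue Pending" as warnings is just one possible choice, not something Nagios requires.

```vbscript
' Sketch only: map a Win32_Service State string to a Nagios return code.
' StateToExitCode is a hypothetical helper, not part of the script above.
Function StateToExitCode(strState)
    Select Case strState
        Case "Running"
            StateToExitCode = 0   ' OK
        Case "Start Pending", "Continue Pending"
            StateToExitCode = 1   ' WARNING: service is on its way up
        Case Else
            StateToExitCode = 2   ' CRITICAL: stopped, paused, stopping, etc.
    End Select
End Function

' Quick check of the mapping
WScript.Echo StateToExitCode("Running")
WScript.Echo StateToExitCode("Start Pending")
WScript.Echo StateToExitCode("Stopped")
```

Inside the main loop you could then call Wscript.Quit(StateToExitCode(objService.State)) instead of hard-coding 0 and 2.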
Save the file in the NRPE scripts location (mine are located at C:\Program Files\NRPE_NT\libexec\).
The final piece of the puzzle is to add the actual command to run the script to the NRPE config file. Mine is located at ‘C:\Program Files\NRPE_NT\bin\nrpe.cfg’, but yours may vary.
At the end of the file is a list of demo commands. We just need to add in:
command[check_awesome_service]=cscript.exe //T:30 //NoLogo "C:\Program Files\NRPE_NT\libexec\check_awesome_service.vbs"
Now add a command definition to the Nagios commands.cfg. The command_name has to match the check_command used in the service definition:
# 'check_awesome_service' command definition (using nrpe)
define command{
    command_name check_awesome_service
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -t 60 -p 5666 -c check_awesome_service
}
And finally, in my Nagios services.cfg file, a service definition that includes the command and the hosts to run this against:
define service{
    host_name windows_server_1
    service_description Windows Awesome Service
    servicegroups cust-windows
    check_command check_awesome_service
    use generic-service
}
And that should be it. You need to restart Nagios to include the new command and service definitions. Then test the monitor by stopping and then starting the service in question.
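If the alert doesn't fire, it helps to confirm on the Windows box that the script really hands back the exit code you expect. A minimal sketch, assuming the script was saved under the path used in nrpe.cfg above (test_exit_code.vbs is just a name I picked for this helper):

```vbscript
' test_exit_code.vbs (hypothetical name): run the check script and show its exit code.
' Assumes the script path from the nrpe.cfg command line; adjust to suit.
Option Explicit

Dim objShell, strCommand, intCode

Set objShell = CreateObject("WScript.Shell")

' doubled quotes embed literal quotes around the path, which contains a space
strCommand = "cscript.exe //NoLogo ""C:\Program Files\NRPE_NT\libexec\check_awesome_service.vbs"""

' window style 0 = hidden, True = wait for the script to finish and return its exit code
intCode = objShell.Run(strCommand, 0, True)

WScript.Echo "Exit code: " & intCode
```

Run it with cscript.exe once with the service started and once with it stopped; the reported code should change from 0 to 2.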
The next step would be to replace the service name in the .vbs script file with a variable. Then you can reuse the same script to monitor different services by passing the service name from Nagios to NRPE as a variable from the config file. :oD
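A rough sketch of what that could look like, reading the service display name from the first command line argument. The file name check_service.vbs is hypothetical, and I have kept the same "loop over every service" approach as the script above rather than anything cleverer.

```vbscript
' check_service.vbs (hypothetical name): check one Windows service by display name.
' Example nrpe.cfg command line (an assumption, adjust paths and names to suit):
'   cscript.exe //T:30 //NoLogo "C:\Program Files\NRPE_NT\libexec\check_service.vbs" "Awesome Service"
Option Explicit

Dim strComputer, strService, objWMIService, objItems, objService

' Nagios return codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
If WScript.Arguments.Count < 1 Then
    WScript.Echo "SERVICE STATUS: Unknown - no service name given"
    WScript.Quit(3)
End If

strComputer = "."
strService = WScript.Arguments(0)

'connect using standard moniker
Set objWMIService = GetObject("winmgmts:" & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
Set objItems = objWMIService.ExecQuery("Select * from Win32_Service")

For Each objService In objItems
    If objService.DisplayName = strService Then
        If objService.State = "Running" Then
            WScript.Echo "SERVICE STATUS: OK - " & strService & " is running"
            WScript.Quit(0)
        Else
            WScript.Echo "SERVICE STATUS: Critical - " & strService & " is " & objService.State
            WScript.Quit(2)
        End If
    End If
Next

'nothing matched the name we were given
WScript.Echo "SERVICE STATUS: Unknown - " & strService & " not found"
WScript.Quit(3)
```

With the name hard-coded in each nrpe.cfg command line you get one script and one command entry per service. Passing the name all the way from Nagios would also need argument passing enabled in NRPE, which isn't covered here.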
