make Nagios web interface read only…..

March 5th, 2010

Even though we’re not a massive company (less than 50 butts on seats) we do have quite a bit of kit in an environment that is growing ever more complex.

To help, we use Nagios to monitor key systems and services and to alert us via email when issues arise (and hopefully we can correct them before the masses notice).

My boss decided he wanted to share our Nagios screens with others (well, his boss) and so I installed a workstation with x2 flat screens lofted up on high so they could be seen from a distance.

But I had a slight snag. We use authentication on Nagios, and the account used for viewing the web console had enough permissions to execute the host commands listed on the right hand side of the interface (shown below).

This meant that should any passer-by wish to, they could click a command URL to, say, turn off a check that was failing (not that any of our users would do such a thing !). So I needed a way to make the web interface either not display those links or be read-only for those links, essentially to prevent people from altering the configuration.

Peeking through the config files for Nagios, it seems my predecessor had the same idea at some point, but had not quite managed to pull it off. Inside the cgi.cfg file (which was located at /usr/local/nagios/etc/cgi.cfg) are the following lines

default_user_name=

authorized_for_system_information=

authorized_for_configuration_information=

authorized_for_system_commands=

authorized_for_all_services=

authorized_for_all_hosts=

authorized_for_all_service_commands=

authorized_for_all_host_commands=

The ones of interest are :

authorized_for_all_services=

authorized_for_all_hosts=

By adding a user to these x2 lines *only*, the URLs on the pages for running commands and viewing/modifying the config do not work and give a permissions error.

You will also need to add the name you put on those x2 lines to the /usr/local/nagios/etc/htpasswd file.

Now, even though you can still see the command urls on the pages, you get this if you try to click them

nagios says no

So, how far had my predecessor gotten ? Well, here is something I take for granted that I guess he did not know: the list of names supplied should be comma separated with no spaces between them.
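For anyone wanting to double-check their own cgi.cfg, here is a minimal Python sketch of the idea above. It is not part of Nagios; the function names and the `viewer` account used in the usage note are made up for illustration. It lists which `authorized_for_*` lines mention a user, says whether the user appears on the x2 view-only lines and nowhere else, and flags any list that contains spaces.

```python
# Sketch only: audits the authorized_for_* lines of a cgi.cfg for one user.
# The names below are illustrative, not anything Nagios itself provides.
AUTH_PREFIX = "authorized_for_"
READ_ONLY_KEYS = {"authorized_for_all_services", "authorized_for_all_hosts"}


def parse_auth_lines(cfg_text):
    """Return {directive: raw value} for every authorized_for_* line."""
    found = {}
    for line in cfg_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip().startswith(AUTH_PREFIX):
            found[key.strip()] = value
    return found


def audit_user(cfg_text, user):
    """Describe how `user` appears in the authorized_for_* lines."""
    auth = parse_auth_lines(cfg_text)
    # Directives whose list contains the user (stray spaces ignored here).
    listed = {k for k, v in auth.items()
              if user in [name.strip() for name in v.split(",")]}
    # Lists written with whitespace in them, e.g. "bob, alice".
    spaced = sorted(k for k, v in auth.items()
                    if any(ch.isspace() for ch in v))
    return {
        "listed_in": sorted(listed),
        "view_only": listed == READ_ONLY_KEYS,
        "lists_with_spaces": spaced,
    }
```

To use it, read the file and pass the text in, for example `audit_user(open("/usr/local/nagios/etc/cgi.cfg").read(), "viewer")`, where `viewer` stands in for whatever account drives the wall screens.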

Easy when you know how :o)

Microsoft UK tech.days……yipee !!

February 26th, 2010

I just signed up for a couple of the Microsoft UK tech.days events being held in London during April.

Am really hoping to get to see Chris Jackson live as I have only previously seen him online on Channel 9 and on the 2008 TechEd recordings (2nd page, first row, far right video). He really seems to know his stuff and have a sense of humour and presentation charisma.

Is quite a big deal for me as I don’t think I will ever work in the sort of company who send their staff out to the big official MS TechEd events held globally (at least I haven’t been sent to any so far, and never having been to one, have not been able to ask the attendees what sort of company they are working for that send them to MS Tech events).

In fact, I think the last formal IT training I was sent on was over 10 years ago when I was sent on a course to learn Exchange server 5.5 administration.

Admittedly I seem to have done ok without any training, getting by using books, online examples and demos and so on. But some systems (ones from Microsoft in particular) are getting so large and so complex with so many features and capabilities built right into them that I wonder if I am doing some things inefficiently or even incorrectly.

Take desktop deployment. My current employer are using Windows Vista. One of the earlier tasks I did (after the massive mail migration I wrote about on here previously) was to replace the mix of XP and Vista with a few standardised builds of Vista using WDS. The learning process was pretty steep, and very confusing.

I could not get the answer files to work correctly for unattended installs, and I gave up on trying to figure out the Microsoft Deployment Toolkit (MDT). In the end I simply installed a box *exactly* how I wanted it to be, and then sysprep’d it with an answer file. I then used ImageX to capture the system to a .WIM file, and this is what I used to deploy to new systems. Even though it works pretty well (the only bits I could not automate were the machine naming, joining the domain and Windows activation) I am still not sure I am doing it the way Microsoft intended.

Now in 2010, the office here are looking to replace Vista with Windows 7 (not just to be fashionable you understand, but there do seem to be too many issues with Vista for our liking). I downloaded the Windows Automated Installation Kit (WAIK) for Windows 7 to have a look, and it is bigger and even more complex than the one for Vista was.

So I have high hopes that some bright Microsoft chappie (maybe even Chris Jackson himself) will take to the stage and say “here’s how you do it” and show me the bits I’m missing, and the bits I’m doing wrong. I am taking my laptop and will be furiously trying to record everything they say and do :oO

p.s. If you work for a company that sends you to tech events (not just the MS ones) please let me know who you are and what you do, cause I really wanna go to them too :o/

http://www.microsoft.com/uk/techdays/dayitp.aspx

sorry, I want a swimming pool you see ?…….

February 19th, 2010

I came across a random blog by Erik Van Slyke today while skimming the www.wordpress.com website. I was actually looking into the story about the 2 hour outage they had (which seems to have been routing based and nothing much they could have done about it, their systems were fine, the paths into and out from them were knacked).

The blog post Erik had written was about layoffs, or as we know them in the UK, redundancies. I have been fortunate in my life so far to have never been made redundant from a company, or been out of work for any period of time. But, I had always been a little confused as to why companies let people go (unless they were really small companies and keeping someone on meant going bust and closing shop !).

Erik recounts a story whereby an exec level team have a meeting and announce that the company is doing ok, but to keep 100% of their bonuses, they need to reduce headcount by 5% !! Sack people doing their jobs perfectly well in order to further line their pockets.

Are these people wrong ? Are they simply ambitious ? Are they just greedy ?

I wonder, if I ever got to that level of corporate structure and was asked to do the same to protect my own bonus and those of other staff, whether I could go along with it ? Can’t really see it myself, I tend to have empathy towards those less fortunate than myself. I even feel guilty every time I pass a homeless person that I have a job, a home and a reasonably regular life.

I have to say I found the article both enlightening and depressing, especially in the current situation with so many people out of work everywhere. I will now worry that, no matter the quality and volume of work I produce for any given employer, I could be axed through no fault of my own at any time just so someone higher up the corporate ladder can have a swimming pool installed at their home.

family Clark…….

February 17th, 2010

Big congratulations to Matt and Tiki for adding Mia Edith and Dylan William to the family Clark.

Well done to you both, look forward to coming to see you all very soon hopefully ;o)

Awwwww....bless

my career broken down…….

February 12th, 2010

Not quite sure who created this, but it pretty much sums up what I do from day to day.

In fact, this process has led to the creation of some of the posts here on this very site :o)

Thanks XKCD

IIS7 AppPool user account causes HTTP 503 error

February 9th, 2010

I don’t profess to be any kind of IIS expert, in fact, I would say I’m more of an Apache man myself. I just find it easier dealing with flat text files for application configs. Frankly, while I’m sure there are benefits to having the IIS config all sorted in metadata and stuff, I just find it confusing and overwhelming, gimme httpd.conf any day.

While trying to configure an IIS7 AppPool to run as a non-elevated logon, I received an HTTP 503 error and the following was logged in the Application area of the event viewer.

The identity of application pool user.www.somedomain.com is invalid. The user name or password that is specified for the identity may be incorrect, or the user may not have batch logon rights. If the identity is not corrected, the application pool will be disabled when the application pool receives its first request. If batch logon rights are causing the problem, the identity in the IIS configuration store must be changed after rights have been granted before Windows Process Activation Service (WAS) can retry the logon. If the identity remains invalid after the first request for the application pool is processed, the application pool will be disabled. The data field contains the error number.

Quite a few possibilities mentioned there, so I started with the first one, incorrect user. I deleted the user logon, recreated it, set the password and then re-configured the IIS AppPool to use the newly created account. But still the page gave me a 503 error.

So I looked at the next possibility, ‘Batch Logon Rights’. Comparing the local security policy MMC for the server I was having trouble with and one that was working ok, I found that the group ‘IIS_IUSRS’ had been granted the ‘Logon As Batch’ right on the standalone server, but not on the server that was part of a domain ?!

Local Security Policy MMC

As the domained server was controlled by group policies I could not just add the group directly to the permission. I had to create a group policy to grant ‘IIS_IUSRS’ the ‘Logon As Batch’ right and then run a ‘gpupdate /force’ on the domained server.

Restarting IIS and testing the site again showed everything now working correctly. It seems that the ‘IIS_IUSRS’ group gets granted the ‘Logon As Batch’ right automatically on standalone servers, but not on ones that are part of a domain, where you have to grant the right via a group policy.

Sonos does not work with Avast anti virus

January 30th, 2010

Popped over to my friends Bob and Michael to take a look at their home setup and try to figure out why their Sonos system would not work. My Sonos system at home has always just worked out of the box. It worked when the music library was on a PC, it worked when I moved the music off onto a NAS, it just plain works, so I could not think why it was not working for them.

The first Sonos unit was connected to the PC via an Ethernet connection, so it wasn’t some weird wi-fi issue. When I launched the Sonos application and tried to add a library, it thought about it for approx. 6-10 seconds, and then came back with an error saying the server could not be reached !?

I turned off the Windows firewall temporarily to see if it was the culprit, no joy. I then disabled their anti virus product, which in this case happened to be Avast (will try to find the exact version they had installed), and the Sonos burst into life and began to index their music library.

I replaced Avast with ClamWin, reactivated the Windows firewall and everything is still working :o)

I can also confirm Sonos works ok with AVG and Symantec anti virus products. I will try downloading the latest version of Avast and see if the issue is still there.

odd windows DNS issue…….

January 21st, 2010

Hmmmm, something is up with DNS at work. Randomly (anything from a week to 2 months) it seems to stop resolving .co.uk for some domains (especially www.bbc.co.uk) ? Nothing recorded in the eventlog for the times while it is behaving like this. Restarting DNS server fixes the problem for a while until it breaks again.

I recently patched server 2008 to SP2 as I found some issues that were fixed in that SP (like incomplete zone transfers which broke some stuff a while back).

But the service pack does not seem to have fixed this random sulking occurring in DNS.

For now I have enabled DNS debugging to a file on the system and restarted DNS, now I will need to patiently wait for it to act up again so I can have a peek and see if anything looks amiss.
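Since nothing shows up in the eventlog, knowing exactly when lookups start failing would make the debug file a lot easier to search. Below is a rough Python sketch of a watcher that could run from a scheduled task on a machine pointed at the DNS server. Everything in it is an assumption on my part: the function names are made up, and it simply uses the standard `socket.gethostbyname` call, so it tests whatever resolver that machine is configured to use.

```python
import socket
import time


def check_names(names, resolve=socket.gethostbyname):
    """Try to resolve each name; return a list of (name, error text) failures."""
    failures = []
    for name in names:
        try:
            resolve(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            failures.append((name, str(exc)))
    return failures


def log_failures(names, logfile, resolve=socket.gethostbyname):
    """Append one timestamped line per failed lookup; return the failure count."""
    failures = check_names(names, resolve)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(logfile, "a") as fh:
        for name, err in failures:
            fh.write(f"{stamp} FAILED {name}: {err}\n")
    return len(failures)
```

Calling `log_failures(["www.bbc.co.uk"], "dns-watch.log")` every few minutes (the log file name is just an example) would leave a timestamp to line up against the DNS debug log the next time it sulks.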

I can find nothing solid on google either. If I ever get to the bottom of it I’ll re-post here, but in the mean time if anyone has any ideas let me know as I am stumped.

reluctant MCSE……

January 16th, 2010

Yep…..guess it’s probably about time to get my Microsoft certifications in order. My current ones are either :

a) Valid but horribly out of date
b) Lapsed completely

Well, the last MS exam I sat was back when NT4 was considered all the rage !

So I’ll be procuring a box set of the core essentials books from MS press and spending a lot more nights at home.

Would also be cool to find a study group based in London (if such a thing exists, a quick skim of the first few pages of Google yielded nothing).

Will probably post progress and notes here as I go along (in the hope that it could help others).

apache2: no listening sockets available…….

January 8th, 2010

Following on from the issue(s) I had with my OpenVPN server, I was still not happy/confident that in the event of a reboot or restart for any reason (whether deliberate or unintentional) all the necessary processes and services would start up successfully without some post boot intervention.

With this in mind, I decided to create another server to transfer the live service(s) onto so I could get some much needed downtime on the existing server. Owing to the lack of another physical machine to do this with, I decided to create a virtual machine on our ESX cluster.

The initial steps were pretty easy, create a VM with x1 Vcpu, 1GB RAM, 30GB vdisk and x2 network interfaces. I installed Ubuntu server 9.04 i386 from the .iso and enabled LAMP and SSH. Installation completed and the system rebooted. Watching the console I saw that everything started at bootup time as it should.

Next step was to copy the websites across from the live server to this one. I installed NFS and mounted /var/www from the live server and copied all the sites across along with the relevant config files. I modified the config files to allow for the change of ip address and then restarted the system.

And that was when it started to go wrong. I only caught a glimpse of the error the first time I restarted the system. After the reboot, I logged in and checked, and Apache was not running. Looking in /var/log/syslog did not give any clues why; even the error message itself did not seem to have been captured.

So I rebooted again and watched the console carefully, and this time saw the error :

apache2: no listening sockets available

along with

could not bind to address x.x.x.x:80 (where x was the ip address of the server)

Googling this turned up several mentions of other processes or programs perhaps using and blocking the socket/port in question, but this was happening at boot time, when nothing else really had a chance to be up and running yet. To test, I tried starting Apache from the command prompt after bootup and it started fine, so what was going on ?

The main difference between this server and the live one was that this one was in a VM. Looking at the runlevel start scripts I noticed apache gets in there really early with S02apache2. Given my previous post where OpenVPN was trying to start before bridging on the live server, I wondered if perhaps the interface that Apache was trying to bind to was perhaps not quite ready at the time it tried during the boot process.
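One way to tell those two causes apart is to look at the error code behind a failed bind. The Python sketch below is only an illustration (the `try_bind` helper is my own name, nothing to do with Apache): it tries to bind a TCP socket to an address and port and reports the errno name if that fails. `EADDRINUSE` means something else already holds the port, while `EADDRNOTAVAIL` means the address is not configured on any interface at that moment, which is what you would expect if the network was not up yet. Note that binding to port 80 needs root, otherwise you will just get `EACCES`.

```python
import errno
import socket


def try_bind(address, port):
    """Return None if a TCP socket can bind to address:port, else the errno name."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((address, port))
        return None
    except OSError as exc:
        # Translate the numeric errno into its symbolic name where known.
        return errno.errorcode.get(exc.errno, str(exc))
    finally:
        sock.close()
```

Run early in the boot sequence with the server’s own address, a check like this would show whether the problem is a busy port or an address that does not exist yet.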

So I moved S02apache2 to S09apache2 for all runlevels and rebooted the VM again. Result, Apache was now loading as part of the boot process with no errors or manual intervention required.
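I did the renaming by hand, but the same move can be sketched in a few lines of Python. Treat this as an illustration rather than the proper way to do it: the `resequence` function is made up for this post, it defaults to a dry run, and on Debian/Ubuntu the `update-rc.d` tool is the supported route for managing these links.

```python
from pathlib import Path


def resequence(etc_dir, service, old_seq, new_seq, dry_run=True):
    """Rename S<old_seq><service> to S<new_seq><service> in every rc?.d directory.

    Returns the planned (old path, new path) pairs; files are only
    renamed when dry_run is False.
    """
    old_name = f"S{old_seq}{service}"
    new_name = f"S{new_seq}{service}"
    planned = []
    for rc_dir in sorted(Path(etc_dir).glob("rc?.d")):
        old_link = rc_dir / old_name
        # is_symlink() also catches links whose target is missing.
        if old_link.exists() or old_link.is_symlink():
            new_link = rc_dir / new_name
            planned.append((old_link, new_link))
            if not dry_run:
                old_link.rename(new_link)
    return planned
```

Calling `resequence("/etc", "apache2", "02", "09")` only lists what would change; add `dry_run=False` (as root) to actually rename the links.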

So if you are also having issues with processes that do not start at boot time, but start fine after boot when you initiate them from the command prompt, you may just need to move them a little later in the boot process to give other things time to start up beforehand.

I don’t profess to be the best system admin in the world, but I always get to the cause eventually :o)