My faithful office OpenVPN server required a reboot before the start of the Christmas holidays to install some updates and patches.
The server came back OK and seemed to be fine, so I thought nothing much of it and went home for a few days off... until the emails started arriving from users stating they could not connect to the VPN from their homes!
So on Boxing Day I trudged through the freezing cold to the office to log on to the box locally and find out what was going on (it was obviously something big, as I could not connect in either).
Initial findings were that the OpenVPN process did not seem to be running, so I issued '/etc/init.d/openvpn start' and it started fine. So what had caused it to stop running? Peeking into /var/log/messages.log I found the following line:
TCP/UDP: Socket bind failed on local address x.x.x.x:1194: Cannot assign requested address
Googling this error revealed a few other people had also had this issue, but there was nothing definitive as to the cause.
Was another process grabbing port 1194 and preventing openvpn from starting up? I decided to reboot the server to check, and there it was again: the openvpn process failed to start with the same error message. But nothing else was using port 1194 when I checked, and when I started openvpn manually after the reboot it came up fine. What was going on?
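For anyone wanting to repeat the port check, here is a rough sketch. It assumes listing output in the style of 'ss -lun' or 'netstat -anu', where the local address sits in the fourth column. The function name listeners_on_port and the sample lines are made up for illustration and are not from my server.

```shell
#!/bin/sh
# Print every line of socket-listing output (read from stdin) whose local
# address, assumed to be the fourth column, ends in the given port.
listeners_on_port() {
    awk -v port="$1" '$4 ~ (":" port "$")'
}

# Hypothetical sample input; on a real box you would pipe in live output,
# e.g.:  ss -lun | listeners_on_port 1194
listeners_on_port 1194 <<'EOF'
UNCONN 0 0 0.0.0.0:1194 0.0.0.0:*
UNCONN 0 0 0.0.0.0:53 0.0.0.0:*
EOF
```

If nothing is printed for the live output, nothing is listening on that port, which is what I saw.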
Going back over the steps I took to install and set up openvpn, I remembered that it requires the bridge-utils package for bridging the ethernet interfaces on the server. I wondered if there was some kind of race condition happening whereby bridging had not started in time for openvpn to bind to the virtual tap interface that gets created. That would fit the error too: "Cannot assign requested address" is what a bind reports when the IP address is not yet configured on any local interface.
So I stopped openvpn with '/etc/init.d/openvpn stop' and then stopped bridging using '/etc/openvpn/scripts/bridge-stop'.
I then tried to start openvpn with bridging stopped and got the same error that I had seen in the syslog after rebooting the system. So that was the problem, but how to fix it?
First off I needed to check which run levels openvpn and bridging were being loaded at. 'chkconfig -l | grep -E "openvpn|bridge"' showed both loading at runlevels 2, 3, 4 and 5.
Looking into run level 5 in /etc/rc5.d I could see the two scripts used for starting up these processes at boot time, S01openvpn and S06bridge-start. As the startup scripts execute in numerical order, openvpn was being started before bridge-start. Simply renaming S01openvpn to S10openvpn was all that was required. A subsequent reboot of the server showed that the openvpn process was already running when I logged on to the server post boot.
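To make the ordering point concrete, here is a small sketch that mimics it in a throwaway temp directory, so nothing under /etc is touched. The helper name start_order is made up, and the sort is only an approximation of how a SysV-style rc runner picks its order.

```shell
#!/bin/sh
# List the S* start scripts of a directory in the order a SysV-style
# rc runner would be expected to execute them (plain byte-wise sort).
start_order() {
    ls "$1" | grep '^S' | LC_ALL=C sort
}

# Build a fake rc directory with the two script names from above.
rcdir=$(mktemp -d)
touch "$rcdir/S01openvpn" "$rcdir/S06bridge-start"

echo "before the rename:"
start_order "$rcdir"

# The fix: push openvpn behind bridge-start.
mv "$rcdir/S01openvpn" "$rcdir/S10openvpn"

echo "after the rename:"
start_order "$rcdir"

rm -rf "$rcdir"
```

Before the rename openvpn sorts first; afterwards bridge-start does. One thing I have not verified: if your distribution regenerates these links from dependency headers in the init scripts, a manual rename might not survive, so treat that as something to check.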
Then the trek back home again in the freezing cold :o(