Server Downtime

The server was down today from around 8:30 to 01:00 CET because the boot manager had no default choice configured. Actual downtime was roughly 10 minutes, so it’s a bit of a bummer not to have detected this any sooner. However, I was moving to a new flat and didn’t get connected again until this evening. Sorry for any inconvenience this may have caused.

Note that the server will be going down again August 25th from 12:00 to 16:00 CET, as that is when the DSL line is installed and switched over to the new place.

Router Upgrade

The router has been fairly unstable over the last 5 days. Its firmware has just been updated and everything reconfigured from scratch, but only time will tell if this has solved the problem. According to my ISP the old firmware did have a number of bugs in it; I just think it’s strange that these haven’t cropped up before, but there you go.

On another note, the site will be going down for maintenance right about now, but should be back up again within 15 minutes or so. I need to physically move the server, and thus have to disconnect it.

Gentoo Trouble

The server has been unavailable to most users from, at a guess, yesterday evening until 13:30 (CET) today. Services were suddenly being denied access to various system locations (such as /tmp and /dev/null, which are rather vital).

While the exact cause is still being investigated, the most likely culprit is Gentoo’s package management software (Portage). Whether it’s a general bug or "just" an error in one of the package ebuilds I do not know, but I’ll certainly try to find out.

I guess this means it’s time to go for a proper ACL system and Tripwire (or some other integrity checking tool). 
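Until a proper integrity checker is in place, a small script can at least watch the two locations mentioned above. The following is only a rough sketch, not what actually runs on the server: the expected modes (1777 for /tmp, 666 for /dev/null) are the usual Linux defaults and are my assumption, and the names EXPECTED, check_path and check_all are made up for this example.

```python
import os
import stat

# Usual Linux defaults (an assumption, not taken from the server itself).
EXPECTED = {
    "/tmp": {"mode": 0o1777, "char_device": False},
    "/dev/null": {"mode": 0o666, "char_device": True},
}


def check_path(path, mode, char_device=False):
    """Return a list of human-readable problems found for one path."""
    try:
        st = os.stat(path)
    except OSError as exc:
        return [f"{path}: cannot stat ({exc.strerror})"]

    problems = []
    # S_IMODE keeps the permission bits plus sticky/setuid/setgid.
    actual = stat.S_IMODE(st.st_mode)
    if actual != mode:
        problems.append(f"{path}: mode is {actual:o}, expected {mode:o}")
    if char_device and not stat.S_ISCHR(st.st_mode):
        problems.append(f"{path}: not a character device")
    return problems


def check_all(expected=EXPECTED):
    """Check every configured path and collect all problems."""
    problems = []
    for path, want in expected.items():
        problems.extend(check_path(path, want["mode"], want["char_device"]))
    return problems
```

Calling check_all() from a cron job and mailing any non-empty result would probably have shortened this outage, although it is no substitute for Tripwire or a real ACL setup.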

Software Upgrades ;-)

Because the server now has two separate internet connections, I had to set up policy-based routing so that the server responds on the same interface on which the request was received.

I took the opportunity to clean up my firewall rules, and the combination of these changes took a while to get right. This may have caused problems for some users from 4:00-6:00 (CET) this morning.
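For the curious, the policy-based routing mentioned above boils down to one routing table per connection plus a rule that picks the table by source address. The sketch below only generates the iproute2 commands as text; it is not my actual configuration, and the interface names, addresses (taken from the documentation ranges) and table numbers are all hypothetical.

```python
def policy_routing_commands(uplinks):
    """Build iproute2 commands so replies leave via the uplink they arrived on.

    Each uplink is a dict with the keys dev, addr, gateway and table.
    """
    commands = []
    for up in uplinks:
        # A per-uplink table that holds only this uplink's default route.
        commands.append(
            f"ip route add default via {up['gateway']} "
            f"dev {up['dev']} table {up['table']}"
        )
        # Packets sent from this uplink's address consult that table.
        commands.append(f"ip rule add from {up['addr']} table {up['table']}")
    return commands


# Hypothetical example values, not the real setup.
EXAMPLE_UPLINKS = [
    {"dev": "eth0", "addr": "192.0.2.10", "gateway": "192.0.2.1", "table": 100},
    {"dev": "eth1", "addr": "198.51.100.10", "gateway": "198.51.100.1", "table": 200},
]

if __name__ == "__main__":
    for command in policy_routing_commands(EXAMPLE_UPLINKS):
        print(command)
```

The generated commands would have to be run as root, and the firewall rules still need to allow traffic on both interfaces, which is where most of my time went.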

Hardware Upgrades

The server will be taken offline Saturday (April 2nd) afternoon (~16:00 CET) for a few hardware upgrades. I expect it will be unavailable for anywhere between 20 minutes and a full hour, depending on how smoothly everything goes.

Security Upgrades

The server has been upgraded to Gentoo Hardened using Linux 2.6.11. All security measures (except ACLs) are in use.

Additionally, Apache now uses mod_security to sanitize requests, filter referer spam, and provide other niceties.

These changes may have broken some existing functionality. If you spot anything not working as before, please leave a comment or send us an email.

Blog Software Upgrade

The software used to run the mertner.com blogs – WordPress – has been upgraded from an arbitrary daily build of v1.3 to the newly released v1.5. Getting the upgrade to work required changes in quite a few places, and chances are that something somewhere does not work quite right as a result of the upgrade.

If you find something of this kind, please send an email or add a comment to this post. (Assuming comment posting works, of course. If not, please yell loudly until we hear you or find the problem ourselves.)

Additional Hiccups

We had another power outage yesterday at 22:13, causing a sudden reboot. It was repeated this morning at 4:14, and this time the server decided to halt with a BIOS message, leaving it unavailable until 13:01. Sigh!

Apologies for any inconvenience caused by these incidents.

Server Downtime

A router was fried on Sunday (20/3 at 9:30) in connection with a power outage, and could not be replaced until today. This caused mertner.com to be unavailable for a record period of roughly 50 hours, completely ruining any hopes of a nice uptime this year. The current uptime is around 99.2%, which is decent enough for a DSL-hosted site. But it could be better ;-)
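To put those numbers in perspective, the arithmetic behind an uptime figure is simple enough to sketch. The measurement period is not stated above, so the 6250 hours used in the comments is purely an illustrative assumption: it is the period over which 50 hours of downtime alone would work out to 99.2%. The function names are made up for this example.

```python
def uptime_percent(downtime_hours, period_hours):
    """Uptime as a percentage of the observation period."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must lie within the period")
    return 100.0 * (1.0 - downtime_hours / period_hours)


def period_for_uptime(downtime_hours, uptime_pct):
    """Observation period (hours) over which the downtime gives uptime_pct."""
    if not 0 <= uptime_pct < 100:
        raise ValueError("uptime_pct must be in [0, 100)")
    return downtime_hours / (1.0 - uptime_pct / 100.0)


if __name__ == "__main__":
    # 50 hours is from the post; 6250 hours is an assumed period.
    print(round(uptime_percent(50, 6250), 1))
    print(round(period_for_uptime(50, 99.2)))
```

In other words, a single 50-hour outage is only compatible with 99.2% if the figure is measured over roughly 260 days, and any other downtime in that period pushes the required window out further.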