Sun Nov 14 17:44:36 EST 2010

further adventures in server administration

Recently I've made some changes to bbot.org's backend that you almost certainly didn't notice unless you were paying exceptionally close attention, but they were fiddly and tedious to do, so I'm going to flip them into a blog post.

I converted the front page (and some of the subpages) to HTML 5, which required abandoning the table-based layout and switching everything over to CSS. This was surprisingly easy, because I invested the absolute minimum amount of time possible in it. The result is a layout that mostly works in Firefox, Chrome, et al., but breaks amusingly in Internet Explorer. If I were doing this for money, I would have to actually care about cross-browser compatibility, but seeing as how I'm not, I made an executive decision to not give a shit.

The bblog also declares itself to be HTML 5, but fails validation whenever I embed a YouTube video, use named anchors, or, hilariously, link to any URL with an ampersand in it.

A change you may have actually noticed: I got tired of how the body text in permalink entries spanned the width of the screen, which, when your monitor is 1920 pixels wide, can be tedious to read. This was fairly trivially fixed by adding width:800px; and margin-left:215px; to the #content section. The left margin happens to be the exact width of the navigation column, which I will eventually add as well. At some point I'd like to add "greatest hits" links to some of my less terrible posts, à la Maciej Ceglowski, a totally chill dude I have mentioned on several previous occasions. I badly want to be him, a goal which can only be brought closer by copying every detail of his site's design.

For the longest time, bbot.org didn't actually have an A record for www, which meant that anyone trying to visit www.bbot.org got a DNS error. This was an obvious and trivial thing to fix, so naturally I guiltily put it off for a solid year before spending the five minutes to fix it. Apache, of course, interpreted clients requesting files from www.bbot.org as an entirely different virtual host, and returned the generic hello world message of an unconfigured virtual host.

The general consensus seemed to be that this problem was solved with mod_rewrite, which was a bad sign. You know you're in for a treat when the documentation for a module includes not one, but two ominous quotes on the general inscrutability and hideous complexity of the matter at hand.

Mod_rewrite did not disappoint.

I wanted to use this code from no-www.org:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]

No-www.org wanted me to put this in a .htaccess file. However, I've disabled these in my install of apache2, as 1.) They're meant to be used by end-users, and since I run the server, I can just edit the configuration files directly, 2.) Since they're meant to be used by end-users, they can in fact be used by end-users, and present a possible security threat should a privilege-escalation attack be discovered against Apache .htaccess files, and a malicious user exploits another unknown attack to upload a .htaccess file to my htdocs directory, which is an amazingly obscure threat vector but hey lock it down anyway, and 3.) Since .htaccess files can appear, or be edited, at any time, Apache has to check to see if one exists every time it serves a request. As the owner of several low-traffic sites that run on a server whose load never breaks 0.00, I'm all for pointless micro-optimizations.
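
A minimal sketch of what disabling .htaccess files looks like in the main configuration, assuming the document root is /var/www/ (a placeholder path, not necessarily the real htdocs directory). AllowOverride None is the standard directive that tells Apache to stop looking for .htaccess files under a directory:

# Placeholder path; substitute the real document root.
<Directory /var/www/>
    # Ignore .htaccess files entirely, so Apache never checks for them.
    AllowOverride None
</Directory>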

Hurdle one was that Apache didn't actually have mod_rewrite loaded, so it threw an error when I told it to turn it on. Adding "LoadModule rewrite_module modules/mod_rewrite.so" threw a different error, and some googling revealed that Debian, in its endless quest to annoy its users, required you to specify the full path to the module, or "LoadModule rewrite_module /usr/lib/apache2/modules/mod_rewrite.so". Also note, hilariously, that the name of the module is different from the file name. You only use the module name once, in the LoadModule directive, and only then to point at the .so file. You can't just load the file directly, of course, you have to name it first. And then you use a third name, "RewriteEngine", when you actually turn it on!
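
The relative path most likely failed because LoadModule resolves relative paths against ServerRoot, which on Debian is typically /etc/apache2, and there is no modules/ directory in there. A sketch of the working directive, with the names annotated:

# First argument: the module's identifier inside the shared object.
# Second argument: the path to the .so file itself.
LoadModule rewrite_module /usr/lib/apache2/modules/mod_rewrite.so

Debian's packaging normally ships this same line in mods-available/rewrite.load, and running a2enmod rewrite symlinks it into mods-enabled/, which avoids editing the configuration by hand.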

Some more hilarious fumbling around at this point revealed that it only worked when it was in the virtual host's conf file (which, in Debian, is stored in sites-enabled/, not vhosts/!), even if you used the RewriteOptions Inherit directive. Once in place, it finally worked, except it would mangle URLs in a specific way when coming from www.
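
A rough sketch of where the directives end up, assuming a single name-based virtual host that answers for both host names. The file name, the ServerAlias line, and the DocumentRoot path are illustrative guesses, not the actual bbot.org configuration:

# Hypothetical file: /etc/apache2/sites-enabled/bbot.org
<VirtualHost *:80>
    ServerName bbot.org
    # Without an alias (or a separate vhost) for the www name,
    # requests for it fall through to the default virtual host.
    ServerAlias www.bbot.org
    # Placeholder path.
    DocumentRoot /var/www

    RewriteEngine On
    # The RewriteCond/RewriteRule pair goes here, inside the vhost.
</VirtualHost>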

http://www.bbot.org/about.html would become http://bbot.org//about.html. It would rewrite the URL correctly, and Apache would handle it just fine, but it would add an extra slash between the TLD and the file name. Some staring at the regex ensued, until it became apparent that the slash between %1 and $1 in the RewriteRule was the problem. I tried reporting the bug to the no-www.org contact address, but it bounced. They list the same contact address in the domain WHOIS, so there's no way to get in touch with them. So in case some other chump is confused by the same bug, hopefully they'll find this post.
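
For anyone landing here with the same problem, a sketch of the rule with the offending slash removed, which is presumably the fix when the rule lives in a virtual host's conf file. In that context, unlike in a .htaccess file, the path that RewriteRule matches against keeps its leading slash, so $1 already begins with one:

RewriteEngine On
# %1 is the host name with the leading "www." stripped off.
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
# $1 already starts with "/", so no slash is needed after %1.
RewriteRule ^(.*)$ http://%1$1 [R=301,L]

With this version, http://www.bbot.org/about.html should redirect to http://bbot.org/about.html.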


Posted by | Permanent link | File under: Linux