<?xml version="1.0" encoding="utf-8"?>
        <?xml-stylesheet type="text/css" href="http://bbot.org/blog/styles/feed.css"?>
<rss version="2.0"
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:admin="http://webns.net/mvcb/"
 xmlns:atom="http://www.w3.org/2005/Atom"
>
<channel>
<title>Filed under: Linux | the bblog</title>
<atom:link href="http://bbot.org/blog/archives/linux/index-rss.xml" rel="self" type="application/rss+xml" />
<link>http://bbot.org/blog</link>
<description>complaining, nerdery, errata</description>
<dc:language>en-us</dc:language>
<dc:creator>Samuel Bierwagen</dc:creator>
<dc:date>2013-03-22T11:24:46-07:00</dc:date>
<admin:generatorAgent rdf:resource="http://nanoblogger.sourceforge.net" />

<item>
<link>http://bbot.org/blog/archives/2013/01/04/building_dtwenty_org/</link>
<guid isPermaLink="true">http://bbot.org/blog/archives/2013/01/04/building_dtwenty_org/</guid>
<title>building dtwenty.org</title>
<dc:date>2013-01-04T21:03:31-07:00</dc:date>
<dc:creator>Samuel Bierwagen</dc:creator>
<dc:subject> important, Linux</dc:subject>
<description><![CDATA[<p><a href="https://news.ycombinator.com/item?id=1946405">767 days ago,</a> I commented on a HN submission about a random number
  generator:

  <blockquote>3.) Providing random numbers as an advertisement for your fine
  line of hardware random number generators. Here it doesn't matter how much
  money you make [providing the numbers], you just want people to buy the hardware that made
  them. Oddly enough, none of the random number services (and there are quite
  a few) do this, for some inexplicable reason. There's not even an
  argument-from-proprietary technology, since HRNGs are supposed to generate
  perfectly random noise, and there's no way an attacker could stage a replay
  attack.</blockquote>

<p>I left it there, because I was lazy. But last month, notorious badass
  Maciej Cegłowski
  created <a href="http://static.pinboard.in/prosperity_cloud.htm">The
  Pinboard Co-Prosperity Cloud.</a>

<blockquote>
<p><b>What is it?</b>

<p>The Pinboard Co-Prosperity Cloud is a startup self-incubator. Six successful applicants will receive a modest amount of funding and as much publicity as I can provide for their sustainable and useful business idea.

<p><b>Is this a joke?</b>

<p>It is not a joke.

<p><b>What are the requirements?</b>
<p>You must have a good idea that you are capable of building, a willingness to build it, and a plan for making it mildly profitable.

<p><b>How much funding will I get?</b>

<p>Each successful applicant will receive $37. This will cover the cost of six
  months of hosting at prgmr.com and a productivity-enhancing hot
  beverage.</blockquote>

<p>So I entered. Ha ha why not?

<p>The more I thought about it though, the more I realized that I wasn't getting the
  joke. The idea was trivially simple. I already had a web server. I didn't
  need all that mad cash. I could just... build it.

<p>So I did. <a href="//bbot.org/dtwenty">It's right here.</a> (<b>EDIT 2013/3/22:</b> I let the domain name lapse, and moved the content to bbot.org)

<p>It was amazingly easy, even though this was pretty much my first major (har) piece
  of programming. I had never used Python, JavaScript, or jQuery before.

<p>Web programming in the year 2012 has the smooth, well polished feel of
  something that has had the sharp edges worn off by the passage of thousands
  of other people. Getting nginx to talk to the WSGI
  server was a snap. Installing bottle.py was easy. jQuery was no problem.

<p>Any time I had a problem, googling the error message would return a helpful,
  relevant page, explaining how my "build it as fast as possible, while learning
  as little as possible" design methodology had screwed me over again.

<p>At the time, of course, it seemed a vast edifice
  of impossible complexity, but in retrospect it was painless. "It's easy to
  do if you know how to do it", maybe.

<p>The only difficulty I faced was the hardware random number generator. The
  numbers had to come from it, since that was the whole point of the site; but
  my server was a virtual machine on the east coast, and my HRNG was sitting
  on my desk.

<p>The "money" solution would be to buy a rackmount server, plug the widget
  into it, then slot it into a colo, but I didn't have money, and instead I had to be
  creative.

<p>I couldn't just run the web server locally, since my ISP blocks port
  80. Enter the ugly hack: I plugged the entropykey into a spare laptop, ran
  the application server on that, then ran an SSH tunnel to my web server, which communicates with the front end via JSON. It
  works, at the cost of an extra 150ms of latency per roll.
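
<p>The dice logic itself is tiny. Here is a minimal sketch in Python, not the code the site actually runs: <code>roll_die</code> and <code>roll_response</code> are made-up names, and <code>os.urandom</code> is only a stand-in for the entropykey, since reading the real device depends on how its daemon is set up. It uses rejection sampling, so a random byte maps onto a die face without favoring the low numbers.

```python
import json
import os


def roll_die(sides=20, read_bytes=os.urandom):
    """Return a uniform roll in 1..sides using rejection sampling.

    read_bytes is a stand-in for the hardware RNG: any callable that
    takes a byte count and returns that many random bytes.
    """
    if sides not in range(1, 257):
        raise ValueError("sides must be between 1 and 256")
    # Largest multiple of `sides` that fits in one byte. Values at or
    # above it are thrown away so every face stays equally likely.
    limit = 256 - (256 % sides)
    while True:
        value = read_bytes(1)[0]
        if limit > value:
            return value % sides + 1


def roll_response(sides=20, read_bytes=os.urandom):
    """Build a JSON body a front end could fetch for one roll."""
    return json.dumps({"sides": sides, "roll": roll_die(sides, read_bytes)})


if __name__ == "__main__":
    print(roll_response())
```

<p>Passing the byte source in as a parameter is a design choice for the sketch: it makes it easy to swap the stand-in for whatever actually reads the hardware, and to feed in fixed bytes when testing.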

<p>There's room to improve, of course. You could probably list off a dozen
  features dtwenty.org needs without pausing to draw breath (starting with "make it
  less ugly"), but the ideal of
  the <a href="https://en.wikipedia.org/wiki/Minimum_viable_product">minimum
  viable product</a> shines bright.

<p>The second biggest problem after integrating the HRNG was the ad copy that
  makes up most of the page. It was originally twice as long-- ruthless
  editing has reduced it to merely "too long" from "far, far too long." This
  too could use improvement.

<p>But! It's done and it works! Programming is fun.]]></description>

</item>
<item>
<link>http://bbot.org/blog/archives/2012/10/15/why_does_nanoblogger_generate_broken_links/</link>
<guid isPermaLink="true">http://bbot.org/blog/archives/2012/10/15/why_does_nanoblogger_generate_broken_links/</guid>
<title>why does nanoblogger generate broken links</title>
<dc:date>2012-10-15T18:24:59-07:00</dc:date>
<dc:creator>Samuel Bierwagen</dc:creator>
<dc:subject> Linux</dc:subject>
<description><![CDATA[<p><i>(Attention conservation notice:</i> I found an obscure bug in my blog
  publishing software. You are unlikely to care about it.)

<p><img alt="google 404 errors" src="//bbot.org/blog-images/404.png">

<p>Why the hell does my site have so many broken links?

<p>I'll spare you the grimy details of the hour of troubleshooting, and jump right to the
  punchline. <b>Nanoblogger 3.4.2 has a bug which generates bad relative links
  when you do <code>./nb update all</code></b>

<p>Nanoblogger is no longer updated, so this isn't a problem that can be
  solved by upgrading. I didn't want to dive into the parsing engine, so I had
  to find a workaround, which turned out to be pretty simple: just update it a
  year at a time. <code>./nb update YYYY</code> works
  perfectly. (ex. <code>./nb update 2012</code>) I've only got six years of
  archives, so all I had to do was run it six times.
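
<p>If you have more years than patience, the workaround is easy to script. A minimal sketch, assuming <code>./nb</code> is run from the blog directory; the function name and the 2007 to 2012 range are made up for illustration:

```python
def yearly_update_commands(first_year, last_year, nb="./nb"):
    """Return one 'nb update YYYY' command (as an argument list) per year."""
    return [[nb, "update", str(year)] for year in range(first_year, last_year + 1)]


if __name__ == "__main__":
    # Hypothetical range: six years of archives ending in 2012.
    for command in yearly_update_commands(2007, 2012):
        print(" ".join(command))
        # To actually run each one, import subprocess and call:
        # subprocess.run(command, check=True)
```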

<p>I'm posting this incredibly boring post in the hopes it'll save one of the
  six other users of nanoblogger some confusion in the future.]]></description>

</item>
<item>
<link>http://bbot.org/blog/archives/2012/09/30/escape_sh/</link>
<guid isPermaLink="true">http://bbot.org/blog/archives/2012/09/30/escape_sh/</guid>
<title>escape.sh</title>
<dc:date>2012-09-30T07:13:13-07:00</dc:date>
<dc:creator>Samuel Bierwagen</dc:creator>
<dc:subject> Linux</dc:subject>
<description><![CDATA[<p>I've pasted a lot of IRC logs into a lot of HTML documents, which is always
  a pain, since angle brackets are obviously a special character in HTML,
  which means I have to do a search and replace with the equivalent entity
  codes. I usually did this manually, using whatever graphical text editor was
  handy.

<p>But
  that's <a href="http://www.catb.org/esr/faqs/hacker-howto.html#believe3">Not
    The Hacker Way.</a> I'm editing a text file produced by one program, so
  another program will accept it. String processing isn't a job fit for a
  human. This is something that should be done by a <i>third</i> program.

<p>Thus:

<pre class=prettyprint>#!/bin/bash
#
# escape.sh - Escapes angle brackets in text files
#
# Turns angle brackets into &amp;lt; and &amp;gt; HTML entities.
# With --irc, replaces the first 8 columns (the timestamp) with an
# opening angle bracket, using an ugly hack.
#
# Usage: escape FILE
#        escape --irc FILE
#
# This is free and unencumbered software released into the public domain.

# [[ ]] is a bash feature, hence the bash shebang rather than /bin/sh.
if [[ $* == *--irc* ]]
then
    # --irc has to come first, so the file name is the second argument.
    sed -i 's/&gt;/\&amp;gt;/g' "$2"
    sed -i 's/^......../\&amp;lt;/g' "$2"
else
    sed -i 's/&lt;/\&amp;lt;/g' "$1"
    sed -i 's/&gt;/\&amp;gt;/g' "$1"
fi</pre>

<p><a href="https://github.com/bbot/sitetools/blob/master/escape.sh">(Github)</a> 

<p>Then I stuck it in my $PATH with <code>sudo cp escape.sh
    /usr/local/bin/escape</code> This way you can run it from any directory
  just by doing <code>escape example.txt</code>

<p>(It's not actually very Unixy-- it doesn't play well with pipes, and
	 wildcard expansion in a directory will blow it up.)
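
<p>A pipe-friendly version is not much longer. This is a minimal sketch in Python rather than a port of the script: it leans on the standard library's <code>html.escape</code>, which also escapes ampersands (the script above does not), and it skips the --irc timestamp hack entirely. The function names are made up.

```python
import html
import sys


def escape_markup(text):
    """Escape the HTML special characters, leaving quotes alone."""
    return html.escape(text, quote=False)


def filter_stream(source, sink):
    """Copy source to sink one line at a time, escaping as it goes."""
    for line in source:
        sink.write(escape_markup(line))


# To use it as a filter, call this at the bottom of the script:
# filter_stream(sys.stdin, sys.stdout)
```

<p>Because it reads standard input and writes standard output, it can sit in the middle of a pipeline and never touches the original file.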

<p>Have fun!]]></description>

</item>
<item>
<link>http://bbot.org/blog/archives/2012/08/25/ntpblogging_ii/</link>
<guid isPermaLink="true">http://bbot.org/blog/archives/2012/08/25/ntpblogging_ii/</guid>
<title>ntpblogging II</title>
<dc:date>2012-08-25T19:00:25-07:00</dc:date>
<dc:creator>Samuel Bierwagen</dc:creator>
<dc:subject> important, Linux</dc:subject>
<description><![CDATA[<p><a href="//bbot.org/blog/archives/2012/08/04/ntpblogging/">(previously)</a>

<p>So now bbot.org is <a href="http://www.pool.ntp.org/scores/76.72.161.27">a Stratum 2 NTP Pool server.</a> <a href="http://support.ntp.org/bin/view/Servers/BbotOrg">(Its wiki page.)</a>

<p>Joining the pool is pretty easy: You create an account, give them your server's IP address, wait for the monitoring server to decide you're stable enough (~8 hours) and boom, you're in.

<p><img src="//bbot.org/blog-images/ntp-pool-adding.png">

<p>(The interface is a bit awkward: you paste the address in there, you <i>don't</i> click the "Add a server" link, which apparently doesn't do anything.)

<p>I found four upstream servers by pinging 0.us.pool.ntp.org repeatedly, and choosing the ones that were closest to me. Since bbot.org is in a datacenter right on the internet backbone, close can be <i>very</i> close:

<pre># ntpq -np
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-72.26.198.240   209.51.161.238   2 u  273 1024  377    2.320    3.100   1.201
+69.164.217.193  128.59.59.177    3 u  825 1024  377    3.713    0.239   0.371
-108.61.73.243   209.51.161.238   2 u  237 1024  377    3.174   -1.069   0.398
+128.113.28.67   18.26.4.105      2 u  383 1024  377    6.828    0.382   0.141
*128.118.25.5    .WWV.            1 u  426 1024  377   11.537    0.225   0.310</pre>

<p>I had hoped that &lt;10ms ping times would result in magically low offset numbers, measured in the tens of microseconds, but apparently jitter becomes a bigger problem when you get that low.
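
<p>Some arithmetic helps explain why. NTP estimates the offset from four timestamps and assumes the trip out takes as long as the trip back, so any asymmetry between the two directions leaks straight into the answer, up to half the round-trip delay in the worst case. Here is a minimal sketch of that standard on-wire calculation (as described in RFC 5905); the function name and the example numbers are made up:

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Return (offset, delay) in seconds from the four NTP timestamps.

    t1: client clock when the request left
    t2: server clock when the request arrived
    t3: server clock when the reply left
    t4: client clock when the reply arrived
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


if __name__ == "__main__":
    # Made-up numbers: the server is really 5 ms ahead of the client,
    # the request takes 4 ms to arrive and the reply takes 6 ms.
    offset, delay = ntp_offset_and_delay(100.000, 100.009, 100.010, 100.011)
    print("offset %.1f ms, delay %.1f ms" % (offset * 1000, delay * 1000))
```

<p>In that example the estimate comes out at about 4 ms against a true offset of 5 ms. The missing millisecond is half of the 2 ms difference between the two directions, which is why a short but uneven path still won't get you down to microseconds.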

<p>My reference stratum 1 server is <a href="http://support.ntp.org/bin/view/Servers/WwvTnsItsPsuEdu">wwv.tns.its.psu.edu,</a> an open-access tier 1 server that John Balogh runs. Thanks John!]]></description>

</item>
<item>
<link>http://bbot.org/blog/archives/2012/08/04/ntpblogging/</link>
<guid isPermaLink="true">http://bbot.org/blog/archives/2012/08/04/ntpblogging/</guid>
<title>ntpblogging</title>
<dc:date>2012-08-04T06:52:57-07:00</dc:date>
<dc:creator>Samuel Bierwagen</dc:creator>
<dc:subject> Linux</dc:subject>
<description><![CDATA[<p>So I was farting about trying to figure out how to ask a NTP server what it thinks the time is without having to edit ntp.conf on the client machine, when I discovered that NTP is like SSH&#8212; any machine with it installed acts as a server.</p>
<p>So now both of my machines get their time from bbot.org:</p>
<pre>magnesium:~ $ ntpq -np
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+76.72.161.27    138.236.128.112  3 u   44   64  377   81.174   -2.459   1.158
+209.177.158.233 134.21.35.167    3 u   42   64  377   62.137   -3.428   1.459
-64.34.171.122   198.60.22.240    2 u   42   64  377   85.527   -7.019   2.248
*69.50.219.51    209.51.161.238   2 u   36   64  377   54.501    0.361  45.780</pre>
<pre>bbot@neon:~ $ ntpq -np
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+76.72.161.27    138.236.128.112  3 u   60  128  377   80.328    0.690  10.561
*67.23.181.241   128.4.1.1        2 u   35  128  377   82.895    1.003   7.373
+69.167.160.102  204.9.54.119     2 u   70  128  377   75.712    4.162  12.724
+50.16.231.185   192.5.41.40      2 u   26  128  377   87.617   -0.344  64.595</pre>
<p>(The legend for the inscrutable linux bullshit can be found in <a href="http://linux.die.net/man/8/ntpq">ntpq&#8217;s manual file</a>)</p>
<p>(Of <em>course,</em> at the moment I took these screenshots, neither neon nor magnesium was syncing to bbot.org&#8230;)</p>
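<p>If you would rather have a program read those tables, they are easy to pick apart. A minimal sketch, assuming the ten-column layout that <code>ntpq -np</code> printed above (numeric addresses, with the delay, offset and jitter columns in milliseconds); the function and key names are made up:</p>

```python
TALLY_CODES = "x.-+#*o"  # status characters ntpq may prefix to a peer


def parse_ntpq_peers(output):
    """Turn the text printed by `ntpq -np` into one dict per peer."""
    peers = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 10:
            continue  # prompt, separator, or blank line
        try:
            delay, offset, jitter = (float(value) for value in fields[7:])
        except ValueError:
            continue  # the column header row
        remote = fields[0]
        tally = " "
        if remote[0] in TALLY_CODES:
            tally, remote = remote[0], remote[1:]
        peers.append({
            "tally": tally,
            "remote": remote,
            "stratum": int(fields[2]),
            "delay_ms": delay,
            "offset_ms": offset,
            "jitter_ms": jitter,
        })
    return peers


SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+76.72.161.27    138.236.128.112  3 u   44   64  377   81.174   -2.459   1.158
*69.50.219.51    209.51.161.238   2 u   36   64  377   54.501    0.361  45.780
"""

if __name__ == "__main__":
    # List the peers from lowest to highest network delay.
    for peer in sorted(parse_ntpq_peers(SAMPLE), key=lambda p: p["delay_ms"]):
        print(peer["tally"], peer["remote"], peer["delay_ms"])
```

<p>The tally character in front of each address is kept separately, so the peer marked <code>*</code> (the one the daemon is currently synced to) is easy to find.</p>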
<p>The magic incantation to ask an NTP server for the time is <code>sntp</code>,</p>
<pre>$ sntp 0.pool.ntp.org
2012 Aug 04 01:30:15.000857 + 0.001475 +/- 0.083787 secs</pre>
<p>Which will return (if you&#8217;ve got a machine with accurate time) a tiny drift number (1.5ms) swamped by a giant lake of uncertainty (83.8ms). This is because sntp can only request a single packet, which means it doesn&#8217;t have a good idea of the jitter between you and the remote machine. The NTP daemon manages to extract accurate time from the storm of random network noise by requesting lots of packets, <a href="http://www.ntp.org/ntpfaq/NTP-s-algo.htm">then doing a lot of clever things.</a> You can sanity-check <code>sntp</code> by going to <a href="http://time.is/">time.is</a>, which for any Linux machine will tell you that your clock is bang-on accurate. (NTP on Windows <a href="http://support.microsoft.com/kb/939322">will only get the clock within 1000ms of the true time, by design.</a> Microsoft assumes that their users aren&#8217;t pedantically obsessive nerds who absolutely positively must have the most accurate computer clocks possible. The same assumption can&#8217;t be made of Linux users.)</p>]]></description>

</item>
<item>
<link>http://bbot.org/blog/archives/2012/02/16/using_applicationecmascript_on_nginx/</link>
<guid isPermaLink="true">http://bbot.org/blog/archives/2012/02/16/using_applicationecmascript_on_nginx/</guid>
<title>using application/ecmascript on nginx</title>
<dc:date>2012-02-16T09:12:36-07:00</dc:date>
<dc:creator>Samuel Bierwagen</dc:creator>
<dc:subject> important, Linux</dc:subject>
<description><![CDATA[<p>So while I was browsing the Wikipedia article on <a href="https://en.wikipedia.org/wiki/Internet_media_type">Internet Media Types,</a> (I am a very boring person) I noticed this:</p>
<blockquote>
<ul>
<li><code>application/ecmascript</code>: <a href="https://en.wikipedia.org/wiki/ECMAScript" title="ECMAScript">ECMAScript</a>/<a href="https://en.wikipedia.org/wiki/JavaScript" title="JavaScript">JavaScript</a>; Defined in <a class="external mw-magiclink-rfc" href="https://tools.ietf.org/html/rfc4329">RFC 4329</a> (equivalent to <code>application/javascript</code> but with stricter processing rules)</li>
</ul>
</blockquote>
<p>Hey now, what's all this then? "Stricter processing rules?" I <em>love</em> stricter processing rules! Let's look at that RFC:</p>
<pre>3.  Deployed Scripting Media Types and Compatibility

   Various unregistered media types have been used in an ad-hoc fashion
   to label and exchange programs written in ECMAScript and JavaScript.
   These include:

      +-----------------------------------------------------+
      | text/javascript          | text/ecmascript          |
      | text/javascript1.0       | text/javascript1.1       |
      | text/javascript1.2       | text/javascript1.3       |
      | text/javascript1.4       | text/javascript1.5       |
      | text/jscript             | text/livescript          |
      | text/x-javascript        | text/x-ecmascript        |
      | application/x-javascript | application/x-ecmascript |
      | application/javascript   | application/ecmascript   |
      +-----------------------------------------------------+</pre>
<p>So, it looks like a typical web "standards" clusterfuck. I had been wondering why nginx served Javascript with the application/x-javascript mime-type, <a href="https://tools.ietf.org/html/draft-ietf-appsawg-xdash-02">which was used for non-standard protocols,</a> and now I know.</p>
<p>The only use of JavaScript on bbot.org, <a href="https://code.google.com/p/google-code-prettify/">prettify.js,</a> a Javascript-based code prettyprinting package, is actually a prime use for bleeding edge standards wankery. Code highlighting is pure <a href="https://en.wikipedia.org/wiki/Progressive_enhancement">progressive enhancement</a> so the readers still using IE6 on their 2002-vintage PowerPC iMacs shouldn't miss anything.</p>

<p>(My prettify.js implementation is actually backported from my <a href="http://catb.org/jargon/html/S/signal-to-noise-ratio.html">low-signal, high-noise</a> adjunct blog, where I wrote <a href="http://c1qfxugcgy0.tumblr.com/post/17312673810/add-fast-clever-syntax-highlighting-to-tumblr">a pair</a> <a href="http://c1qfxugcgy0.tumblr.com/post/17363683243/wait-a-minute">of posts</a> on the topic.)

<p>Enabling it on nginx is fairly easy. Just add a line to <code>/etc/nginx/mime.types</code>:</p>

<pre>    application/ecmascript                es;</pre>

<p>This will deliver any file with the .es extension as application/ecmascript, which should Just Work in any modern browser. Pow! Whammo! Easy!</p>

<p>However, nginx by default will serve it uncompressed, and with a fairly short cache lifetime. Let's change that.

<p>First, add <code>application/ecmascript</code> to wherever you keep your <code><a href="http://wiki.nginx.org/HttpGzipModule#gzip_types">gzip_types</a></code> declarations. (In my case, <code>conf.d/compression.conf</code>) Next, tell nginx to deliver it with the proper cache headers. In my case, I already had a <code>location{}</code> block inside my <code>server{}</code> virtual host block that did that for a bunch of filetypes, so I added es to it:
</p>

<pre>location ~* \.(?:ico|css|js|gif|jpe?g|png|es)$ {
  expires max;
  add_header Cache-Control public;
}</pre>
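
<p>The pattern in that location block can be sanity-checked outside nginx before reloading anything. A minimal sketch using Python's <code>re</code> module, which for a pattern this simple should behave the same as nginx's PCRE; <code>re.IGNORECASE</code> plays the part of the <code>~*</code> modifier, and the function name is made up:

```python
import re

# Same pattern as the nginx location block above.
CACHEABLE = re.compile(r"\.(?:ico|css|js|gif|jpe?g|png|es)$", re.IGNORECASE)


def is_cacheable(path):
    """True if the request path ends in one of the listed extensions."""
    return CACHEABLE.search(path) is not None


if __name__ == "__main__":
    for path in ["/prettify.es", "/photo.JPEG", "/photo.jpg", "/index.html", "/files"]:
        print(path, is_cacheable(path))
```

<p>Note that <code>/files</code> does not match even though it ends in "es", because the pattern insists on a literal dot before the extension.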

<p>This matches against a bunch of extensions, (including both jpg and jpeg) and delivers them with the Cache-Control: public header, as well as the maximum allowable time for the Expires header, which for nginx is Thu, 31 Dec 2037 23:55:55 GMT. This is a bit silly, since any browser will request a new copy long, long before the year 2037, but hey, why the hell not. Here's what the complete headers look like now:

<pre>bbot@neon:~$ curl -I --compressed bbot.org/prettify.es
HTTP/1.1 200 OK
Server: nginx/1.1.8
Date: Thu, 16 Feb 2012 13:48:01 GMT
Content-Type: application/ecmascript
Last-Modified: Thu, 16 Feb 2012 12:53:10 GMT
Connection: keep-alive
Vary: Accept-Encoding
Expires: Thu, 31 Dec 2037 23:55:55 GMT
Cache-Control: max-age=315360000
Cache-Control: public
Content-Encoding: gzip</pre>]]></description>

</item>
<item>
<link>http://bbot.org/blog/archives/2012/01/14/fun_and_games_with_unix_pipes/</link>
<guid isPermaLink="true">http://bbot.org/blog/archives/2012/01/14/fun_and_games_with_unix_pipes/</guid>
<title>fun and games with unix pipes</title>
<dc:date>2012-01-14T21:07:46-07:00</dc:date>
<dc:creator>Samuel Bierwagen</dc:creator>
<dc:subject> important, Linux</dc:subject>
<description><![CDATA[<p>So <a href=http://explosivetheorist.tumblr.com/>Atomic's</a> started a <a href=http://legenndary.tumblr.com/>new thing.</a> The <a href=http://legenndary.tumblr.com/post/15781682262/roppongi-hills-japan>first post</a> is interesting, however, probably not in the way she intended.

<p>It consists of an image thumbnail named <code><a href=http://data.tumblr.com/tumblr_lxr0pclKpO1rn6clco1_500.gif>tumblr_lxr0pclKpO1rn6clco1_500.gif</a></code>, which links to the larger version, <code><a href=http://data.tumblr.com/tumblr_lxr0pclKpO1rn6clco1_1280.gif>tumblr_lxr0pclKpO1rn6clco1_1280.gif</a></code>. _500.gif is odd in several ways. For one, it's actually a JPEG, delivered with the image/jpeg mime-type. Secondly, it's <em>huge,</em> weighing in at 1,369 kilobytes... for a 500x346 pixel thumbnail. The original GIF is only 147 kilobytes, which makes the thumbnail <em>more than nine times larger</em> than the full size file.

<p><a href=http://bbot.org/blog/archives/2011/11/05/shooting_yourself_in_the_foot_with_great_verve_and_accuracy/>We've been down this road before.</a> Let's take a look at the file.

<pre>exiftool -htmlFormat -v tumblr_lxr0pclKpO1rn6clco1_500.gif &gt; <a href=http://bbot.org/projects/report.html>report.html</a></pre>

<p>If you look at that report, you'll see that the first 57,324 bytes are a perfectly normal quality 92 JPEG file, of an entirely sane size for a 500x346 image. And then there's 1,344,572 bytes of "unknown trailer". Every JPEG ends with 0xffd9, the end-of-image marker (the magic number at the start is 0xffd8), so let's do a quick <a href=https://github.com/tmbinc/bgrep>bgrep...</a>

<pre>bbot@neon:~$ bgrep ffd9 tumblr_lxr0pclKpO1rn6clco1_500.gif 
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0000dfea
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0001c111
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0002a300
tumblr_lxr0pclKpO1rn6clco1_500.gif: 000385b0
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0004690c
tumblr_lxr0pclKpO1rn6clco1_500.gif: 00054d19
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0006318d
tumblr_lxr0pclKpO1rn6clco1_500.gif: 000716aa
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0007fc70
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0008e2fe
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0009ca24
tumblr_lxr0pclKpO1rn6clco1_500.gif: 000ab212
tumblr_lxr0pclKpO1rn6clco1_500.gif: 000b9a00
tumblr_lxr0pclKpO1rn6clco1_500.gif: 000c8126
tumblr_lxr0pclKpO1rn6clco1_500.gif: 000d67b4
tumblr_lxr0pclKpO1rn6clco1_500.gif: 000e4d7a
tumblr_lxr0pclKpO1rn6clco1_500.gif: 000f3297
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0010170b
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0010fb18
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0011de74
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0012c124
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0013a313
tumblr_lxr0pclKpO1rn6clco1_500.gif: 0014843a
tumblr_lxr0pclKpO1rn6clco1_500.gif: 00156426</pre>
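
<p>If you don't have bgrep handy, the same scan is a few lines of Python. A minimal sketch with a made-up function name; it looks for 0xffd9, the marker that closes a JPEG image, and the demo runs on three fake "images" rather than the real file:

```python
def find_marker_offsets(data, marker=b"\xff\xd9"):
    """Return the byte offset of every occurrence of marker in data."""
    offsets = []
    position = data.find(marker)
    while position != -1:
        offsets.append(position)
        position = data.find(marker, position + 1)
    return offsets


if __name__ == "__main__":
    # Three fake images glued together: start marker, filler, end marker.
    fake = (b"\xff\xd8" + b"AAAA" + b"\xff\xd9") * 3
    for offset in find_marker_offsets(fake):
        print("%08x" % offset)
```

<p>To point it at a real file, read the bytes with <code>open(name, "rb").read()</code> and pass them in. One caveat: a JPEG with an embedded preview thumbnail in its metadata carries an extra end marker, so counting markers only approximates counting images.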

<p>Huh. 24 instances. 24 * 57 kilobytes = 1368, which is about how big our file is. How many frames are there in the original animation?

<pre>bbot@neon:~$ identify tumblr_lxr0pclKpO1rn6clco1_1280.gif | wc -l
24</pre>

<p>Somehow, when producing the 500 pixel thumbnail, Tumblr managed to produce a thumbnail for each individual animation frame, then concatenated all of them.

<p>Wow.

<p>Whoops.

<p>How did they do this? Well, I'm guessing it was a pipe.

<p>One of ssh's <a href=http://linux.icydog.net/ssh/piping.php>many, many party tricks</a> is providing a transparent unix pipe between two machines. Presumably Tumblr has a front-end machine that accepts uploads from users, scales them down with Imagemagick, then transfers them to Amazon S3. Here's a one-liner that replicates the bug:

<pre>$ convert -resize 500 input.gif jpg:- | ssh user@server.example.com "dd of=output.gif"</pre>

<p>It's less obvious <em>why</em> this command is being executed. Additionally, while it replicates the bug, it doesn't produce the exact same file; the result is about 141 kilobytes smaller.

<p>One possible reason is that Imagemagick chokes on the original file, becoming extremely confused when you ask it to scale the overlay frames. Given this command, which should Just Work:

<pre>convert -resize 500 -layers optimize tumblr_lxr0pclKpO1rn6clco1_1280.gif 500.gif</pre>

<p>Produces this:

<p><img src=http://bbot.org/blog-images/500.gif></p>

<p>Which is both extravagantly broken and ten times larger than the original file, even though the original has larger image dimensions. So Tumblr might have added a step in their asset pipeline to normalize certain GIF animations that Imagemagick chokes on.

<p>(There might be a more graceful way to do this than converting it to an MNG, the animated relative of PNG. If there is, tell me.)

<pre>convert input.gif mng:- | convert -resize 500 -layers optimize - output.gif</pre>

<p><img src=http://bbot.org/blog-images/501.gif></p>

<p>As you can see, this actually works, though the "thumbnail" is still twice the size of the original file.

<p>Now, (putting ourselves in the shoes of the nameless sysadmin who was doing this) let's add the next step, where we actually upload the file to the remote server. Except that, whoops! We were hacking on the JPEG thumbnail code earlier, and we accidentally tell Imagemagick to send the image data as JPEG.

<pre>convert input.gif mng:- | convert -resize 500 - jpg:- | ssh user@server.example.com "dd of=output.gif"</pre>

<p>And so, we end up with this ridiculous situation where the thumbnail is nine times bigger than the original file. I guess the moral of the story is to always check to make sure that something which is supposed to make files smaller, <em>actually</em> makes files smaller.
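
<p>That check is cheap to automate. A minimal sketch of a guard that a resize step could call before publishing anything; the function names are made up, and the numbers in the demo are the file sizes quoted in this post:

```python
def size_ratio(original_bytes, output_bytes):
    """How many times larger the output is than the original."""
    return output_bytes / original_bytes


def check_shrunk(original_bytes, output_bytes):
    """Raise ValueError unless the output is strictly smaller."""
    if output_bytes >= original_bytes:
        raise ValueError(
            "output is %.1fx the size of the original"
            % size_ratio(original_bytes, output_bytes)
        )
    return output_bytes


if __name__ == "__main__":
    # Sizes from this post, in kilobytes: 147 original, 1,369 thumbnail.
    try:
        check_shrunk(147, 1369)
    except ValueError as error:
        print("rejected:", error)
```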

<p><a href=http://c1qfxugcgy0.tumblr.com/post/14692117697/fun-with-the-command-line>(Previously.)</a>]]></description>

</item>
<item>
<link>http://bbot.org/blog/archives/2012/01/08/when_pretty_secure_isnt_secure_enough/</link>
<guid isPermaLink="true">http://bbot.org/blog/archives/2012/01/08/when_pretty_secure_isnt_secure_enough/</guid>
<title>when &quot;pretty secure&quot; isn't secure enough</title>
<dc:date>2012-01-08T14:58:11-07:00</dc:date>
<dc:creator>Samuel Bierwagen</dc:creator>
<dc:subject> important, Linux</dc:subject>
<description><![CDATA[<p><a href=http://www.osnews.com/story/25469/Richard_Stallman_Was_Right_All_Along>"Richard Stallman Was Right All Along"</a>

  <blockquote>"As a member of the Walkman generation, I have made peace with the fact that I will require a hearing aid long before I die, and of course, it won't be a hearing aid, it will be a computer I put in my body," Doctorow explains, "So when I get into a car - a computer I put my body into - with my hearing aid - a computer I put inside my body - I want to know that these technologies are not designed to keep secrets from me, and to prevent me from terminating processes on them that work against my interests."</blockquote>

<p>Something I've been thinking about off and on for the last seven years or so, is what the security model for an em would look like.

<p>Background info, for non-transhumanists: "Em" is a short, pithy word coined by <a href=https://en.wikipedia.org/wiki/Robin_Hanson>Robin Hanson</a> to refer to a person <a href=https://en.wikipedia.org/wiki/Whole_brain_emulation>running on a computer.</a> The basic idea behind Whole Brain Emulation is to scan a human brain with an electron microscope, then make a model of all the scanned atoms, and run that model in a physics simulator, which will run all the chemical interactions between neurons like it was a physical brain. This model will have all the memories of the person that was scanned, but has all the advantages of software: functional immortality, easy copying, can be run millions of times faster than real time...

<p>The problem arises when you start to think about what kind of computer you're going to run this simulation on. It must be completely, <em>flawlessly,</em> secure. It absolutely <em>cannot</em> be hacked, because once you lose control of that computer, that's the ballgame. A copy of your brain-state is <em>you.</em> It's got all your memories, knows all your passwords.

<p>That's bad. It gets worse: a brain-state is software, it can't "die" in the organic sense of the word. You could torture it to death, over and over, for a thousand years; if you felt like it. <a href=http://www.infinityplus.co.uk/stories/colderwar.htm>"They populate the simulation spaces of its mind, exploring all the possible alternative endings to their life."</a>

<p>So it's pretty clear that the operating system for an em is going to have to be very special indeed. <a href=http://www.tomshardware.com/reviews/qubes-os-joanna-rutkowska-windows,3009.html>Qubes</a> isn't paranoid enough. <a href=http://openbsd.org/>OpenBSD</a> isn't paranoid enough. <a href=http://ertos.nicta.com.au/research/sel4/>seL4</a> isn't paranoid enough. You will need a degree of paranoia hitherto unseen outside of nuclear weapons safety protocols and space shuttle flight control systems. Multiple, concentric, airgapped systems. ASICs that <a href=https://en.wikipedia.org/wiki/Smart_card>refuse to export their contents.</a> Physical safety interlocks. <a href=https://en.wikipedia.org/wiki/Power_analysis>Power draw monitoring.</a> (Here being used in a somewhat unusual way: monitoring the power draw of a secure processor to verify that it <em>hasn't</em> been compromised) <a href=https://en.wikipedia.org/wiki/Formal_verification>Provably secure code.</a> Self-destruct charges!

<p>Some of the sting of "killing yourself rather than be captured by the enemy" is taken out by having a couple dozen copies as backup, however.

<p>This is a bar set amazingly, impossibly high; and it goes absolutely without saying that no general-purpose commercial OS clears it. However, many of the freedom-destroying technologies cut both ways. The Xbox 360, which has been out for six years now, uses <a href=https://en.wikipedia.org/wiki/Code_signing>code signing</a> to enforce a closed platform. Downside: no third-party software, at all. Upside: there has never been a virus on the 360. (apt-get uses a weak form of code signing, and to the best of my knowledge, has never distributed a virus either)

<p>The <a href=https://en.wikipedia.org/wiki/Trusted_Platform_Module>Trusted Platform Module</a> can be used to build a computer which you can <a href=https://www.gnu.org/philosophy/can-you-trust.html>only install Windows on,</a> but can also <a href=http://www.h-online.com/open/features/What-s-new-in-Linux-3-2-1400680.html?page=2>be used by Linux</a> to protect against certain attacks.

<p>This blog post doesn't really have a <em>point,</em> I just wanted to talk about some stuff. Sorry.

<p>I'm certainly not saying that there's some kind of tradeoff between <em>open-source</em> and <em>security.</em> That would just be utter, blithering nonsense. I guess if there's any point here that I'm flailing in the direction of, it's that there are certain dual-use technologies, which are in danger of being misused by people looking to make money at the expense of the users; also known as the Facebook strategy.]]></description>

</item>
<item>
<link>http://bbot.org/blog/archives/2011/06/10/adventures_in_html_optimization/</link>
<guid isPermaLink="true">http://bbot.org/blog/archives/2011/06/10/adventures_in_html_optimization/</guid>
<title>adventures in HTML optimization</title>
<dc:date>2011-06-10T09:41:29-07:00</dc:date>
<dc:creator>Samuel Bierwagen</dc:creator>
<dc:subject> important, Linux</dc:subject>
<description><![CDATA[<p><img src="http://bbot.org/blog-images/page-speed.png">[1]

<p>Last week, someone linked me to <a href="http://pagespeed.googlelabs.com/">Google Page Speed.</a> This sucked, since it directly resulted in me spending rather a lot more time than strictly necessary dicking around with Apache configuration files.

<p>My server doesn't get a whole lot of traffic,[2] so I hadn't bothered setting <a href="http://httpd.apache.org/docs/2.0/mod/mod_expires.html">Expires:</a> headers, under the "who cares" school of thought. When Apache CPU utilization doesn't get above .01% when you're getting 5 hits a second from HN, there's not a whole lot of incentive to aggressively cache files. But when I ran bbot.org through Page Speed, I received the humiliating news that it only scored <strong>68/100.</strong> 68! That's a low number!

<p>Resolving most of the issues was easy (turning on Expires, gzip compression, changing the black and white logo image from full 24-bit color to grayscale, etc.), but going from 98/100 to 100/100 was kinda painful.

<p>Page Speed is a vast improvement over <a href="http://developer.yahoo.com/yslow/">YSlow,</a> but it shares some of the inherent problems of an automated performance tool.

<p>For one, it doesn't seem to care much about <em>actual</em> page load times, as long as you follow its rules. It gives cracked.com, ("Auschwitz for javascript engines") <a href="http://pagespeed.googlelabs.com/#url=http_3A_2F_2Fcracked.com_2F&mobile=false">a phat 90/100,</a> even though the front page takes 8.2 seconds to load, makes 188 HTTP requests from about a billion domains, loads 36 javascript files, throws 166 warnings in Chrome's audit tab, has 466 unused CSS rules, and is, in fact, pure evil. For a period of time while I was testing things, the links div loaded its own font-face, the smallcaps version of <a href="http://www.linuxlibertine.org/index.php?id=86&L=1">Linux Libertine.</a> Now, anyone with half a brain can tell you that loading a 300 kilobyte font just to style 20 words of text, something you can just do in CSS with <a href="http://www.w3schools.com/css/pr_font_font-variant.asp">font-variant:small-caps</a> anyway, is pants-on-head retarded; but Page Speed was totally fine with it.
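
<p>A minimal sketch of the CSS-only route, for comparison. The <code>links</code> class name and the sample text are placeholders, not the actual bbot.org markup; if the font in use has no real small caps, the browser fakes them by scaling down the capitals:

<pre><code>&lt;style&gt;
/* .links is a hypothetical class name standing in for the links div */
.links { font-variant: small-caps; }
&lt;/style&gt;
&lt;div class="links"&gt;a few words of link text&lt;/div&gt;</code></pre>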

<p>For two, while it doesn't actually come out and <em>say</em> that you should "Use a CMS" like YSlow does, (A monumentally useless piece of "advice") it sure does wink a lot and nod suggestively <em>towards</em> it.

<p>The sticking point, that robbed me of two points and kept me from that tantalizing perfect score, was <a href="http://code.google.com/speed/page-speed/docs/rtt.html">"Inline small CSS".</a> I, of course, kept the stylesheet in an external file, and linked to it from every page, because it's easier to maintain that way. Except, of course, it's a <em>small</em> stylesheet, and page rendering would be faster if you just stuck it in the HTML file. This would be a pain... unless you used a CMS, which could just seamlessly inline a stylesheet when publishing a document. Funny.
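
<p>Roughly, the difference looks like this. The stylesheet path and the rules themselves are made up for illustration; the point is only that the second form saves a round trip, at the cost of pasting the same rules into every page:

<pre><code>&lt;!-- external: one extra HTTP request for a small file (path is hypothetical) --&gt;
&lt;link rel="stylesheet" href="/style.css"&gt;

&lt;!-- inline: the same rules shipped inside the page itself --&gt;
&lt;style&gt;
body { max-width: 40em; margin: auto; }
a { color: #333; }
&lt;/style&gt;</code></pre>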

<p>Inlining the CSS granted me the two points, but then Page Speed turned right around and docked me one, since the stylesheet had a lot of whitespace, and pushed the file over the tipping point where Google thought it would be worthwhile to <a href="http://en.wikipedia.org/wiki/Minification_(programming)">minify</a> the source code. Now <em>this</em> is a pain, since I hand-edit my code, and minifying makes code hideously ugly. Would have been trivially easy to do if I used a CMS, of course. Minifying was tedious and fiddly, since <a href="http://kangax.github.com/html-minifier/">the tool I used</a> liked to munge the inlined CSS, and scribble all over my link formatting. It was worth it, though! After twenty minutes of swearing, I finally trimmed off those last 120 bytes, and scored a perfect 100/100! Yeah!

<p>Now, if you'll excuse me, I have to go put everything back to the way it was.

<hr>

<p>1: Google showing off their mad UI skillz there on the "refresh results" button.

<p>2: How much traffic does it get? Last month, my host called in a panic. Apparently, my box had consumed 1000% more bandwidth than it did the month before-- it had used up 10 megabytes! My site doesn't get a lotta traffic, I tell ya, every page load takes 30 seconds, because the disks end up spinning down between hits! My traffic is so low, my Alexa site rank is measured in scientific notation! It's low, I tell ya!]]></description>

</item>
<item>
<link>http://bbot.org/blog/archives/2011/05/05/web_standards_deathmatch/</link>
<guid isPermaLink="true">http://bbot.org/blog/archives/2011/05/05/web_standards_deathmatch/</guid>
<title>web standards deathmatch</title>
<dc:date>2011-05-05T00:03:06-07:00</dc:date>
<dc:creator>Samuel Bierwagen</dc:creator>
<dc:subject> Linux</dc:subject>
<description><![CDATA[<p>Shit's fucked! God damn it, <a href="http://www.whatwg.org/">WHATWG!</a>

<p>Let me start at the beginning.

<p>A not terribly well known feature of OpenID is <a href="http://openid.net/specs/openid-authentication-1_1.html#delegating_authentication">delegated authentication.</a> In brief, it's two &lt;link&gt; elements that point to an OpenID identity provider, in my case, myopenid.com. When I poke "bbot.org" into the login form on a site that accepts OpenID, (in specificationese, an "OpenID consumer") it's supposed to follow the links to myopenid.com, (an "identity provider") which does all the heavy lifting, and in the end, I show up as "bbot.org". I can change the identity provider to whatever, (Google, AOL, LiveJournal) and still retain the same identity, which is actually pretty neat. There's also the benefit of hiding myopenid's ugly URL. (bbot.myopenid.com, ugh)
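
<p>A minimal sketch of what that delegation markup looks like. The two <code>rel</code> values are the ones OpenID 1.1 defines; the server endpoint URL is an assumption for illustration, so check what your provider actually documents:

<pre><code>&lt;!-- openid.server: where the consumer sends the authentication request (URL assumed) --&gt;
&lt;link rel="openid.server" href="http://www.myopenid.com/server"&gt;
&lt;!-- openid.delegate: the identity the provider actually vouches for --&gt;
&lt;link rel="openid.delegate" href="http://bbot.myopenid.com/"&gt;</code></pre>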

<p>So far, so easy.

<p>However, in HTML 4.01, &lt;link&gt; elements are supposed to be contained in a pair of matched &lt;head&gt;&lt;/head&gt; tags. But people are fallible, and so they tend to forget to close tags, or they misspell them, or they leave them out entirely; and so when a browser sees a &lt;link&gt; tag all by itself, it tries to do the right thing, and just parses it, rather than erroring out.

<p>This kind of thing just drives the variety of person who writes specifications for programming languages up a wall. Zere vill be <em><a href="http://www.plasticbrickautomaton.com/?id=60">order</a></em> in mein markup language! So the W3C wasted no time setting up a working group for XHTML, which specifies that tags must always be closed, everywhere, and single element tags like &lt;br&gt; have to be "self closing", viz. &lt;br /&gt;. This didn't make a whole lot of sense, but the standards wonks waved their hands a lot and repeated "XML" a few dozen times, and since this was 2000, back when XML was shit-hot and everybody loved it, this settled the argument.

<p>However, programmers (like those at wordpress.com) tended to use the special tags that declared their documents to be XHTML in their templates, because they deeply cared about web standards, and wanted to promote their use. Then they would pass them off to customers, who would write outrageously malformed code, then <a href="http://www.shamusyoung.com/twentysidedtale/?p=994">complain when it failed validation.</a>

<p>Obviously the problem with this situation was ideological impurity. Since users didn't care about validation, the W3C announced, they would <em>make</em> them care. XHTML 2.0 would require that browsers stop rendering and display an error message at the <em>very first</em> parsing error. That'll show them!

<p>Unfortunately, the W3C doesn't make browsers. They just write standards. Browsers are actually written by software vendors such as Apple, Google, Microsoft and the Mozilla Foundation. And none of these groups were terribly enamoured with a markup language that didn't really resemble HTML at all, was enormously fragile, and was used by exactly nobody on the internet.[1]

<p>So the browser vendors took their toys and went home, forming the <a href="http://www.whatwg.org/">WHATWG,</a> and started work on HTML 5, completely bypassing XHTML 2. It rapidly became apparent that nobody was actually going to implement XHTML 2, so the W3C <a href="http://www.zeldman.com/2009/07/02/xhtml-wtf/">killed it off.</a>

<p>My more boring readers will recall that I recently <a href="http://bbot.org/blog/archives/2010/11/14/further_adventures_in_server_administration/">rewrote the front page of bbot.org</a> to validate as HTML 5. One of the more amusing tricks of HTML 5 is that the &lt;html&gt;, &lt;head&gt;, and &lt;body&gt; tags aren't actually required, since the browser has to render a page correctly even if they're missing. I obligingly removed them, then chortled to myself as the page validated perfectly. (Standards wonks find their kicks in odd places.)
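
<p>To make that concrete, here's a sketch of a complete document that should get through an HTML 5 validator with none of those three tags written out. The title and text are placeholders, not the actual bbot.org front page:

<pre><code>&lt;!DOCTYPE html&gt;
&lt;meta charset="utf-8"&gt;
&lt;title&gt;placeholder title&lt;/title&gt;
&lt;p&gt;The parser infers the html, head, and body elements on its own.</code></pre>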

<p>Except! No! As the even <em>more</em> boring among you noticed immediately, the OpenID spec says the delegated authentication links have to be inside a &lt;head&gt; element! Damn you, OpenID!

<p>This is a bug that has taken me seven months to discover, mostly because <a href="http://www.google.com/search?q=site%3Anews.ycombinator.com+openid+failure">nobody uses OpenID anymore.</a> I'd complain about compromising my perfect garden of pure ideology to make delegated authentication actually work, but that would be too ironic for words.

<hr>

<p>1: There's a whole bucket of implementation issues, too. Pop quiz, hotshot. What do you do when some user innocently forgets to close a tag in a comment on a blog post? Should the invalid markup in the comment wang the entire page? XML parsing is seriously expensive, computationally. Are you going to write a parser that checks the validity of every comment by Disqus' <a href="http://blog.disqus.com/post/5192492910/the-numbers-of-disqus">35 million users,</a> then reads the user's mind to figure out how they actually wanted to mark up the text?</p>]]></description>

</item>
</channel>
</rss>
