July 15, 2008

Backwards Thinking

When I think about it, it strikes me as odd that someone who thinks E-Ink and modern languages are the best thing since transistors would enjoy a paperback on antiquated compiler techniques. Of course, it's hard to argue when I haven't been able to put the thing aside for the last week. Simply put, Knuth's "Selected Papers on Computer Languages" reads like a novel and is, in my humble opinion, better than his other books. The real difference is in flow (also, I find algorithm complexity immensely dull, but I digress). While TAoCP holds an overwhelming amount of information, the task of documenting that amount of material is simply too much; it just feels rushed. This collection, however, is Knuth in his element. Many of the papers were never formally presented; some are letters, others are just tidied-up notes. It's a very honest writing style and a real pleasure to read because of it. I find myself forgetting that some of the work I'm reading fundamentally changed the world of compilers when it was originally published. Here's to my favorite dead tree in the last several years.

July 4, 2008

Boilerplate is basically bad

or "How Huffman should help my hurting hands"

Typing breaks make me irritable. Usually, I occupy myself by finding someone to bother and striking up a conversation, but this week it seems I was alone in deciding not to vacation for the entire work week. Yesterday, this confluence came to a head and produced a brain-tangent concerning the volume of text that programming demands (I.E.- way too freaking much). Not being one to let things go even when I should, I spent the evening creating tools to shortcut some of the common things I do. In the end, I met with only moderate success, and I doubt I will actually adopt any of them during my day-to-day tasks, but I did come out with a better understanding of why I am so annoyed by the status quo.

A programs does two things: 1) it tells a computer what to do, and 2) it tells a reader what you told the computer to do. An efficient program does both as concisely as possible. The phrase "as X as possible" should tip you off to the fact that this is an optimization problem. More specifically, it has become an information theory optimization problem in the vein of Kolmogorov Complexity. Of course, it's not computable because the latter evaluation criterion requires modeling the human brain, but the point remains. Why should I write this:
#include <stdio.h>
int main( int argc, char **argv ) {

return 0;
}
When 99% of my code uses that same text? If we were to shove 100 C programs through a Huffman-esque meat grinder, that entire quote (and probably more!) would condense down to a single bit. Of course, it would be unreadable, but I believe I've made it clear that the middle-ground for optimizing our problem is much closer to a single bit than it is to that horrible monstrosity. (For the record, yes, I realize C is guilty of this far more than other languages, but I'm trying to make a point.)

I'll spare you the rest of my ravings on simplicity and coding, but suffice to say, I'll be digging out an information theory textbook and giving this a second look at some point. There's too much code out there, and something needs to be done about it all.

June 26, 2008

Coroutines, continuations, and call/cc

I'll try and set a precedent for myself by starting out with something that's not full of fluff. The other day, I was doing some reading on parallel languages when I came across a paper extolling the benefits of programming with coroutines. I didn't think the paper was particularly well-written, but it did make it quite clear that I had a hazy conception of what they were. So I resolved to correct this -- by implementing them.

For the uninitiated, coroutines are functions that have stateful control flow and, as a corollary, can have multiple entry and exit points. An example:
(define-coroutine (prefixes)
(display "pre") (yield)
(display "over") (yield)
(display "co") (yield)
(escape))

(define-coroutine (stems)
(display "fix ") (yield)
(display "whelm ") (yield)
(display "routine ") (yield)
(escape))

Assume for the moment that "yield" is essentially like "return" in C. The difference is that when a coroutine returns, it doesn't lose its place. It keeps a continuation to where it left off. This essentially gives it an exit point wherever yield is called and a possible entry point immediately after. So when we run this, we get:
(cobegin prefixes stems)
prefix overwhelm coroutine ...
The coroutines pick up right where they left off. It turns out there are some useful situations for this idiom (think producer/consumer semantics or pipes), and the whole thing struck me as a neat way to make some inroads into playing with dataflow models. So off I went.

I'll spare all of the gritty details, but suffice to say that I went through a handful of iterations, each with their various benefits and drawbacks, but in the end, the example above works flawlessly. I threw all of the code up on my homepage.

June 23, 2008

So it begins

I suppose this would be the time to spout off some high-brow tirade about my divine purpose in starting this, but in all honesty, I'm still not entirely sure what my reasons were for creating a blog. The format intrigues me, and I've been quite the fan of the ones I've read, so maybe the real drive is to officially join the community at large. In that case, I'll send a greeting to anyone who happened to land their browser in my corner of the web and offer this piece of advice:
No matter what the world has in store for you, you have two crocodiles on your side.
But the long story behind that will have to wait for another day.