Thoughts on beanstalkd

Yes, I know beanstalkd is old news. But I'm not telling you it's new either.

I've had a few use cases for it, but often I find that beanstalkd in itself is way too limited to accomplish my goals.

One use case I've had is pooling up a number of jobs in a tube and then doing them all in one go, because you want to acquire resources sparingly.

The naïve way to solve that would be to simply poll the tube's stats and check how many jobs are in the queue. What I'm suggesting is something like the reserve protocol command, but for tube stats. Something like wait-tube-stats, which would return the stats in YAML when they're updated.

The command should be able to use multiple tubes, like so:

wait-tube-stats <version> [tube1 [tube2 ...]]\r\n

The version argument would be a version returned in the stats. This is best explained with pseudo-code inspired by beanstalkc.py:

stats = None
while run:
    # wait_tube_stats is the proposed call; beanstalkc has no such method.
    # This might as well have no positional arguments at all.
    stats = bean.wait_tube_stats("footube", "bartube", previous=stats)
    print("Tube %(name)s has %(current-jobs-ready)d jobs" % stats)

The stats passed as the previous keyword argument carry a version key, which is incremented each time the tube changes (and could quite possibly wrap around in 32-bit integer storage). That way beanstalkd can return immediately if the client has fallen out of sync since its last call.
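A minimal sketch of that version check, as it might look on the server side. All names here (TubeVersion, bump, is_stale) are hypothetical; nothing like this exists in beanstalkd, and the 32-bit width is just the assumption mentioned above.

```python
VERSION_MASK = 0xFFFFFFFF  # assumed 32-bit storage, so the counter wraps


class TubeVersion:
    """Hypothetical per-tube change counter."""

    def __init__(self):
        self.value = 0

    def bump(self):
        # Called whenever the tube's stats change.
        self.value = (self.value + 1) & VERSION_MASK

    def is_stale(self, client_version):
        # A client with no version yet, or with any version other than
        # the current one, is out of sync and gets an immediate reply.
        return client_version is None or client_version != self.value
```

Because the check is for equality rather than ordering, wrap-around only causes a missed update if the counter goes all the way around between two calls from the same client.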

Further, it'd probably be interesting to be able to skip tube specification entirely, so as to monitor all watched tubes (a common case, I would presume.)
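On the wire, the proposed command might be built roughly like this. This is only a sketch: wait-tube-stats is the hypothetical command from this post, build_wait_tube_stats is a made-up helper, and passing no tubes is assumed to mean "all watched tubes."

```python
def build_wait_tube_stats(version, *tubes):
    """Return the bytes for the proposed (hypothetical) command line."""
    for tube in tubes:
        # Arguments are space-separated, so a tube name cannot be
        # empty or contain whitespace.
        if not tube or any(c.isspace() for c in tube):
            raise ValueError("invalid tube name: %r" % (tube,))
    parts = ["wait-tube-stats", str(version)] + list(tubes)
    return (" ".join(parts) + "\r\n").encode("ascii")


# With explicit tubes, and with none at all (all watched tubes).
with_tubes = build_wait_tube_stats(7, "footube", "bartube")
without_tubes = build_wait_tube_stats(7)
```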

Then, finally, the solution to batching N jobs would be:

stats = None
while run:
    stats = bean.wait_tube_stats(previous=stats)
    if stats["current-jobs-ready"] >= num_jobs_batch:
        acquire_resources()
        try:
            while True:
                # With timeout=0, reserve returns None when no job is ready.
                job = bean.reserve(timeout=0)
                if job is None:
                    break
                execute_job(job)
        finally:
            release_resources()

Right now, you basically have to reserve a job, get the stats, and release the job again if there weren't enough jobs in the tube, which is a little ugly.

Of course, the client library could fake this behavior, but I feel a solution on the backend would be better.
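For what it's worth, faking it on the client side could look something like the sketch below. It assumes a connection object with a stats_tube(name) method returning a dict, which is what beanstalkc's Connection offers; poll_tube_stats and its interval, timeout and sleep parameters are names I made up for illustration.

```python
import time


def poll_tube_stats(conn, tube, previous=None, interval=0.5,
                    timeout=None, sleep=time.sleep):
    """Poll until the stats for `tube` differ from `previous`.

    Returns the new stats dict, or None if `timeout` seconds pass
    without any change.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        stats = conn.stats_tube(tube)
        if stats != previous:
            return stats
        if deadline is not None and time.monotonic() >= deadline:
            return None
        sleep(interval)
```

This is exactly the polling I'd rather avoid: every waiting client costs a round trip per interval, and a change is only noticed up to one interval late.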

Also, this is the first entry I've published solely using a hacked-up rst2html.py with Pygments.


A Saner Way to Look at Program Code

I've seen a lot of people discuss how programming should be done, and the signs of bad code.

For example, some bloke says that you're a bad programmer if you've done this and done that.

I feel people forget one key aspect of code: it's not ever in a frozen state. It, much like natural language, never reaches a state where it is "correct" or "finished."

The previously mentioned post talks about "bulldozer code," which "gives the appearance of refactoring by breaking out chunks into subroutines, but that are impossible to reuse in another context." What I'd like to call that is phase one in a generic abstraction process.

Sure, the programmer who started breaking it up could've done a better job, but the level of modularity chosen at the time was apparently enough. Knowing when to stop, saying "no," is a very, very important quality in programmers.

I think it's widely accepted that you could abstract and layer your code 'til kingdom come (cf. Java :-), but that's not what "industrial" programming is about at all. Oh well, I digress.

At any rate, what I feel is often not pointed out, or even reflected on, is the fact that code is the result of progressive evolution. You're always looking at a specific snapshot of some functionality, not plain code.

I think what makes a good programmer is awareness of this progressive nature of code - the ability to recognize situations where you can be smart, and more so the situations where you can cut corners.

This, coincidentally, is why I dislike working on other people's code. Finding the "correct" way to evolve a certain piece of code requires that the previous programmer was good enough to cut the job up appropriately, and make the code only as flexible as is realistic.

Sadly I think it's uncommon for programmers to see these things. People generally go with the "I'll do what I'm tasked with and deal with the future when it comes"-take on programming.

It's all a very delicate balance really, and especially one that isn't visible to the product manager. You only ever get a sense of somebody's programming skills once you work with them, or on something they've written.

I've worked on code which aesthetically looks crap, but has a very fine balance in these matters, and I find I value that a lot higher.

