Thoughts on beanstalkd
Yes, I know beanstalkd is old news. But I'm not telling you it's new either.
I've had a few use cases for it, but I often find that beanstalkd by itself is too limited to accomplish my goals.
One use case I've had is pooling up a number of jobs in a tube and then processing them all in one go, because you want to acquire resources sparingly.
The naïve way to solve that would be to simply poll the tube's stats and check how many jobs are in the queue. What I'm suggesting is something like the reserve protocol command, but for tube stats. Something like wait-tube-stats, which would return the stats in YAML when they're updated.
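For comparison, the polling approach could be sketched roughly as follows. Here get_stats is a stand-in for whatever returns the tube's stats as a dict (with beanstalkc I'd expect that to be something like a call to stats_tube); the function name, the polling interval and the fake stats source are made up for illustration.

```python
import time


def poll_until_ready(get_stats, num_jobs_batch, interval=1.0, sleep=time.sleep):
    """Poll a stats source until enough jobs are ready.

    get_stats is a hypothetical callable returning a dict shaped like
    beanstalkd's stats-tube output.
    """
    polls = 0
    while True:
        stats = get_stats()
        polls += 1
        if stats["current-jobs-ready"] >= num_jobs_batch:
            return stats, polls
        # Nothing to do yet: sleep and ask again, which costs a round
        # trip every interval even when the tube has not changed.
        sleep(interval)


# Fake stats source standing in for a real connection.
_ready_counts = iter([0, 2, 5])


def fake_stats():
    return {"name": "footube", "current-jobs-ready": next(_ready_counts)}


stats, polls = poll_until_ready(fake_stats, 5, sleep=lambda seconds: None)
print("Tube %(name)s has %(current-jobs-ready)d jobs" % stats)
```

The trade-off is the usual one with polling: a short interval hammers the server, a long one delays the batch.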
The command should be able to use multiple tubes, like so:
wait-tube-stats <version> [tube1 [tube2 ...]]\r\n
The version argument would be a version number returned in an earlier stats response. This is best explained with pseudo-code inspired by beanstalkc.py:
    stats = None
    while run:
        # This might as well have no positional arguments at all.
        stats = bean.wait_tube_stats("footube", "bartube", previous=stats)
        print("Tube %(name)s has %(current-jobs-ready)d jobs" % stats)
The stats passed as the previous keyword argument carry a version key, which is incremented each time the tube changes (and could quite possibly wrap around if stored as a 32-bit integer), so that beanstalkd can return immediately if the client has fallen out of sync since the last call.
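To make the versioning idea concrete, here is a toy in-memory model of how I imagine the server side behaving. None of this exists in beanstalkd: the TubeStats class, the "version" key and the 32-bit modulus are all assumptions for the sake of the sketch.

```python
import threading


class TubeStats:
    """Toy model of one tube's stats with a version counter."""

    VERSION_MODULUS = 2 ** 32  # assumed 32-bit wraparound

    def __init__(self, name):
        self.name = name
        self.version = 0
        self.jobs_ready = 0
        self._changed = threading.Condition()

    def put(self):
        """Simulate a job arriving: update the stats and bump the version."""
        with self._changed:
            self.jobs_ready += 1
            self.version = (self.version + 1) % self.VERSION_MODULUS
            self._changed.notify_all()

    def wait_stats(self, previous=None, timeout=None):
        """Block until the version differs from the one in `previous`.

        Returns at once if the caller has no previous stats or is already
        out of sync. On timeout the current (unchanged) stats are returned.
        """
        with self._changed:
            if previous is not None:
                self._changed.wait_for(
                    lambda: self.version != previous["version"], timeout
                )
            return {
                "name": self.name,
                "version": self.version,
                "current-jobs-ready": self.jobs_ready,
            }


tube = TubeStats("footube")
first = tube.wait_stats()                 # no previous stats: returns at once
tube.put()                                # the tube changes while we are away
second = tube.wait_stats(previous=first)  # out of sync: returns at once
print("Tube %(name)s has %(current-jobs-ready)d jobs" % second)
```

Since the check is for inequality rather than "greater than", a wrapped counter is harmless unless the client misses exactly 2**32 changes between two calls.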
Further, it'd probably be interesting to be able to skip the tube specification entirely, so as to monitor all watched tubes (a common case, I would presume).
Then, finally, the solution to batching N jobs would be:
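    stats = None
    while run:
        stats = bean.wait_tube_stats(previous=stats)
        if stats["current-jobs-ready"] >= num_jobs_batch:
            acquire_resources()
            try:
                while True:
                    job = bean.reserve(timeout=0)
                    # Assuming reserve returns None on timeout, as
                    # beanstalkc does: the tube is drained, so stop.
                    if job is None:
                        break
                    execute_job(job)
            finally:
                release_resources()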
Right now, you basically have to reserve a job, get the stats, and release the job again if there weren't enough jobs in the tube, which is a little ugly.
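Spelled out, that workaround looks something like the sketch below. It assumes a connection object with beanstalkc-style reserve and stats_tube methods and a job with a release method; the function name and the Fake classes are invented here just so the snippet runs without a server.

```python
def reserve_if_batch_ready(bean, tube, num_jobs_batch):
    """Return a reserved job if the tube holds a full batch, else None."""
    job = bean.reserve()  # blocks until at least one job is ready
    stats = bean.stats_tube(tube)
    # The job we hold is counted as reserved, not ready, hence the + 1.
    if stats["current-jobs-ready"] + 1 >= num_jobs_batch:
        return job
    job.release()  # not enough yet: put it back
    return None


class FakeJob:
    """Stand-in for a reserved job."""

    def __init__(self, conn):
        self.conn = conn

    def release(self):
        self.conn.ready += 1


class FakeConnection:
    """Stand-in for a real connection, only for trying the function out."""

    def __init__(self, ready):
        self.ready = ready

    def reserve(self):
        self.ready -= 1
        return FakeJob(self)

    def stats_tube(self, tube):
        return {"name": tube, "current-jobs-ready": self.ready}


small = FakeConnection(ready=2)
print(reserve_if_batch_ready(small, "footube", 5))
```

Part of the ugliness is that after releasing, the next reserve hands the same job straight back, so the caller has to sleep or release with a delay to avoid spinning.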
Of course, the client library could fake this behavior, but I feel a solution on the backend would be better.
Also, this is the first entry I've published solely using a hacked-up rst2html.py with Pygments.
