What's so bad about explicit introspection?

Something that I've noticed programmers do, myself, colleagues and open-source coders included, is to structure programs so they are "testable".

This, I posit, is silly.

Phrased another way, what we do is we structure our programs to be implicitly introspectable. For example, I found myself writing something like this not long ago:

def search(request):
    results = search_get_data(request.args["q"])
    return render(results)

def search_get_data(query):
    return ask_yellow_pages(query)

This is an obvious case of what I'm talking about: implicit introspectability. What I did was to separate the actual logic and the presentation of the result.

While that might sound like a good idea in theory, it in practice results in silly and hard-to-follow abstractions (but not always, of course.)

What I suggest is explicit introspectability: rather than structuring for testing, structure for clarity and allow testing through other explicit means. It could perhaps look something like this:

from foo.utils import local

def search(request):
     results = ask_yellow_pages(request.args["q"])
     local.emit("results", results)  # Where emit is a no-op outside tests.
     return render(results)

I suspect many people will shout "madness!" at this point, because of the "clutter". If so, then consider how the above view probably looks in actual production code:

def search(request):
     query = parse_query(request.args["q"])
     results = ask_yellow_pages(query)
     logger.debug("searched %r: %d results", query, len(results))
     return render(results)

Why is it permissible to "clutter" code with debug log calls and the likes, but not testing calls? Why do we dance around this problem with mock objects and overly convoluted abstraction layers?

I'm not saying this is a technique to be applied in every situation, but I am indeed saying that there are times and places where a simple system like this could really help making good, rigorous tests possible.

The best counter-argument I see is the so-called observer effect. Simply put, if the emit function in our case would malfunction or otherwise alter flow of logics, tests and actual code would have a disconnect.

There's a point to be had there, but is it really a practical problem? And haven't we already breached this barrier with mock objects, fixture data and whatnot?

I should note I'm a novice when it comes to testing practices - I've never bothered to try actual TDD or BDD or BDSM or what-have-you-driven-development in a real project.

So, to conclude, I'd love to hear other people's thoughts on this. Is this just a senseless taboo among programmers? Do people already do it?


Comments
Posted by: Jean-Paul Calderone §

Perhaps you should expand this example, including what actual unit tests for each version might look like, and covering some more complicated (or "real") examples.

Just saying "call this one function sometimes" doesn't mean much. What are the implications?

2010-05-05 @ 18:06:00
URL: http://jcalderone.livejournal.com/
Posted by: ulrik §

I think this sounds really great. I would love to write like this.. I think. I wonder if python-specific introspection is needed to capture emits from nested or maybe even reentrant tests.

2010-05-05 @ 22:06:17
URL: http://kaizer.se
Posted by: Eric §

Intriguing idea. I can see it becoming another tool in my testing toolbox, but it could take some work. I particularly like the way you're passing a label into emit, as if it could stuff a namespace to be checked by the test. Generator objects would have a major problem with emit, but most other things would probably work just fine.

Mock objects would still be necessary, of course, as would some of the refactoring in your example. For example, inlining both render() and ask_yellow_pages would reduce the ease of passing mock data to your render() algorithm.

2010-05-05 @ 22:29:57
Posted by: shi §

Some thoughts...

I think that "structuring code so that it becomes testable" is really a different way of saying "structuring code so that it becomes reusable".
(After all: tests are really reusing production code in a different context.) In that sense I don't see how it could be a bad thing.

A good way of working is to design your public API first ("suppose I discovered this library on the internet, how would I expect to be able to use it?"), and fill in the implementation later. According to some people you really need to test only your public API; the private API will be tested implicitly as it is called by the implementation of the public API => no need for excessive splitting up of functionality.

Instead of inserting local.emit calls for those cases where restructuring for testability/reuse leads to awkward abstractions (not sure if that is ever the case if you restrict yourself to testing your public API), an alternative could be to perform tests by examining log string contents. That way the same log.debug call is useful both for production and in testing. One can argue that examining log files makes it not a unit test anymore, but something like a functional test instead. By examining log files to test functionality you make the logging part of the specification (i.e. you cannot change the format of the logging without breaking the tests).

(disclaimer: I'm a testing n00b too ;) )

2010-05-06 @ 02:05:59
Posted by: Pavel §

If you mock logger.debug calls, it will give you similar effect to the proposed "emit" technique. Or even better, don't mock but instead configure logger to use custom test-oriented log handler.

But hey, Ludvig, excellent post! I will use this technique too now.

2010-05-06 @ 10:27:10
URL: http://github.com/paxan/
Posted by: Robert Collins §

Well, your first code, that you use as a baseline isn't really very testable either. So you're arguing that structured-for-testing code is messy and unclear when you don't have structured-for-testing code, IMO.

This code -
def search(request):
results = search_get_data(request.args["q"])
return render(results)

def search_get_data(query):
return ask_yellow_pages(query)
-
The search method has two global variables - search_get_data and render; you cannot test search properly without substituting or observing that those two globals are called properly; that implies monkey patching or something similar.

I'd probably rearrange your first version to be:
-
def search(request, provider=search_get_data, render=render):
return render(provider(request.args["q"]))

def search_get_data(query):
return ask_yellow_pages(query)
def render(results):
...
-

By doing that you can:
- test what its doing without monkey patching, which lets you do more complicated tests without contortions
- reuse the code with different search providers or renderers

And its easier to see what that one function does, I think, because there is now an expectation of decoupling, so you're writing to an implicit interface rather than to global state.

2010-05-08 @ 01:06:28
URL: http://rbtcollins.wordpress.com
Posted by: Dag §

I could imagine local.emit('descriptive label for point in code', locals()) being useful. I imagine local is some kind of singleton/context local stack, maybe it doesn't really even need any implementation. Just put local={} in utils and maybe clear it before every test?

2010-05-29 @ 12:06:59

Comment the entry:

Name: (required, possibly pseudonym)
Remember me (cookie)

E-mail: (not required, never published, solely for me to reply to you in person)

URL:

Comment:

RSS 2.0