Minor Python Annoyances

2024-03-09

There have been a few instances lately in which I've found myself lamenting the number of libraries available to aid in testing Python. I'm getting the feeling again that not only is your test framework making things worse but also the ease of installing libraries exacerbates issues of quality in the design of projects.

`fakeredis`

fully compatible implementation of redis in python

This is a library that cropped up in an otherwise very simple use case. In need of a cache Redis was imported to support just two uses: getting values by key and setting values by key. My issue comes in then with the kind of interface explosion that happens when something like self.redis_client is added to an existing class. Immediately a whole world of possibilities (and bugs!) is open because the interface is literally all of Redis now.

If instead the interface is narrowed to something more like this:

import typing

class CacheLike(typing.Protocol):
    async def get(self, key): ...

    async def set(self, key, value): ...

This leverages Python's typing protocol as a means of doing structural subtyping. I'll admit there are limited benefits except to appease the type checker and act as documentation; duck typing works without it. It would be equally possible to instead use an ABC and move errors to creation time instead of runtime. This would serve more like nominal typing in how you have to subclass them to reap any benefit.

The real value though comes from how I would use it. Instead of the calling code looking like this:

def some_function(redis_conn: Redis):
    key = "some string"
    some_value = redis_conn.get(key)
    some_object.do_thing(some_value)

I might instead write a class to abstract it a bit:

class Cache:
    def __init__(self, handler: CacheLike):
        self._handler = handler

    def get(self, key):
        return self._handler.get(key)

    def set(self, key, value):
        self._handler.set(key, value)

def some_function(cache: Cache):
    key = "some string"
    some_value = cache.get(cached_key)
    some_object.do_thing(some_value)

This does introduce a layer of indirection so that Cache must be constructed with a Redis client connection but it means the call sites are radically simplified to just these two supported methods. Similarly simplified is testing, instead of using a third-party library an in-memory CacheLike implementation can be defined like this:

class FakeCache:
    def __init__(self):
        self._cache = {}

    async def get(self, key):
        self._cache.get(key)

    async def set(self, key, value):
        self._cache[key] = value

It has been a while since I've skimmed the Gang of Four but I think this is analogous to the bridge pattern. It works in this case because the use of Redis is wildly simple. Obviously more involved uses of Redis would require a more nuanced interface but I do think this approach is valuable for forcing you to decide and name what exactly is being done. Rather than littering business logic with convoluted calls like:

pipe = redis_conn.pipeline()
pipe.set("a", "a value").set("b", "b value").get("a").execute()

Which would only be testable with an entire implementation of Redis, giving such an operation a name and stuffing it into a method that can (and should!) be tested in isolation provides a necessary counterweight to how easy it is to really screw things up. Before pulling in over six thousands lines of dependencies just for testing maybe we should spend a little while longer thinking.

What I've found especially nice about the approach is how it changes my own thinking when using all sorts of services. If (some uses of) Redis is really just like a hash-table, how simplified might other things be? Maybe an entire message broker could be boiled down to:

class PubSubLike(typing.Protocol):
    async def publish(self, subject, message): ...

    async def subscribe(self, subject, queue, callback): ...

class FakePubSub:
    def __init__(self):
        self.subscriptions = collections.defaultdict(list)
        self.published = collections.defaultdict(list)

    async def publish(self, subject, message):
        self.published[subject].append(message)

    async def subscribe(self, subject, queue, callback):
        self.subscriptions[(subject, queue)].append(callback)

I'm not suggesting the above are completely representative for all testing scenarios. Instead I think they are sufficient to address well-designed test cases. Rather than making a complete re-implementation of the service you might make 3 or 5 that produce the exact behavior required to exercise the code under test. Not unlike some ideas I described in a rant about integration testing how about a PubSubLike that needs to produce timeouts connecting to the messaging server?

class TimeoutPubSub:
    async def publish(self, subject, message):
        raise TimeoutError

    async def subscribe(self, subject, queue, callback):
        raise TimeoutError

Using these sort of lightweight fake implementations likely requires some design work within application code to support it in testing. Python of course lets you do all sorts of gross patching but just because it is possible doesn't mean it is a good idea. Relying on these patterns leads to favoring composition and injecting dependencies. This leads to my second annoyance:

`freezegun`

a library that allows your Python tests to travel through time by mocking the datetime module.

This one is perhaps less egregious but I find it irksome for how prevalent it is. The case where I saw it crop up most recently was in testing code like this:

def some_method(self):
    some_value = self.another_method()
    return SomeObject(current_time=datetime.datetime.now().isoformat(), some_value)

Testing this might seem to require the third party library full of exotic looking code (at a glance I saw metaclasses, stack frame inspection, class decorators). Such that a test looks like this:

class TestExample(unittest.TestCase):
    @freeze_time('2024-03-09T21:23:58.356803+00:00')
    def test_some_method(self):
        instance = SomeClass()
        result = instance.some_method()
        self.assertEqual(result, SomeObject(current_time='2024-03-09T21:23:58.356803+00:00', some_value))

But what are the alternatives? Sure we could try monkey-patching things ourselves or omitting aspects of "time" from our assertions. What if instead we did the completely boring thing and leveraged dependency injection and wrote another minimal fake object? This requires changing the application logic to accept as input an object but in my case the net effect is this:

def some_method(self):
    some_value = self.another_method()
    return SomeObject(current_time=self.clock.now().isoformat(), some_value)

Where self.clock could be this:

class Clock:
    def now(self):
        return datetime.datetime.now()

But in testing suddenly we are free to create more useful objects without anything particularly exotic:

class TestClock:
    def __init__(self, fixed_time: datetime.datetime):
        self.fixed_time = fixed_time

    def now(self):
        return self.fixed_time

The test code might instead look like this:

class TestExample(unittest.TestCase):
    def test_some_method(self):
        fixed_time = datetime.datetime.fromisoformat('2024-03-09T21:23:58.356803+00:00')
        instance = SomeClass(clock=TestClock(fixed_time))
        result = instance.some_method()
        self.assertEqual(result, SomeObject(current_time=fixed_time.isoformat(), some_value))

There are cases where existing code or external libraries are not written in a way that supports this style. Rather than being a good argument for pulling in more dependencies I think it is an argument for wrapping such code in a more testable harness. I haven't got a snappy name for why this sort of thing seems so prevalent, it feels a bit like primitive obsession from Refactoring in slightly different clothes. It is easier to start to use all the inner machinations of a full suite of objects rather than defining a few specific ones tailored to your own needs. It is probably the right thing to do when building one-offs or prototyping! My problems set in when all those leaky abstractions pile up in five or ten years (or sooner!).