Protocols

Temporarily replacing functions with mock objects can simplify testing.
Mock objects can record their calls and/or return variable results.
Python defines protocols so that code can be triggered by keywords in the language.
Use the context manager protocol to ensure cleanup operations always execute.
Use decorators to wrap functions after defining them.
Use closures to create decorators that take extra parameters.
Use the iterator protocol to make objects work with for loops.

Terms defined: append mode, context manager, decorator, infinite recursion, iterator, Iterator pattern, mock object, protocol

This book is supposed to teach software design by implementing small versions of real-world tools, but we have reached a point where we need to learn a little more about Python itself in order to proceed. Our discussion of closures in Chapter 8 was the first step; in this chapter, we will look at how Python allows users to tell it to do things at specific moments.

Mock Objects

We have already seen that functions are objects referred to by variable names just like other values. We can use this fact to change functions at runtime to make testing easier. For example, if the function we want to test uses the time of day, we can temporarily replace the real time.time function with one that returns a specific value so we know what result to expect in our test:

import time

def elapsed(since):
    return time.time() - since

def mock_time():
    return 200

def test_elapsed():
    time.time = mock_time
    assert elapsed(50) == 150

Temporary replacements like this are called mock objects because we usually use objects even if the thing we’re replacing is a function. We can do this because Python lets us create objects that can be “called” just like functions. If an object obj has a __call__ method, then obj(…) is automatically turned into obj.__call__(…) just as a == b is automatically turned into a.__eq__(b) (Chapter 5). For example, the code below defines a class Adder whose instances add a constant to their input:

class Adder:
    def __init__(self, value):
        self.value = value

    def __call__(self, arg):
        return arg + self.value

add_3 = Adder(3)
result = add_3(8)
print(f"add_3(8): {result}")

add_3(8): 11

Let’s create a reusable mock object class that:

defines a __call__ method so that instances can be called like functions;
declares the parameters of that method to be *args and **kwargs so that it can be called with any number of regular or keyword arguments;
stores those arguments so we can see how the replaced function was called; and
returns either a fixed value or a value produced by a user-defined function.

The class itself is only 11 lines long:

class Fake:
    def __init__(self, func=None, value=None):
        self.calls = []
        self.func = func
        self.value = value

    def __call__(self, *args, **kwargs):
        self.calls.append([args, kwargs])
        if self.func is not None:
            return self.func(*args, **kwargs)
        return self.value

For convenience, let’s also define a function that replaces some function we’ve already defined with an instance of our Fake class:

def fakeit(name, func=None, value=None):
    assert name in globals()
    fake = Fake(func, value)
    globals()[name] = fake
    return fake

To show how this works, we define a function that adds two numbers and write a test for it:

def adder(a, b):
    return a + b

def test_with_real_function():
    assert adder(2, 3) == 5

We then use fakeit to replace the real adder function with a mock object that always returns 99 (Figure 9.1):

def test_with_fixed_return_value():
    fakeit("adder", value=99)
    assert adder(2, 3) == 99

Another test checks that our Fake class records all of the calls:

def test_fake_records_calls():
    fake = fakeit("adder", value=99)
    assert adder(2, 3) == 99
    assert adder(3, 4) == 99
    assert adder.calls == [[(2, 3), {}], [(3, 4), {}]]

And finally, the user can provide a function to calculate a return value:

def test_fake_calculates_result():
    fakeit("adder", func=lambda left, right: 10 * left + right)
    assert adder(2, 3) == 23

Protocols

Mock objects are very useful, but the way we’re using them is going to cause strange errors. The problem is that each test replaces adder with a mock object that does something different. As a result, any test that doesn’t replace adder will use whatever mock object was last put in place rather than the original adder function.

We could tell users it’s their job to put everything back after each test, but people are forgetful. It would be better if Python did this automatically; luckily for us, it provides a protocol for exactly this purpose. A protocol is a rule that specifies how programs can tell Python to do specific things at specific moments. Giving a class a __call__ method is an example of this: when Python sees thing(…), it automatically checks if thing has that method. Defining an __init__ method for a class is another example: if a class has a method with that name, Python calls it automatically when constructing a new instance of that class.
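As a small illustration of the two examples in the paragraph above, the sketch below records which special methods Python triggers on our behalf. The class Greeter and the list log are hypothetical names used only for this demonstration.

```python
log = []  # hypothetical record of which special methods Python triggered

class Greeter:
    def __init__(self, greeting):
        log.append("__init__")      # runs when Greeter(...) constructs an instance
        self.greeting = greeting

    def __call__(self, name):
        log.append("__call__")      # runs when the instance is used like a function
        return f"{self.greeting}, {name}"

greet = Greeter("Hello")    # we never call __init__ ourselves
message = greet("world")    # nor __call__: Python does it for us
```

In neither case does our code mention the method by name: writing Greeter(…) and greet(…) is enough for Python to find and run them.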

What we want for managing mock objects is a context manager that replaces the real function with our mock at the start of a block of code and then puts the original back at the end. The protocol for this relies on two methods called __enter__ and __exit__. If the class is called C, then when Python executes a with block like this:

with C(…args…) as name:
    …do things…

it does the following (Figure 9.2):

Call C’s constructor to create an object that it associates with the code block.
Call that object’s __enter__ method and assign the result to the variable name.
Run the code inside the with block.
Call the object’s __exit__ method when the block finishes, whether it ends normally or because of an exception.

Figure 9.2: Operations performed by a context manager.
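The steps above can be traced with a class that does nothing except record when each method runs. This is a minimal sketch: Tracer and events are hypothetical names introduced for the illustration, and the second with block raises an exception on purpose to show that __exit__ still runs.

```python
events = []  # hypothetical log of what happened, in order

class Tracer:
    def __init__(self, label):
        events.append(f"__init__ {label}")
        self.label = label

    def __enter__(self):
        events.append("__enter__")
        return self                     # this is what 'as name' receives

    def __exit__(self, exc_type, exc_value, exc_traceback):
        kind = None if exc_type is None else exc_type.__name__
        events.append(f"__exit__ {kind}")
        return False                    # do not suppress the exception

with Tracer("demo") as name:
    events.append("body")

try:
    with Tracer("failing"):
        events.append("body")
        raise KeyError("oops")
except KeyError:
    events.append("caught")
```

The recorded events show the constructor, __enter__, the body, and __exit__ running in that order both times, with __exit__ receiving the exception type in the second case before the exception reaches the except clause.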

Here’s a mock object that inherits all the capabilities of Fake and adds the two methods needed by with:

class ContextFake(Fake):
    def __init__(self, name, func=None, value=None):
        super().__init__(func, value)
        self.name = name
        self.original = None

    def __enter__(self):
        assert self.name in globals()
        self.original = globals()[self.name]
        globals()[self.name] = self
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        globals()[self.name] = self.original

Notice that __enter__ doesn’t take any extra parameters: anything it needs must be provided via the object’s constructor. On the other hand, __exit__ will always be called with three values that tell it whether an exception occurred, and if so, what the exception was. This test shows that our context manager is doing what it’s supposed to:

def subber(a, b):
    return a - b

def check_no_lasting_effects():
    assert subber(2, 3) == -1
    with ContextFake("subber", value=1234) as fake:
        assert subber(2, 3) == 1234
        assert len(fake.calls) == 1
    assert subber(2, 3) == -1
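The same idea can be applied to the time.time example from the start of the chapter. The sketch below is one possible variation rather than the chapter's own code: AttrFake is a hypothetical class that replaces an attribute of an object (here, the time attribute of the time module) instead of an entry in globals(), and it always returns a fixed value.

```python
import time

class AttrFake:
    """Hypothetical helper: temporarily replace obj.<name> with a recording fake."""

    def __init__(self, obj, name, value=None):
        self.obj = obj
        self.name = name
        self.value = value
        self.calls = []
        self.original = None

    def __call__(self, *args, **kwargs):
        self.calls.append([args, kwargs])
        return self.value

    def __enter__(self):
        self.original = getattr(self.obj, self.name)
        setattr(self.obj, self.name, self)
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        setattr(self.obj, self.name, self.original)  # runs even after an exception
        return False                                 # let any exception propagate

def elapsed(since):
    return time.time() - since

real_time = time.time
with AttrFake(time, "time", value=200) as fake:
    inside = elapsed(50)
restored_after_normal_exit = time.time is real_time

try:
    with AttrFake(time, "time", value=200):
        raise ValueError("something went wrong in the test")
except ValueError:
    pass
restored_after_exception = time.time is real_time
```

Because __exit__ runs however the block ends, the real time.time is back in place after both blocks, including the one that raised an exception.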

Context managers can’t prevent people from making mistakes, but they make it easier for people to do the right thing. They are also an example of how programming languages often evolve: eventually, if enough people are doing something the same way in enough places, support for that way of doing things is added to the language.

Decorators

Python programs rely on several other protocols, each of which gives user-level code a way to interact with some aspect of the Python interpreter. One of the most widely used is called a decorator, which allows us to wrap one function with another.

In order to understand how decorators work, we must take another look at closures (Chapter 8). Suppose we want to create a function called logging that prints a message before and after each call to some other arbitrary function. We could try to do it like this:

def original(value):
    print(f"original: {value}")

def logging(value):
    print("before call")
    original(value)
    print("after call")

original = logging
original("example")

but when we try to call original, the wrapper calls itself over and over until Python gives up with a RecursionError. The wrapped version of our function refers to original, but Python looks up the function associated with that name at the time of call, which means it finds our wrapper function instead of the original function (Figure 9.3). We can prevent this infinite recursion by creating a closure to capture the original function for later use:

def original(value):
    print(f"original: {value}")

def logging(func):
    def _inner(value):
        print("before call")
        func(value)
        print("after call")
    return _inner

original = logging(original)
original("example")

before call
original: example
after call

Figure 9.3: Infinite recursion caused by careless use of a wrapped function.
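To see the failure in Figure 9.3 without crashing the program, the sketch below repeats the careless version and catches the RecursionError that Python raises when the call stack gets too deep. The variable outcome is introduced only for this illustration, and the functions return strings instead of printing so that the recursion does not flood the screen.

```python
def original(value):
    return f"original: {value}"

def logging(value):
    # 'original' is looked up when this line runs, not when 'logging' is defined,
    # so after the reassignment below it refers to 'logging' itself.
    return original(value)

original = logging

try:
    original("example")
    outcome = "finished"
except RecursionError:
    outcome = "RecursionError"

print(outcome)
```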

Using a closure also gives us a way to pass extra arguments when we create the wrapped function:

def original(value):
    print(f"original: {value}")

def logging(func, label):
    def _inner(value):
        print(f"++ {label}")
        func(value)
        print(f"-- {label}")
    return _inner

original = logging(original, "call")
original("example")

++ call
original: example
-- call

Wrapping functions like this is so useful that Python has built-in support for doing it. We define the decorator function that does the wrapping as before, but then use @wrap to apply it rather than name = wrap(name):

def wrap(func):
    def _inner(*args):
        print("before call")
        func(*args)
        print("after call")
    return _inner

@wrap
def original(message):
    print(f"original: {message}")

original("example")

before call
original: example
after call

If we want to pass arguments at the time we apply the decorator, though, it seems like we’re stuck: a Python decorator must take exactly one argument, which must be the function we want to decorate. The solution is to define a function inside a function inside yet another function to create a closure that captures the arguments:

def wrap(label):                  # function returning a decorator
    def _decorate(func):          # the decorator Python will apply
        def _inner(*args):        # the wrapped function
            print(f"++ {label}")  # 'label' is visible because
            func(*args)           # …it's captured in the closure
            print(f"-- {label}")  # …of '_decorate'
        return _inner
    return _decorate

@wrap("wrapping")                 # call 'wrap' to get a decorator
def original(message):            # decorator applied here
    print(f"original: {message}")

original("example")

++ wrapping
original: example
-- wrapping
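The wrappers shown so far discard whatever the wrapped function returns and accept only positional arguments, which is fine for functions that just print. A slightly fuller sketch, under the assumption that we also want to forward keyword arguments and return values, might look like the following. The use of functools.wraps from the standard library (which copies the wrapped function's name and docstring onto the wrapper) is an addition that the chapter itself does not use, and double is a hypothetical example function.

```python
import functools

def wrap(label):                          # function returning a decorator
    def _decorate(func):                  # the decorator Python will apply
        @functools.wraps(func)            # keep func's name and docstring
        def _inner(*args, **kwargs):      # accept positional and keyword arguments
            print(f"++ {label}")
            result = func(*args, **kwargs)
            print(f"-- {label}")
            return result                 # pass the wrapped function's result back
        return _inner
    return _decorate

@wrap("doubling")
def double(x, scale=2):                   # hypothetical function used for illustration
    return scale * x

print(double(3))
print(double(3, scale=10))
```

Without the return statement in _inner, both calls would print None even though double itself computes a value.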

Decorators didn’t need to be this complicated. In order to define a method that takes \( N \) parameters in Python, we have to write a function of \( N+1 \) parameters, the first of which represents the object for which the method is being called. Python could have done the same thing with decorators, i.e., allowed people to define a function of \( N+1 \) parameters and have @ fill in the first automatically:

def decorator(func, label):
    def _inner(arg):
        print(f"entering {label}")
        return func(arg)
    return _inner

@decorator("message")
def double(x):           # equivalent to
    return 2 * x         # double = decorator(double, "message")

But this isn’t the path Python took, and as a result, decorators are harder to learn and use than they could have been.

Iterators

As a last example of how protocols work, consider the for loop. The statement for thing in collection assigns items from collection to the variable thing one at a time. Python implements this using a two-part iterator protocol, which is a version of the Iterator design pattern:

If an object has an __iter__ method, that method is called to create an iterator object.
That iterator object must have a __next__ method, which must return a value each time it is called. When there are no more values to return, it must raise a StopIteration exception.
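The two steps above can be written out by hand using the built-in functions iter and next, which call __iter__ and __next__ respectively. The sketch below is an approximation of what a for loop does, not Python's actual implementation; manual_for and Countdown are hypothetical names, and Countdown acts as its own iterator purely to keep the example short.

```python
def manual_for(collection):
    """Collect items roughly the way a for loop would, using the protocol directly."""
    result = []
    iterator = iter(collection)        # step 1: calls collection.__iter__()
    while True:
        try:
            item = next(iterator)      # step 2: calls iterator.__next__()
        except StopIteration:          # the signal that there are no more values
            break
        result.append(item)
    return result

class Countdown:
    """Hypothetical iterable that counts down from 'start' to 1."""

    def __init__(self, start):
        self._current = start

    def __iter__(self):
        return self                    # this object is its own iterator

    def __next__(self):
        if self._current <= 0:
            raise StopIteration
        self._current -= 1
        return self._current + 1

print(manual_for(Countdown(3)))
print([value for value in Countdown(3)])
```

Both lines print the same values, since the for inside the list comprehension goes through the same two methods.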

For example, suppose we have a class that stores a list of strings and we want to return the characters from the strings in order. (We will use a class like this to store lines of text in Chapter 23.) In our first attempt, each object is its own iterator, i.e., each object keeps track of what value to return next when looping:

class NaiveIterator:
    def __init__(self, text):
        self._text = text[:]

    def __iter__(self):
        self._row, self._col = 0, -1
        return self

    def __next__(self):
        self._advance()
        if self._row == len(self._text):
            raise StopIteration
        return self._text[self._row][self._col]

If we think of the text in terms of rows and columns, the _advance method moves the column marker forward within the current row. When we reach the end of a row, we reset the column to 0 and advance the row index by one:

    def _advance(self):
        if self._row < len(self._text):
            self._col += 1
            if self._col == len(self._text[self._row]):
                self._row += 1
                self._col = 0

Our first test seems to work:

def gather(buffer):
    result = ""
    for char in buffer:
        result += char
    return result


def test_naive_buffer():
    buffer = NaiveIterator(["ab", "c"])
    assert gather(buffer) == "abc"

However, our iterator doesn’t work if the buffer contains an empty string:

import pytest

def test_naive_buffer_empty_string():
    buffer = NaiveIterator(["a", ""])
    # Indexing into the empty string raises IndexError before the comparison runs.
    with pytest.raises(IndexError):
        assert gather(buffer) == "a"

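It also fails when we use a nested loop. The test below expects "abab", but the loops produce only "ab" because the inner loop resets the shared position and then runs it to the end, leaving nothing for the outer loop: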

def test_naive_buffer_nested_loop():
    buffer = NaiveIterator(["a", "b"])
    result = ""
    for outer in buffer:
        for inner in buffer:
            result += inner
    assert result == "abab"

We can fix the first problem with more careful bookkeeping—we leave that as an exercise—but fixing the second problem requires us to re-think our design. The problem is that we only have one pair of variables (the _row and _col attributes of the buffer) to store the current location, but two loops trying to use them. What we need to do instead is create a separate object for each loop to use:

class BetterIterator:
    def __init__(self, text):
        self._text = text[:]

    def __iter__(self):
        return BetterCursor(self._text)

Each cursor keeps track of the current location for a single loop using code identical to what we’ve already seen (including the same bug with empty strings); its _advance method is unchanged from NaiveIterator and is not repeated here:

class BetterCursor:
    def __init__(self, text):
        self._text = text
        self._row = 0
        self._col = -1

    def __next__(self):
        self._advance()
        if self._row == len(self._text):
            raise StopIteration
        return self._text[self._row][self._col]

With this change in place, our test of nested loops passes.
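For reference, here is the design assembled into one runnable piece, with the _advance method copied unchanged from NaiveIterator into the cursor. The helper function nested is a hypothetical name added for this sketch, and the empty-string bug is deliberately left in place because fixing it is one of the exercises.

```python
class BetterCursor:
    def __init__(self, text):
        self._text = text
        self._row = 0
        self._col = -1

    def __next__(self):
        self._advance()
        if self._row == len(self._text):
            raise StopIteration
        return self._text[self._row][self._col]

    def _advance(self):                 # same logic as NaiveIterator._advance
        if self._row < len(self._text):
            self._col += 1
            if self._col == len(self._text[self._row]):
                self._row += 1
                self._col = 0

class BetterIterator:
    def __init__(self, text):
        self._text = text[:]

    def __iter__(self):
        return BetterCursor(self._text)  # a fresh cursor for every loop

def nested(buffer):
    """Hypothetical helper: run a nested loop and collect the inner characters."""
    result = ""
    for outer in buffer:
        for inner in buffer:
            result += inner
    return result

print(nested(BetterIterator(["a", "b"])))
```

Each for statement calls __iter__ and so gets its own cursor, which is why the inner loop no longer disturbs the outer loop's position.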

Summary

Figure 9.4 summarizes the ideas and tools introduced in this chapter.

Figure 9.4: Concept map of mocks, protocols, and iterators.

Exercises

Testing Exceptions

Create a context manager that works like pytest.raises from the pytest module, i.e., that does nothing if an expected exception is raised within its scope but fails with an assertion error if that kind of exception is not raised.

Timing Blocks

Create a context manager called Timer that reports how long it has been since a block of code started running:

# your class goes here

with Timer() as start:
    # …do some lengthy operation…
    print(start.elapsed())  # time since the start of the block

Handling Empty Strings

Modify the iterator example so that it handles empty strings correctly, i.e., so that iterating over the list ["a", ""] produces ["a"].

An Even Better Cursor

Rewrite the BetterCursor class so that it initializes self._row to 0 and self._col to \( -1 \) and always calls self._advance() as the first action in self.__next__. (You will need to make a few other changes as well.) Do you think this implementation is simpler than the one presented in this chapter?

Logging to a File

Create a decorator that takes the name of a file as an extra parameter and appends a log message to that file each time a function is called. (Hint: open the file in append mode each time it is needed.)