Functions and Closures

  • When we define a function, our programming system saves instructions for later use.
  • Since functions are just data, we can separate creation from naming.
  • Most programming languages use eager evaluation, in which arguments are evaluated before a function is called.
  • Programming languages can also use lazy evaluation, in which expressions are passed to functions for just-in-time evaluation.
  • Every call to a function creates a new stack frame on the call stack.
  • When a function looks up variables it checks its own stack frame and the global frame.
  • A closure stores the variables referenced in a particular scope.

Terms defined: anonymous function, call stack, closure, dynamic scoping, eager evaluation, extensibility, lambda expression, lazy evaluation, lexical scoping, name collision, stack frame, variable capture

One way to evaluate the design of a piece of software is to ask how extensible it is, i.e., how easily we can add or change things [Wilson2022b]. The answer for the interpreter of Chapter 7 is “pretty easily” but the answer for the little language it interprets is “not at all”, because users cannot define new operations in the little language itself. We need to give them a way to define and call functions. Doing this will take less than 60 lines of code, and once we understand how definition works, we will be able to understand how an advanced feature of most modern programming languages works as well.

Definition and Storage

Let’s start by defining a function that takes a single parameter and immediately returns it. In Python, this is:

def same(num):
    return num

It has a name, a (possibly empty) list of parameters, and a body, which in this case is a single statement.

Our little language does things differently. Since a function is just another kind of object, we can define it on its own without naming it:

["func", ["num"], ["get", "num"]]

To save the function for later use, we simply assign it to a name as we would assign any other value:

["set", "same", ["func", ["num"], ["get", "num"]]]

Anonymity

A function without a name is called an anonymous function. JavaScript makes heavy use of anonymous functions; Python supports a very limited version of them using lambda expressions:

double = lambda x: 2 * x
double(3)

Calling Functions

In Python, we would call this function as same(3). Our little language requires us to specify an operator explicitly, so we write the call as:

["call", "same", 3]

To make "call" work the way most programmers expect, we need to implement scope so that the parameters and variables used in a function aren’t confused with those defined outside it. In other words, we need to prevent name collision. When a function is called with one or more expressions as arguments, we will:

  1. Evaluate all of these expressions.

  2. Look up the function.

  3. Create a new environment from the function’s parameter names and the expressions’ values.

  4. Call do to run the function’s action and capture the result.

  5. Discard the environment created in Step 3.

  6. Return the function’s result.

Eager and Lazy

Evaluating a function’s arguments before we run it is called eager evaluation. We could instead use lazy evaluation, in which case we would pass the argument sub-lists into the function and let it evaluate them when it needed their values. Python and most other languages are eager, but a handful of languages, such as R, are lazy. It’s a bit more work, but it allows the function to inspect the expressions it has been called with and to decide how to handle them.

To make this work, the environment must be a list of dictionaries instead of a single dictionary. This list is the call stack of our program, and each dictionary in it is usually called a stack frame. When a function wants the value associated with a name, we look through the list from the most recent dictionary to the oldest.

Scoping Rules

Searching through all active stack frames for a variable is called dynamic scoping. In contrast, most programming languages used lexical scoping, which figures out what a variable name refers to based on the structure of the program text. The former is easier to implement (which is why we’ve chosen it); the latter is easier to understand, particularly in large programs. [Nystrom2021] has an excellent step-by-step explanation of how to build lexical scoping.

The completed implementation of function definition is:

def do_func(env, args):
    assert len(args) == 2
    params = args[0]
    body = args[1]
    return ["func", params, body]

and the completed implementation of function call is:

def do_call(env, args):
    # Set up the call.
    assert len(args) >= 1
    name = args[0]
    values = [do(env, a) for a in args[1:]]

    # Find the function.
    func = env_get(env, name)
    assert isinstance(func, list) and (func[0] == "func")
    params, body = func[1], func[2]
    assert len(values) == len(params)

    # Run in new environment.
    env.append(dict(zip(params, values)))
    result = do(env, body)
    env.pop()

    # Report.
    return result

and our test program and its output are:

["seq",
  ["set", "double",
    ["func", ["num"],
      ["add", ["get", "num"], ["get", "num"]]
    ]
  ],
  ["set", "a", 1],
  ["repeat", 4, ["seq",
    ["set", "a", ["call", "double", ["get", "a"]]],
    ["print", ["get", "a"]]
  ]]
]
2
4
8
16
=> None

Unpacking One Line

do_call contains the line:

env.append(dict(zip(params, values)))

Working from the inside out, it uses the built-in function zip to create a list of pairs of corresponding items from params and values, then passes that list of pairs to dict to create a dictionary, which it then appends to the list env. The exercises will explore whether rewriting this would make it easier to read.

Once again, Python and other languages do more or less what we’ve done here. When we define a function, the interpreter saves the instructions in a lookup table. When we call a function at runtime, the interpreter finds the function in the table, creates a new stack frame, executes the instructions in the function, and pops the frame off the stack.

Closures

We normally define functions at the top level of our program, but Python and most other modern languages allow us to define functions within functions. Those inner functions have access to the variables defined in the enclosing function, just as the functions we’ve seen in earlier examples have access to things defined at the global level of the program:

def outer(value):
    def inner(current):
        print(f"inner sum is {current + value}")

    print(f"outer value is {value}")
    for i in range(3):
        inner(i)

outer(10)
outer value is 10
inner sum is 10
inner sum is 11
inner sum is 12

But since functions are just another kind of data, the outer function can return the inner function it defined as its result:

def make_hidden(thing):
    def _inner():
        return thing
    return _inner

has_secret = make_hidden(1 + 2)
print("hidden thing is", has_secret())
hidden thing is 3

The inner function still has access to the value of thing, but nothing else in the program does. A computer scientist would say that the inner function captures the variables in the enclosing function to create a closure (Figure 8.1). Doing this is a way to make data private: once make_hidden returns _inner and we assign it to has_secret in the example above, nothing else in our program has any way to access the value that was passed to make_hidden as thing.

Closures
Figure 8.1: Closures.

One common use of closures is to turn a function that needs many arguments into one that needs fewer, i.e., to create a function now that remembers some values it’s supposed to use later; we will explore this in Chapter 9. Closures are also another way to implement objects. Instead of building a dictionary ourselves as we did in Chapter 2, we use the one that Python creates behind the scenes to implement a closure. In the code below, for example, the function make_object creates a dictionary containing two functions:

def make_object(initial_value):
    private = {"value": initial_value}

    def getter():
        return private["value"]

    def setter(new_value):
        private["value"] = new_value

    return {"get": getter, "set": setter}

object = make_object(00)
print("initial value", object["get"]())
object["set"](99)
print("object now contains", object["get"]())
initial value 0
object now contains 99

When this code runs, Python creates a closure that is shared by the two functions (Figure 8.2). The closure has a key "private"; there is nothing special about this name, but nothing in the program can see the data in the closure except the two functions. We could add more keys to this dictionary to create more complex objects and build an entire system of objects and classes this way.

Objects as closures
Figure 8.2: Implementing objects using closures.

Summary

Figure 8.3 summarizes the ideas in this chapter, which is one of the most technically challenging in this book. In particular, don’t be surprised if it takes several passes to understand closures: they are as subtle as they are useful.

Concept map of functions and closures
Figure 8.3: Concept map.

Exercises

Rewriting Environment Creation

Re-read the description of how this line in do_call works:

env.append(dict(zip(params, values)))

and then rewrite the line using a loop to insert parameter names and values into a dictionary. Do you find your rewritten code easier to read?

Chained Maps

Look at the documentation for the ChainMap class and modify the interpreter to use that to manage environments.

Defining Named Functions

Modify do_func so that if it is given three arguments instead of two, it uses the first one as the function’s name without requiring a separate "set" instruction.

Evaluating Parameters

do_func stores the new function’s parameters and body without evaluating them. What would happen if it did evaluate them immediately?

Implicit Sequence

  1. Modify do_func so that if it is given more than one argument, it uses all but the first as the body of the function (i.e., treats everything after the parameter list as an implicit "seq").

  2. Is there a way to make this work in combination with naming-at-creation from the previous exercise?

Preventing Redefinition

  1. Modify the interpreter so that programs cannot redefine functions, i.e., so that once a function has been assigned to a variable, that variable’s value cannot be changed.

  2. Why might this be a good idea? What does it make more difficult?

Generalizing Closure-Based Objects

Modify the getter/setter example so that:

  1. make_object accepts any number of named parameters and copies them into the private dictionary.

  2. getter takes a name as an argument and returns the corresponding value from the dictionary.

  3. setter takes a name and a new value as arguments and updates the dictionary.

What does your implementation of getter do if the name isn’t already in the private dictionary? What does your setter do if the name isn’t already there? What does it do if the update value has a different type than the current value?

What Can Change?

Explain why this program doesn’t work:

def make_counter():
    value = 0
    def _inner():
        value += 1
        return value
    return _inner

c = make_counter()
for i in range(3):
    print(c())

Explain why this one does:

def make_counter():
    value = [0]
    def _inner():
        value[0] += 1
        return value[0]
    return _inner

c = make_counter()
for i in range(3):
    print(c())

How Private Are Closures?

If the data in a closure is private, explain why lines 1 and 2 are the same in the output of this program but lines 3 and 4 are different.

def wrap(extra):
    def _inner(f):
        return [f(x) for x in extra]
    return _inner

odds = [1, 3, 5]
first = wrap(odds)
print("1.", first(lambda x: 2 * x))

odds = [7, 9, 11]
print("2.", first(lambda x: 2 * x))

evens = [2, 4, 6]
second = wrap(evens)
print("3.", second(lambda x: 2 * x))

evens.append(8)
print("4.", second(lambda x: 2 * x))
1. [2, 6, 10]
2. [2, 6, 10]
3. [4, 8, 12]
4. [4, 8, 12, 16]