CS 11 Python Track: Assignment 5: Decorators


Goals

In this assignment you will learn the basics of decorators, a Python feature that has a number of uses.


Covered this week


Template files to download


Background

Decorators are a way to modify a function at the moment the function is defined. Simply put, a decorator is a function that you apply to a second function when the second function is defined. In the following example, foo is a decorator for bar.

    @foo   # <-- decorator syntax
    def bar(x):
        return x * 2

Decorators are higher-order functions. A higher-order function is one that takes a function (or functions) as its arguments and/or returns a function as its return value. Since functions are objects in Python, this poses no problem for regular functions. For instance, if foo were a function that took one argument (also a function) and returned a function as its return value, we could apply it to the bar function:

    def bar(x):
        return x * 2

    def foo(func):
        # ... transform function "func" to "func2" ...
        return func2  # <-- return a function!

Then we could reassign the result of foo(bar) to bar as follows:

    bar = foo(bar)

Now the function bar has been modified by the decorator foo. The new definition of bar overwrites the old definition, so that now when we call bar, we get the decorated version.

There are lots of useful ways we can modify functions. For instance, we could modify a function of one argument so that it prints out its argument value before returning:

    def foo(func):
        def func2(x):
            print('ARGUMENT:', x)
            return func(x)
        return func2

    def bar(x):
        return x * 2

    bar = foo(bar)

Now look what happens when we call bar:

    >>> bar(10)
    ARGUMENT: 10
    20

However, the syntax:

    bar = foo(bar)

is quite ugly, so Python provides a nice alternative:

    def foo(func):
        def func2(x):
            print('ARGUMENT:', x)
            return func(x)
        return func2

    @foo   # <-- decorator syntax
    def bar(x):
        return x * 2

This code:

    @foo
    def bar(x):
        return x * 2

is identical to this:

   def bar(x):
       return x * 2
   bar = foo(bar)

but is more pleasant to read. Well-written decorators can make the process of modifying functions much cleaner.

There are many uses of decorators in Python. A number of useful decorators are predefined in the language, and Python programmers often create new ones. We'll be using both built-in and newly-created decorators in this assignment. Decorators can also act on methods of classes because methods are essentially a special kind of function. For instance, there are decorators to specify that a method of a class is static (independent of instances); this is the @staticmethod decorator. There is also a somewhat similar decorator for class methods called (you guessed it!) @classmethod. If you're curious, look these up in the Python documentation.
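
For instance, a minimal sketch of how these two might be used (the Counter class and its methods are invented purely for illustration) could look like this:

    class Counter(object):
        count = 0   # class-level attribute shared by all instances

        @classmethod
        def increment(cls):
            # a class method receives the class itself as its first argument
            cls.count += 1

        @staticmethod
        def describe():
            # a static method receives neither an instance nor the class
            return 'Counters count things.'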


Instructions

lab5a: Built-in decorators

Suppose we have the following class to track students in a high school for the purposes of class rank. We want to check the ordering of students, and we can assume that the names are all distinct. (In real projects, this object would be more complex, with additional methods and attributes, e.g. an id_number.)

    class Student(object):
        def __init__(self, name, gpa):
            self.name = name
            self.gpa = gpa

        def __eq__(self, other):
            ''' Compares the student GPAs '''
            return self.gpa == other.gpa

        def __lt__(self, other):
            ''' Compares the student GPAs '''
            return self.gpa < other.gpa

Further, suppose we had:

    stu1 = Student('Bob Smith', 2.5)
    stu2 = Student('Mary Jane', 3.9)

[5] It would be easy to check for equality (stu1 == stu2) or whether stu1 < stu2 because of the __eq__ and __lt__ special method definitions. We can even reverse the comparison (stu1 > stu2). But what if we wanted to write stu1 <= stu2? What would we have to implement? Write your answer in a comment.

[10] Instead of writing all this boilerplate code, we can automatically implement every comparison function using a functools decorator. Use the functools documentation to find it. Write a new copy of Student with this decorator applied.

lab5b: Timing

[15] Now you are going to write your very own decorator. Write a decorator called start_stop that prints "start" before the function executes and "stop" after the function executes. Remember that start_stop, like all decorators, is a higher-order function that takes in a function and outputs a function. Hint: You might want to define a function inside start_stop so that you don't have to deal with lambdas. Note that you can test start_stop on the provided sample code.

    @start_stop
    def foo():
        return 'foo'

    @start_stop
    def bar(x, y):
        print(x, y)
        return x + y

    # prints "start", then "foo", then "stop":
    foo()
    # prints "start", then "5 6", then "stop":
    bar(5, 6)
    # also prints "11" (the return value) if invoked interactively

Note that start_stop takes one argument: the function being defined. This is similar to self when dealing with methods: the first argument is passed implicitly.

One important thing about start_stop is that it should work on any function, no matter how many arguments the function takes, and even if the function takes keyword arguments. Python has special syntax for functions that accept arbitrary numbers of positional and keyword arguments; you will probably want to review it. Python also has special syntax for applying a function to an argument list and a keyword dictionary. (Python 2 used the apply() function, but it doesn't exist in Python 3; look up what has taken its place in the Python documentation.)
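
As a refresher on that syntax (this is just a generic sketch, not the assignment solution; passthrough and wrapper are placeholder names), a wrapper that forwards arbitrary positional and keyword arguments might look roughly like this:

    def passthrough(func):
        def wrapper(*args, **kwargs):
            # *args collects any positional arguments into a tuple;
            # **kwargs collects any keyword arguments into a dict.
            return func(*args, **kwargs)   # forward them all unchanged
        return wrapper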

[10] Now, you will write a decorator called timer_wrapper that outputs the amount of time a function takes to run. Use the timeit library to get easy access to the default_timer function. The code you wrote for start_stop should be a good place to get started. You can test your code using these functions:

    import time   # needed for time.sleep below

    @timer_wrapper
    def foo():
        return 'foo'

    @timer_wrapper
    def bar(x, y):
        print(x, y)
        return x + y

    @timer_wrapper
    @start_stop
    def foobar(lst):
        time.sleep(2)
        return lst[0]

Note that decorators can be chained, as is done in the last example. It is equivalent to this code:

    foobar = timer_wrapper(start_stop(foobar))

[10] Decorators can also take arguments. However, this conflicts with the requirement that a decorator take a single argument: the function being decorated. For example, let's say you wanted to write a variant of the start_stop decorator that could (optionally) take some keyword arguments, one of which would be printed instead of 'start' and the other instead of 'stop'. You could write it so that it could be invoked as follows:

    baz = start_stop2(baz, start='Begin', stop='End')

and this will modify baz in the right way, but it isn't very pleasant to use because we can't use the @start_stop2 decorator syntax. (The decorator syntax only works if the function takes a single argument, which is the function to be decorated.) So what do we do?

Well, if you've taken CS 4 you probably already know. There is a spiffy technique from functional programming called currying which will give us what we need. The idea is to break the function up into pieces, so that the start_stop2 function takes only the extra arguments (start='Begin' and stop='End' above) and returns a function that takes the remaining argument (the function to be decorated). With this, we can use start_stop2 along with the arguments as a decorator, as in these examples:

    @start_stop2()  # use defaults for start and stop
    def foo():
        return 'foo'

    @start_stop2(start='Begin')  # use 'Begin' for start, default for stop
    def bar(x, y):
        print(x, y)
        return x + y

Note that the parentheses after start_stop2 are now required, even when no arguments are passed, because start_stop2 must be called to produce the actual decorator.
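
To make the layered structure concrete, here is a sketch of an unrelated parameterized decorator (repeat_call and its times argument are made up for illustration); start_stop2 would have the same overall shape:

    def repeat_call(times=2):
        # outer layer: takes only the extra (keyword) arguments
        def decorator(func):
            # middle layer: takes the function to be decorated
            def wrapper(*args, **kwargs):
                # inner layer: the replacement function
                result = None
                for _ in range(times):
                    result = func(*args, **kwargs)
                return result
            return wrapper
        return decorator

    @repeat_call(times=3)   # repeat_call(...) runs first, then decorates greet
    def greet(name):
        print('Hello,', name)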

lab5c: Memoization

Memoization is the process by which we have a function store some of its inputs with their associated outputs, so that if the function receives the same input again, it can return the output directly instead of having to recompute it from scratch. This can produce dramatic speedups, especially when the same inputs recur many times. We will experiment with the Fibonacci sequence, which will clearly show how useful memoization is.
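
As a concept illustration only (sum_of_squares and its cache are invented for this example; this is not the decorator you will write below), manual memoization of a single function can be as simple as a dictionary lookup:

    _cache = {}   # maps inputs we have already seen to their computed outputs

    def sum_of_squares(n):
        if n not in _cache:
            # pretend this loop is expensive; compute it only once per n
            _cache[n] = sum(i * i for i in range(n + 1))
        return _cache[n]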

[5] Write a function fib(n) which takes n and returns the nth number of the Fibonacci sequence. Note that the 0th and 1st Fibonacci numbers are 0 and 1; each subsequent Fibonacci number is the sum of the two previous ones. Write the naive recursive version here. There's no need to get fancy; this should take four lines. Don't use a for or a while loop in your code!

[5] Apply your timing decorator to the fib function. Try it with an argument of 35 or more. What happens? What causes the inefficiency? Hint: Try using the start_stop decorator instead to see what's going on.

[10] The start_stop decorator may not be giving you enough information. Write a trace decorator that tells you when you are entering a function and exiting it (like start_stop) but also prints out the argument value in both cases. For simplicity, assume that the function that trace is decorating takes only one argument (like fib does). Use it with the inefficient fib function. Running it should make it obvious why the fib function is so inefficient. Write down your explanation in a comment.

[5] Part of the inefficiency of this Fibonacci function is that it requires us to compute fib(0) (and other fib values) multiple times when computing larger fib values. If we used memoization, we could avoid this recomputation and speed things up. Fortunately, Python has a built-in memoization decorator: functools.lru_cache(None) will memoize a function with an unbounded memoization cache. Using this decorator, compute fib(35). Is it noticeably faster? Try applying your timing decorator. Try running it multiple times and see what happens.
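
For reference, here is how the built-in cache might be applied to an unrelated toy function (slow_square and its body are made up for illustration):

    import functools
    import time

    @functools.lru_cache(None)   # None means the cache can grow without bound
    def slow_square(n):
        time.sleep(1)            # pretend this is an expensive computation
        return n * n

    slow_square(7)   # slow the first time it is computed
    slow_square(7)   # nearly instant: the cached result is returned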

[15] In order to get a better grasp on how these concepts work, you are going to implement your own version of lru_cache. First, implement memoize which will act the same way as functools.lru_cache(None). Use this and the timer code to test how much of an improvement you made. It should be comparable to the built-in version.

[20] Now, implement a least recently used (LRU) cache. We can build a simple LRU cache with a fixed maximum size by using a queue structure. The queue will hold (input, output) tuples, and the order in the queue will determine which entry was least recently used. You can use a list as your queue data structure. Run fib with a fairly small cache size, say 5. How does the speed compare to the memoized and un-memoized versions? Why would you want to limit the cache size?
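
As a hint about the queue bookkeeping only (this is not a complete LRU cache, and the names are placeholders), a Python list can act as a bounded queue like this:

    MAX_SIZE = 5
    queue = []   # oldest entry at the front, newest at the back

    def remember(pair):
        queue.append(pair)        # newly used entries go to the back
        if len(queue) > MAX_SIZE:
            queue.pop(0)          # evict the entry at the front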


To hand in

The files lab5a.py, lab5b.py, and lab5c.py.


Copyright (c) 2017, California Institute of Technology. All rights reserved.