Saturday, May 28, 2022

A Gentle Introduction to Decorators in Python



When working on code, whether we know it or not, we often come across the decorator design pattern. This is a programming technique to extend the functionality of classes or functions without modifying them. The decorator design pattern allows us to mix and match extensions easily. Python has a decorator syntax rooted in the decorator design pattern. Knowing how to make and use a decorator can help you write more powerful code.

In this post, you will discover the decorator pattern and Python’s function decorators.

After completing this tutorial, you will know:

What the decorator pattern is, and why it is useful
What Python’s function decorators are, and how to use them

Let’s get started!

A Gentle Introduction to Decorators in Python
Photo by Olya Kobruseva. Some rights reserved.

Overview

This tutorial is divided into four parts:

What is the decorator pattern, and why is it useful?
Function decorators in Python
The use cases of decorators
Some practical examples of decorators

What is the decorator pattern, and why is it useful?

The decorator pattern is a software design pattern that allows us to dynamically add functionality to classes without creating subclasses and without affecting the behavior of other objects of the same class. By using the decorator pattern, we can easily generate the different combinations of functionality that we might want without creating an exponentially growing number of subclasses, which would make our code increasingly complex and bloated.

Decorators are usually implemented as sub-interfaces of the main interface that we want to implement, and they store an object of the main interface’s type. A decorator then adds functionality to selected methods by overriding them and calling the corresponding methods of the stored object.

UML class diagram for decorator pattern

Above is the UML class diagram for the decorator design pattern. The decorator abstract class contains an object of type OriginalInterface; this is the object whose functionality the decorator will be modifying. To instantiate our concrete DecoratorClass, we would need to pass in a concrete class that implements the OriginalInterface, and then when we make method calls to DecoratorClass.method1(), our DecoratorClass should modify the output from the object’s method1().
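The structure in the diagram can be spelled out literally in Python. The following is a minimal sketch, not a canonical implementation: OriginalInterface and method1() come from the diagram, while ConcreteClass, Decorator, UppercaseDecorator, and ExclaimDecorator are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod


class OriginalInterface(ABC):
    """The interface shared by the concrete class and all decorators."""

    @abstractmethod
    def method1(self) -> str:
        ...


class ConcreteClass(OriginalInterface):
    def method1(self) -> str:
        return "hello"


class Decorator(OriginalInterface):
    """Base decorator: stores the object whose behavior it modifies."""

    def __init__(self, wrapped: OriginalInterface):
        self._wrapped = wrapped

    def method1(self) -> str:
        # by default, simply delegate to the stored object
        return self._wrapped.method1()


class UppercaseDecorator(Decorator):
    def method1(self) -> str:
        return self._wrapped.method1().upper()


class ExclaimDecorator(Decorator):
    def method1(self) -> str:
        return self._wrapped.method1() + "!"


plain = ConcreteClass()
# decorators can be stacked in any combination without new subclasses of ConcreteClass
decorated = ExclaimDecorator(UppercaseDecorator(plain))
print(plain.method1())
print(decorated.method1())
```

Because every decorator accepts any OriginalInterface, the two extensions can be mixed and matched at run time, which is the property the pattern is meant to provide.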

With Python, however, we are able to simplify many of these design patterns because of dynamic typing and because functions and classes are first-class objects. Modifying a class or a function without changing its implementation remains the key idea of decorators, and in the following sections we will explore Python’s decorator syntax for doing so.

Function Decorators in Python

A function decorator is an incredibly useful feature in Python. It is built upon the idea that functions and classes are first-class objects in Python.

Let’s consider a simple example, that is, to call a function twice. Since a Python function is an object and we can pass a function as an argument to another function, this task can be done as follows:

def repeat(fn):
    fn()
    fn()

def hello_world():
    print("Hello world!")

repeat(hello_world)

Again, since a Python function is an object, we can write a function that returns another function, and the returned function in turn calls a third function twice. This is done as follows:

def repeat_decorator(fn):
    def decorated_fn():
        fn()
        fn()
    # returns a function
    return decorated_fn

def hello_world():
    print("Hello world!")

hello_world_twice = repeat_decorator(hello_world)

# call the function
hello_world_twice()

The function returned by repeat_decorator() above is created when it is invoked, as it depends on the argument provided. In the above, we passed the hello_world function as an argument to the repeat_decorator() function, and it returns the decorated_fn function, which is assigned to hello_world_twice. Afterward, we can invoke hello_world_twice() since it is now a function.

The idea of the decorator pattern applies here, but we do not need to define the interface and subclasses explicitly. In fact, hello_world is just a name bound to a function in the above example. There is nothing preventing us from rebinding this name to something else. Hence we can also do the following:

def repeat_decorator(fn):
    def decorated_fn():
        fn()
        fn()
    # returns a function
    return decorated_fn

def hello_world():
    print("Hello world!")

hello_world = repeat_decorator(hello_world)

# call the function
hello_world()

That is, instead of assigning the newly created function to hello_world_twice, we overwrite hello_world instead. While the name hello_world is reassigned to another function, the previous function still exists but is just not exposed to us.

Indeed, the above code is functionally equivalent to the following:

# function decorator that calls the function twice
def repeat_decorator(fn):
    def decorated_fn():
        fn()
        fn()
    # returns a function
    return decorated_fn

# using the decorator on hello_world function
@repeat_decorator
def hello_world():
    print("Hello world!")

# call the function
hello_world()

In the above code, @repeat_decorator before a function definition means to pass the function into repeat_decorator() and reassign its name to the output. In other words, it means hello_world = repeat_decorator(hello_world). The @ line is the decorator syntax in Python.

Note: The @ syntax is also used in Java, but with a different meaning: there it marks an annotation, which is essentially metadata and not a decorator.
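One side effect of rebinding the name is that hello_world now reports the inner function’s name, since the original function is hidden inside the wrapper. The sketch below uses functools.wraps from the standard library, which is not part of the example above, to copy the original metadata onto the wrapper and keep a reference to the original function. The calls list is a hypothetical stand-in for print() so that the effect is easy to inspect.

```python
import functools


def repeat_decorator(fn):
    @functools.wraps(fn)  # copy fn's name and docstring onto decorated_fn
    def decorated_fn():
        fn()
        fn()
    return decorated_fn


calls = []  # hypothetical log used instead of print()


@repeat_decorator
def hello_world():
    """Record one greeting."""
    calls.append("Hello world!")


hello_world()                  # runs the original function twice
print(calls)
print(hello_world.__name__)    # the wrapper now carries the original name
hello_world.__wrapped__()      # functools.wraps exposes the original function
print(len(calls))
```

Without the functools.wraps line, hello_world.__name__ would be "decorated_fn" and there would be no __wrapped__ attribute.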

We can also implement decorators that take in arguments, but this is a bit more complicated as we need one more layer of nesting. If we extend the repeat_decorator example to take the number of times to repeat the function call, we get:

def repeat_decorator(num_repeats=2):
    # repeat_decorator should return a function that's a decorator
    def inner_decorator(fn):
        def decorated_fn():
            for i in range(num_repeats):
                fn()
        # return the new function
        return decorated_fn
    # return the decorator that actually takes the function in as the input
    return inner_decorator

# use the decorator with num_repeats argument set as 5 to repeat the function call 5 times
@repeat_decorator(5)
def hello_world():
    print("Hello world!")

# call the function
hello_world()

The repeat_decorator() takes in an argument and returns a function which is the actual decorator for the hello_world function (i.e., invoking repeat_decorator(5) returns inner_decorator with the local variable num_repeats = 5 set). The above code will print the following:

Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
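To make the extra layer of nesting concrete, the @repeat_decorator(...) line can be unrolled into two explicit steps. This is a sketch of the equivalent manual calls; the greetings list and the decorator variable are hypothetical names introduced so the result can be checked without reading printed output.

```python
def repeat_decorator(num_repeats=2):
    def inner_decorator(fn):
        def decorated_fn():
            for _ in range(num_repeats):
                fn()
        return decorated_fn
    return inner_decorator


greetings = []  # hypothetical log used instead of print()


def hello_world():
    greetings.append("Hello world!")


# step 1: calling the outer function returns the actual decorator
decorator = repeat_decorator(3)
# step 2: the decorator wraps the function, and the name is rebound
hello_world = decorator(hello_world)

hello_world()
print(len(greetings))  # one entry per repeat
```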

Before we end this section, we should remember that decorators can also be applied to classes in addition to functions. Since a class in Python is also an object, we may redefine a class in a similar fashion.
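As a small illustration of a class decorator, the sketch below takes a class, attaches a method to it, and returns the same class. The names add_repr and Point are hypothetical and chosen only for this example.

```python
def add_repr(cls):
    """Hypothetical class decorator: attach a __repr__ listing instance attributes."""
    def __repr__(self):
        fields = ", ".join(f"{k}={v!r}" for k, v in vars(self).items())
        return f"{cls.__name__}({fields})"
    cls.__repr__ = __repr__
    # return the (modified) class so the name is rebound to it
    return cls


@add_repr
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


print(repr(Point(1, 2)))
```

Just as with functions, the @add_repr line is shorthand for Point = add_repr(Point).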

The Use Cases of Decorators

The decorator syntax in Python makes decorators easier to use. There are many reasons we may use a decorator. One of the most common use cases is to convert data implicitly. For example, we may define a function that assumes all operations are based on numpy arrays and then make a decorator to ensure that happens by modifying the input:

import numpy as np

# function decorator to ensure numpy input
def ensure_numpy(fn):
    def decorated_function(data):
        # converts input to numpy array
        array = np.asarray(data)
        # calls fn on input numpy array
        return fn(array)
    return decorated_function

We can further add to our decorator by modifying the output of the function, such as rounding off floating point values:

import numpy as np

# function decorator to ensure numpy input
# and round off output to 4 decimal places
def ensure_numpy(fn):
    def decorated_function(data):
        array = np.asarray(data)
        output = fn(array)
        return np.around(output, 4)
    return decorated_function

Let’s consider the example of finding the sum of an array. A numpy array has sum() built-in, as does pandas DataFrame. But the latter is to sum over columns rather than sum over all elements. Hence a numpy array will sum to one floating point value while a DataFrame will sum to a vector of values. But with the above decorator, we can write a function that gives you consistent output in both cases:

import numpy as np
import pandas as pd

# function decorator to ensure numpy input
# and round off output to 4 decimal places
def ensure_numpy(fn):
    def decorated_function(data):
        array = np.asarray(data)
        output = fn(array)
        return np.around(output, 4)
    return decorated_function

@ensure_numpy
def numpysum(array):
    return array.sum()

x = np.random.randn(10, 3)
y = pd.DataFrame(x, columns=["A", "B", "C"])

# output of numpy .sum() function
print("x.sum():", x.sum())
print()

# output of pandas .sum() function
print("y.sum():", y.sum())
print(y.sum())
print()

# calling decorated numpysum function
print("numpysum(x):", numpysum(x))
print("numpysum(y):", numpysum(y))

Running the above code gives us the output:

x.sum(): 0.3948331694737762

y.sum(): A   -1.175484
B    2.496056
C   -0.925739
dtype: float64
A   -1.175484
B    2.496056
C   -0.925739
dtype: float64

numpysum(x): 0.3948
numpysum(y): 0.3948

This is a simple example. But imagine if we define a new function that computes the standard deviation of elements in an array. We can simply use the same decorator, and then the function will also accept pandas DataFrame. Hence all the code to polish input is taken out of these functions by depositing them into the decorator. This is how we can efficiently reuse the code.
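The reuse described above can be sketched with the standard library alone. This is an analogue rather than the numpy version: ensure_floats, total, and stdev are hypothetical names, and the decorator converts any iterable of number-like items into a list of floats instead of a numpy array.

```python
import statistics


def ensure_floats(fn):
    """Hypothetical stdlib analogue of ensure_numpy: convert input, round output."""
    def decorated_function(data):
        # accept any iterable of number-like items (lists, ranges, generators)
        values = [float(v) for v in data]
        return round(fn(values), 4)
    return decorated_function


@ensure_floats
def total(values):
    return sum(values)


@ensure_floats
def stdev(values):
    # population standard deviation of all elements
    return statistics.pstdev(values)


print(total(["1", 2, 3.0]))
print(stdev(range(1, 5)))
print(stdev(x for x in (2, 4, 4, 4, 5, 5, 7, 9)))
```

Neither total() nor stdev() contains any input-cleaning code; adding a third function only requires one more @ensure_floats line.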

Some Practical Examples of Decorators

Now that we have learned the decorator syntax in Python, let’s see what we can do with it!

Memoization

There are some function calls that we make repeatedly, but whose return values rarely, if ever, change. These could be calls to a server where the data is relatively static, or calls that are part of a dynamic programming algorithm or a computationally intensive math function. We might want to memoize these function calls, i.e., store the value of their output on a virtual memo pad for reuse later.

A decorator is a convenient way to implement memoization. We just need to remember the input and output of a function but keep the function’s behavior as-is. Below is an example:

import pickle
import hashlib

MEMO = {}  # To remember the function input and output

def memoize(fn):
    def _deco(*args, **kwargs):
        # pickle the function arguments and obtain hash as the store keys
        key = (fn.__name__, hashlib.md5(pickle.dumps((args, kwargs), 4)).hexdigest())
        # check if the key exists
        if key in MEMO:
            ret = pickle.loads(MEMO[key])
        else:
            ret = fn(*args, **kwargs)
            MEMO[key] = pickle.dumps(ret)
        return ret
    return _deco

@memoize
def fibonacci(n):
    if n in [0, 1]:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

print(fibonacci(40))
print(MEMO)

In this example, we implemented memoize() to work with a global dictionary MEMO such that the name of a function together with the arguments becomes the key and the function’s return becomes the value. When the function is called, the decorator checks whether the corresponding key exists in MEMO. If it does, the stored value is returned. Otherwise, the actual function is invoked, and its return value is added to the dictionary.

We use pickle to serialize the input and output and use hashlib to create a hash of the input because not everything can be a key to a Python dictionary (e.g., a list is an unhashable type; thus, it cannot be a key). Serializing a picklable structure into a byte string overcomes this and also guarantees that the stored return data is immutable. Furthermore, hashing the function arguments avoids storing an exceptionally long key in the dictionary (for example, when we pass in a huge numpy array to the function).
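The point about unhashable arguments can be checked with a short sketch. It uses a condensed variant of the memoize() decorator above (it always returns an unpickled copy), and the total() function and CALLS list are hypothetical additions used only to count real invocations.

```python
import hashlib
import pickle

MEMO = {}
CALLS = []  # hypothetical log of every real invocation


def memoize(fn):
    def _deco(*args, **kwargs):
        key = (fn.__name__, hashlib.md5(pickle.dumps((args, kwargs), 4)).hexdigest())
        if key not in MEMO:
            MEMO[key] = pickle.dumps(fn(*args, **kwargs))
        # return a fresh copy so callers cannot mutate the stored value
        return pickle.loads(MEMO[key])
    return _deco


@memoize
def total(values):
    CALLS.append(values)
    return sum(values)


try:
    {[1, 2, 3]: 6}  # a list cannot be used as a dictionary key directly
except TypeError as err:
    print("TypeError:", err)

print(total([1, 2, 3]))  # computed by the real function
print(total([1, 2, 3]))  # served from MEMO
print(len(CALLS))        # number of real invocations
```

The list argument never becomes a dictionary key itself; only the hash of its pickled form does.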

The memoized fibonacci() function demonstrates the power of memoization. Calling fibonacci(n) produces the n-th Fibonacci number. Running the fibonacci() example produces the following output, in which we can see that the 40th Fibonacci number is 102334155 and how the dictionary MEMO is used to store different calls to the function.

102334155
{('fibonacci', '635f1664f168e2a15b8e43f20d45154b'): b'\x80\x04K\x01.',
 ('fibonacci', 'd238998870ae18a399d03477dad0c0a8'): b'\x80\x04K\x00.',
 ('fibonacci', 'dbed6abf8fcf4beec7fc97f3170de3cc'): b'\x80\x04K\x01.',
 ...
 ('fibonacci', 'b9954ff996a4cd0e36fffb09f982b08e'): b'\x80\x04\x95\x06\x00\x00\x00\x00\x00\x00\x00J)pT\x02.',
 ('fibonacci', '8c7aba62def8063cf5afe85f42372f0d'): b'\x80\x04\x95\x06\x00\x00\x00\x00\x00\x00\x00J\xa2\x0e\xc5\x03.',
 ('fibonacci', '6de8535f23d756de26959b4d6e1f66f6'): b'\x80\x04\x95\x06\x00\x00\x00\x00\x00\x00\x00J\xcb~\x19\x06.'}

You may try to remove the @memoize line from the fibonacci() example. You will find that the program takes significantly longer to run, because each function call invokes two more function calls; hence the running time grows as O(2^n) instead of O(n) as in the memoized case.
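Rather than timing the two versions, the difference can be made visible by counting how often the function body actually runs. The sketch below uses a simplified dictionary-based memoize() (keyed directly on n, without pickle) and a hypothetical calls counter; it uses n = 20 so that the unmemoized version still finishes quickly.

```python
calls = {"plain": 0, "memoized": 0}  # hypothetical invocation counters


def memoize(fn):
    cache = {}
    def _deco(n):
        if n not in cache:
            cache[n] = fn(n)
        return cache[n]
    return _deco


def fib_plain(n):
    calls["plain"] += 1
    return n if n in (0, 1) else fib_plain(n - 1) + fib_plain(n - 2)


@memoize
def fib_memoized(n):
    calls["memoized"] += 1
    return n if n in (0, 1) else fib_memoized(n - 1) + fib_memoized(n - 2)


print(fib_plain(20), calls["plain"])        # 6765 21891
print(fib_memoized(20), calls["memoized"])  # 6765 21
```

The memoized version evaluates each n from 0 to 20 exactly once, while the plain version repeats the same subproblems thousands of times.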

Memoization is very helpful for expensive functions whose outputs don’t change frequently, for example, the following function that reads some stock market data from the Internet:

import pandas_datareader as pdr

@memoize
def get_stock_data(ticker):
    # pull data from stooq
    df = pdr.stooq.StooqDailyReader(symbols=ticker, start="1/1/00", end="31/12/21").read()
    return df

# testing call to function
import cProfile as profile
import pstats

for i in range(1, 3):
    print(f"Run {i}")
    run_profile = profile.Profile()
    run_profile.enable()
    get_stock_data("^DJI")
    run_profile.disable()
    pstats.Stats(run_profile).print_stats(0)

If implemented correctly, the call to get_stock_data() should be more expensive the first time and much less expensive subsequently. The output from the code snippet above gives us:

Run 1
17492 function calls (17051 primitive calls) in 1.452 seconds

Run 2
221 function calls (218 primitive calls) in 0.001 seconds

This is particularly useful if you are working in a Jupyter notebook. If you need to download some data, wrap it in a memoize decorator. Since developing a machine learning project means many iterations of changing your code to see if the result looks any better, a memoized download function saves you a lot of unnecessary waiting.

You may make a more powerful memoization decorator by saving the data in a database (e.g., a key-value store like GNU dbm or an in-memory database such as memcached or Redis). But if you just need the functionality above, Python 3.2 and later ship the decorator lru_cache in the standard library module functools, so you don’t need to write your own:

import functools

import pandas_datareader as pdr

# memoize using lru_cache
# (the bare @functools.lru_cache form needs Python 3.8+;
# on older versions, write @functools.lru_cache() instead)
@functools.lru_cache
def get_stock_data(ticker):
    # pull data from stooq
    df = pdr.stooq.StooqDailyReader(symbols=ticker, start="1/1/00", end="31/12/21").read()
    return df

# testing call to function
import cProfile as profile
import pstats

for i in range(1, 3):
    print(f"Run {i}")
    run_profile = profile.Profile()
    run_profile.enable()
    get_stock_data("^DJI")
    run_profile.disable()
    pstats.Stats(run_profile).print_stats(0)

Note: lru_cache implements LRU (least recently used) caching, which limits the cache to the most recent calls to the function (128 by default). Since Python 3.9, there is also @functools.cache, which is unlimited in size and does no LRU purging.
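The size limit and the eviction of the least recently used entry can be observed through cache_info(), which functools adds to every lru_cache-decorated function. The sketch below uses a deliberately tiny maxsize of 2 and a hypothetical square() function.

```python
import functools


@functools.lru_cache(maxsize=2)
def square(n):
    return n * n


square(1)  # miss
square(2)  # miss
square(1)  # hit
square(3)  # miss; evicts 2, the least recently used entry
square(2)  # miss again, because 2 was evicted

info = square.cache_info()
print(info)  # CacheInfo(hits=1, misses=4, maxsize=2, currsize=2)
```

Note that lru_cache uses the arguments themselves as dictionary keys, so all arguments must be hashable, unlike in the pickle-based memoize() decorator.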

Function Catalog

Another example where we might want to consider the use of function decorators is for registering functions in a catalog. It allows us to associate functions with a string and pass the strings as arguments for other functions. This is the start of making a system that will enable user-provided plug-ins. Let’s illustrate this with an example. Below is a decorator and a function activate() that we will use later. Let’s assume the following code is saved in the file activation.py:

# activation.py

ACTIVATION = {}

def register(name):
    def decorator(fn):
        # assign fn to "name" key in ACTIVATION
        ACTIVATION[name] = fn
        # return fn unmodified
        return fn
    return decorator

def activate(x, kind):
    try:
        fn = ACTIVATION[kind]
        return fn(x)
    except KeyError:
        print("Activation function %s undefined" % kind)

After defining the register decorator in the above code, we can now use it to register functions and associate strings with them. Let’s have the file funcs.py as such:

# funcs.py

from activation import register
import numpy as np

@register("relu")
def relu(x):
    return np.where(x>0, x, 0)

@register("sigmoid")
def sigm(x):
    return 1/(1+np.exp(-x))

@register("tanh")
def tanh(x):
    return np.tanh(x)

We’ve registered the “relu,” “sigmoid,” and “tanh” functions to their respective strings by building this association in the ACTIVATION dictionary.

Now, let’s see how we can use our newly registered functions.

import numpy as np
from activation import activate

# create a random matrix
x = np.random.randn(5, 3)
print(x)

# try ReLU activation on the matrix
relu_x = activate(x, "relu")
print(relu_x)

# load the functions, and call ReLU activation again
import funcs
relu_x = activate(x, "relu")
print(relu_x)

which gives us the output:

[[-0.81549502 -0.81352867 1.41539545]
[-0.28782853 -1.59323543 -0.19824959]
[ 0.06724466 -0.26622761 -0.41893662]
[ 0.47927331 -1.84055276 -0.23147207]
[-0.18005588 -1.20837815 -1.34768876]]
Activation function relu undefined
None
[[0. 0. 1.41539545]
[0. 0. 0. ]
[0.06724466 0. 0. ]
[0.47927331 0. 0. ]
[0. 0. 0. ]]

Observe that before we reach the import funcs line, the ReLU activation does not exist. Hence calling activate() prints the error message, and the result is None. After we run that import line, the functions defined in funcs.py are loaded just like a plug-in module, and the same function call gives us the result we expected.

Note that we never invoked anything in the module funcs explicitly, and we didn’t modify anything in the call to activate(). Simply importing funcs caused those new functions to register and expanded the functionality of activate(). Using this technique allows us to develop a very large system while focusing on only one small part at a time without worrying about the interoperability of other parts. Without the registration decorators and function catalog, adding a new activation function would need modification to every function that uses activation.
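The same catalog idea can be tried in a single file without numpy. The following is a sketch under a few assumptions: the activation functions work on plain floats, and activate() raises a ValueError listing the known names instead of printing a message and returning None, which is a hypothetical design variation and not the behavior of activation.py.

```python
import math

ACTIVATION = {}


def register(name):
    def decorator(fn):
        ACTIVATION[name] = fn  # add fn to the catalog under the given name
        return fn              # return fn unmodified
    return decorator


def activate(x, kind):
    if kind not in ACTIVATION:
        # hypothetical variation: fail loudly and list what is registered
        raise ValueError(
            f"Activation function {kind} undefined; known: {sorted(ACTIVATION)}"
        )
    return ACTIVATION[kind](x)


@register("relu")
def relu(x):
    return max(x, 0.0)


@register("sigmoid")
def sigm(x):
    return 1 / (1 + math.exp(-x))


print(sorted(ACTIVATION))
print(activate(-2.0, "relu"), activate(0.0, "sigmoid"))
```

Raising an exception makes a missing plug-in visible at the call site, while the print-and-return-None approach lets the program continue with None as the result.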

If you’re familiar with Keras, the function catalog above should remind you of the following syntax:

layer = keras.layers.Dense(128, activation="relu")

model.compile(loss="sparse_categorical_crossentropy",
              optimizer="adam",
              metrics=["sparse_categorical_accuracy"])

Keras defines almost all components using a decorator of a similar nature. Hence we can refer to building blocks by name. Without this mechanism, we would have to use the following syntax all the time, which puts the burden on us to remember the location of a lot of components:

layer = keras.layers.Dense(128, activation=keras.activations.relu)

model.compile(loss=keras.losses.SparseCategoricalCrossentropy(),
              optimizer=keras.optimizers.Adam(),
              metrics=[keras.metrics.SparseCategoricalAccuracy()])

Further reading

This section provides more resources on the topic if you are looking to go deeper.

Articles

Decorator pattern
Python Language Reference, Section 8.7, Function definitions
PEP 318 – Decorators for Functions and Methods

Books

Fluent Python, 2nd edition, by Luciano Ramalho

APIs

functools module in Python standard library

Summary

In this post, you discovered the decorator design pattern and Python’s decorator syntax. You also saw some specific use cases of decorators that can help your Python program run faster or be easier to extend.

Specifically, you learned:

The idea of a decorator pattern and the decorator syntax in Python
How to implement a decorator in Python for use with the decorator syntax
The use of a decorator for adapting function input and output, for memoization, and for registering functions in a catalog



The post A Gentle Introduction to Decorators in Python appeared first on Machine Learning Mastery.
