Thursday, April 18, 2024

Python debugging tools



In all programming exercises, it is difficult to go far and deep without a handy debugger. Python’s built-in debugger, pdb, is a mature and capable one that can help you a lot if you know how to use it. In this tutorial, we are going to see what pdb can do for you, as well as some of its alternatives.

In this tutorial you will learn:

What a debugger can do
How to control a debugger
The limitations of Python’s pdb and its alternatives

Let’s get started.

Photo by Thomas Park. Some rights reserved.

Tutorial Overview

This tutorial is in four parts:

The concept of running a debugger
Walk-through of using a debugger
Debugger in Visual Studio Code
Using GDB on a running Python program

The concept of running a debugger

The purpose of a debugger is to give you a slow-motion button to control the flow of a program. It also allows you to freeze the program at a certain point in time and examine its state.

The simplest operation under a debugger is to step through the code, that is, to run one line of code at a time and wait for your acknowledgment before proceeding to the next. We run the program in this stop-and-go fashion so that we can check the logic and the values, or verify the algorithm.

For a larger program, we may not want to step through the code from the beginning, as it may take a long time before we reach the line we are interested in. Therefore, debuggers also provide a breakpoint feature that kicks in when a specific line of code is reached. From that point onward, we can step through the code line by line.
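To make the idea of stepping concrete, the sketch below uses Python's sys.settrace() hook, which the standard bdb/pdb machinery has traditionally been built on, to record every line of a function just before it runs. The function demo() and the names tracer and visited are made up for this illustration. A real debugger would pause and wait for a command at the point where this sketch only records the line.

```python
import sys

visited = []  # (line offset inside demo, snapshot of local variables)

def tracer(frame, event, arg):
    """Trace hook called by the interpreter on call/line/return events."""
    if frame.f_code.co_name != "demo":
        return None  # do not trace any other function
    if event == "line":
        # A real debugger would stop here and wait for a command such as n or s.
        offset = frame.f_lineno - frame.f_code.co_firstlineno
        visited.append((offset, dict(frame.f_locals)))
    return tracer  # keep tracing inside this frame

def demo():
    total = 0
    for i in range(2):
        total += i
    return total

sys.settrace(tracer)
try:
    result = demo()
finally:
    sys.settrace(None)  # always remove the hook again

for offset, local_vars in visited:
    print(f"demo line +{offset}: locals = {local_vars}")
```

Each printed row corresponds to one "step" in a debugger: the line that is about to execute and the state of the local variables at that moment.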

Walk-through of using a debugger

Let’s see how we can make use of a debugger with an example. The following is the Python code for showing particle swarm optimization in an animation:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

def f(x,y):
    "Objective function"
    return (x-3.14)**2 + (y-2.72)**2 + np.sin(3*x+1.41) + np.sin(4*y-1.73)

# Compute and plot the function in 3D within [0,5]x[0,5]
x, y = np.array(np.meshgrid(np.linspace(0,5,100), np.linspace(0,5,100)))
z = f(x, y)

# Find the global minimum
x_min = x.ravel()[z.argmin()]
y_min = y.ravel()[z.argmin()]

# Hyper-parameter of the algorithm
c1 = c2 = 0.1
w = 0.8

# Create particles
n_particles = 20
np.random.seed(100)
X = np.random.rand(2, n_particles) * 5
V = np.random.randn(2, n_particles) * 0.1

# Initialize data
pbest = X
pbest_obj = f(X[0], X[1])
gbest = pbest[:, pbest_obj.argmin()]
gbest_obj = pbest_obj.min()

def update():
    "Function to do one iteration of particle swarm optimization"
    global V, X, pbest, pbest_obj, gbest, gbest_obj
    # Update params
    r1, r2 = np.random.rand(2)
    V = w * V + c1*r1*(pbest - X) + c2*r2*(gbest.reshape(-1,1)-X)
    X = X + V
    obj = f(X[0], X[1])
    pbest[:, (pbest_obj >= obj)] = X[:, (pbest_obj >= obj)]
    pbest_obj = np.array([pbest_obj, obj]).min(axis=0)
    gbest = pbest[:, pbest_obj.argmin()]
    gbest_obj = pbest_obj.min()

# Set up base figure: The contour map
fig, ax = plt.subplots(figsize=(8,6))
fig.set_tight_layout(True)
img = ax.imshow(z, extent=[0, 5, 0, 5], origin='lower', cmap='viridis', alpha=0.5)
fig.colorbar(img, ax=ax)
ax.plot([x_min], [y_min], marker='x', markersize=5, color="white")
contours = ax.contour(x, y, z, 10, colors='black', alpha=0.4)
ax.clabel(contours, inline=True, fontsize=8, fmt="%.0f")
pbest_plot = ax.scatter(pbest[0], pbest[1], marker='o', color='black', alpha=0.5)
p_plot = ax.scatter(X[0], X[1], marker='o', color='blue', alpha=0.5)
p_arrow = ax.quiver(X[0], X[1], V[0], V[1], color='blue', width=0.005, angles='xy', scale_units='xy', scale=1)
gbest_plot = plt.scatter([gbest[0]], [gbest[1]], marker='*', s=100, color='black', alpha=0.4)
ax.set_xlim([0,5])
ax.set_ylim([0,5])

def animate(i):
    "Steps of PSO: algorithm update and show in plot"
    title = 'Iteration {:02d}'.format(i)
    # Update params
    update()
    # Set picture
    ax.set_title(title)
    pbest_plot.set_offsets(pbest.T)
    p_plot.set_offsets(X.T)
    p_arrow.set_offsets(X.T)
    p_arrow.set_UVC(V[0], V[1])
    gbest_plot.set_offsets(gbest.reshape(1,-1))
    return ax, pbest_plot, p_plot, p_arrow, gbest_plot

anim = FuncAnimation(fig, animate, frames=list(range(1,50)), interval=500, blit=False, repeat=True)
anim.save("PSO.gif", dpi=120, writer="imagemagick")

print("PSO found best solution at f({})={}".format(gbest, gbest_obj))
print("Global optimal at f({})={}".format([x_min,y_min], f(x_min,y_min)))

The particle swarm optimization is done by executing the update() function a number of times. Each time it runs, we get closer to the optimal solution of the objective function. We use matplotlib’s FuncAnimation() function instead of a loop to run update(), so that we can capture the position of the particles at each iteration.

Assuming this program is saved as pso.py, we can run it from the command line by entering:

python pso.py

The solution will be printed to the screen, and the animation will be saved as PSO.gif. But if we want to run it with the Python debugger, we enter the following at the command line:

python -m pdb pso.py

The -m pdb part loads the pdb module and lets the module execute the file pso.py for you. When you run this command, you will be greeted with the pdb prompt as follows:

> /Users/mlm/pso.py(1)<module>()
-> import numpy as np
(Pdb)

At the prompt, you can type in debugger commands. To show the list of supported commands, we can use h. To show the details of a specific command (such as list), we can use h list:

> /Users/mlm/pso.py(1)<module>()
-> import numpy as np
(Pdb) h

Documented commands (type help <topic>):
========================================
EOF    c          d        h         list      q        rv       undisplay
a      cl         debug    help      ll        quit     s        unt
alias  clear      disable  ignore    longlist  r        source   until
args   commands   display  interact  n         restart  step     up
b      condition  down     j         next      return   tbreak   w
break  cont       enable   jump      p         retval   u        whatis
bt     continue   exit     l         pp        run      unalias  where

Miscellaneous help topics:
==========================
exec  pdb

(Pdb)

At the beginning of a debugger session, we start at the first line of the program. Normally, a Python program starts with a few import lines. We can use n to move to the next line, or s to step into a function:

> /Users/mlm/pso.py(1)<module>()
-> import numpy as np
(Pdb) n
> /Users/mlm/pso.py(2)<module>()
-> import matplotlib.pyplot as plt
(Pdb) n
> /Users/mlm/pso.py(3)<module>()
-> from matplotlib.animation import FuncAnimation
(Pdb) n
> /Users/mlm/pso.py(5)<module>()
-> def f(x,y):
(Pdb) n
> /Users/mlm/pso.py(10)<module>()
-> x, y = np.array(np.meshgrid(np.linspace(0,5,100), np.linspace(0,5,100)))
(Pdb) n
> /Users/mlm/pso.py(11)<module>()
-> z = f(x, y)
(Pdb) s
--Call--
> /Users/mlm/pso.py(5)f()
-> def f(x,y):
(Pdb) s
> /Users/mlm/pso.py(7)f()
-> return (x-3.14)**2 + (y-2.72)**2 + np.sin(3*x+1.41) + np.sin(4*y-1.73)
(Pdb) s
--Return--
> /Users/mlm/pso.py(7)f()->array([[17.25… 7.46457344]])
-> return (x-3.14)**2 + (y-2.72)**2 + np.sin(3*x+1.41) + np.sin(4*y-1.73)
(Pdb) s
> /Users/mlm/pso.py(14)<module>()
-> x_min = x.ravel()[z.argmin()]
(Pdb)

In pdb, the line of code about to be executed is printed before the prompt. Usually the n command is what we prefer, as it executes that line of code and moves the flow along at the same level without drilling down deeper. When we are at a line that calls a function (such as line 11 of the above program, which runs z = f(x, y)), we can use s to step into the function. In the above example, we first step into the f() function, then take another step to execute the computation, and finally collect the return value from the function and give it back to the line that invoked it. Multiple s commands are needed even for a function as simple as one line, because calling the function, running its body, and returning from it each take one step. We can also see that the body of the function calls np.sin(), but the debugger’s s command does not go into it. This is because np.sin() is implemented not in Python but in C, and pdb cannot step into compiled code.

If the program is long, it is tedious to use the n command many times to get to the part we are interested in. We can use the until command with a line number to let the debugger run the program until that line is reached:
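The difference between n and s can also be checked without typing at a prompt. The following is a minimal sketch that drives pdb from a script by passing the commands through the stdin argument of pdb.Pdb and capturing what pdb prints through stdout. The functions inner() and outer() and the helper run_scripted() are hypothetical names used only for this illustration.

```python
import io
import pdb

def inner(v):
    return v * 2

def outer():
    a = inner(21)
    return a + 1

def run_scripted(commands):
    """Run outer() under pdb, feeding it commands as if they were typed."""
    transcript = io.StringIO()
    debugger = pdb.Pdb(stdin=io.StringIO(commands), stdout=transcript,
                       nosigint=True, readrc=False)
    result = debugger.runcall(outer)  # stops at the first line of outer()
    return result, transcript.getvalue()

# "s" steps into inner(); "c" then lets the function finish.
result_s, out_s = run_scripted("s\nc\n")
# "n" runs the whole line, including the call to inner(), and stops on the next line.
result_n, out_n = run_scripted("n\nc\n")

print("step transcript contains --Call--:", "--Call--" in out_s)
print("next transcript contains --Call--:", "--Call--" in out_n)
```

Only the transcript of the s session should show the --Call-- marker, since n never enters inner(). In both cases outer() still runs to completion and returns the same value.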

> /Users/mlm/pso.py(1)<module>()
-> import numpy as np
(Pdb) until 11
> /Users/mlm/pso.py(11)<module>()
-> z = f(x, y)
(Pdb) s
--Call--
> /Users/mlm/pso.py(5)f()
-> def f(x,y):
(Pdb) s
> /Users/mlm/pso.py(7)f()
-> return (x-3.14)**2 + (y-2.72)**2 + np.sin(3*x+1.41) + np.sin(4*y-1.73)
(Pdb) s
--Return--
> /Users/mlm/pso.py(7)f()->array([[17.25… 7.46457344]])
-> return (x-3.14)**2 + (y-2.72)**2 + np.sin(3*x+1.41) + np.sin(4*y-1.73)
(Pdb) s
> /Users/mlm/pso.py(14)<module>()
-> x_min = x.ravel()[z.argmin()]
(Pdb)

A command similar to until is return, which executes the current function until the point where it is about to return. You can think of it as until with the line number set to the last line of the current function. The until command is one-off, meaning it brings you to that line only once. If you want to stop at a particular line whenever it is run, you can set a breakpoint on it. For example, if we are interested in how each iteration of the optimization algorithm moves the solution, we can set a breakpoint right after the update is applied:

> /Users/mlm/pso.py(1)<module>()
-> import numpy as np
(Pdb) b 40
Breakpoint 1 at /Users/mlm/pso.py:40
(Pdb) c
> /Users/mlm/pso.py(40)update()
-> obj = f(X[0], X[1])
(Pdb) bt
/usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9/bdb.py(580)run()
-> exec(cmd, globals, locals)
<string>(1)<module>()
/Users/mlm/pso.py(76)<module>()
-> anim.save("PSO.gif", dpi=120, writer="imagemagick")
/usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1078)save()
-> anim._init_draw() # Clear the initial frame
/usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1698)_init_draw()
-> self._draw_frame(frame_data)
/usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1720)_draw_frame()
-> self._drawn_artists = self._func(framedata, *self._args)
/Users/mlm/pso.py(65)animate()
-> update()
> /Users/mlm/pso.py(40)update()
-> obj = f(X[0], X[1])
(Pdb) p r1
0.8054505373292797
(Pdb) p r2
0.7543489945823536
(Pdb) p X
array([[2.77550474, 1.60073607, 2.14133019, 4.11466522, 0.2445649 ,
0.65149396, 3.24520628, 4.08804798, 0.89696478, 2.82703884,
4.42055413, 1.03681404, 0.95318658, 0.60737118, 1.17702652,
4.67551174, 3.95781321, 0.95077669, 4.08220292, 1.33330594],
[2.07985611, 4.53702225, 3.81359193, 1.83427181, 0.87867832,
1.8423856 , 0.11392109, 1.2635162 , 3.84974582, 0.27397365,
2.86219806, 3.05406841, 0.64253831, 1.85730719, 0.26090638,
4.28053621, 4.71648133, 0.44101305, 4.14882396, 2.74620598]])
(Pdb) n
> /Users/mlm/pso.py(41)update()
-> pbest[:, (pbest_obj >= obj)] = X[:, (pbest_obj >= obj)]
(Pdb) n
> /Users/mlm/pso.py(42)update()
-> pbest_obj = np.array([pbest_obj, obj]).min(axis=0)
(Pdb) n
> /Users/mlm/pso.py(43)update()
-> gbest = pbest[:, pbest_obj.argmin()]
(Pdb) n
> /Users/mlm/pso.py(44)update()
-> gbest_obj = pbest_obj.min()
(Pdb)

After we set a breakpoint with the b command, we can let the debugger run our program until the breakpoint is hit. The c command means to continue until a trigger is met. At any point, we can use the bt command to show the traceback and check how we got here. We can also use the p command to print variables (or an expression) to check what values they hold.

We can also place a breakpoint with a condition, so that it stops only if the condition is met. The example below imposes the condition that the first random number (r1) is greater than 0.5:
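The same b, c, bt, and p sequence can be reproduced on a much smaller program. The sketch below writes a hypothetical four-line script (demo.py, not part of this tutorial) to a temporary directory, runs it under python -m pdb, and pipes the debugger commands in through standard input instead of typing them.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical program to debug; line 2 is the body of step().
SOURCE = textwrap.dedent("""\
    def step(k):
        return k * k
    for k in range(3):
        step(k)
""")

# Commands fed to pdb as if typed at the (Pdb) prompt:
# set a breakpoint on line 2, continue, inspect, continue again, quit.
COMMANDS = "b 2\nc\np k\nbt\nc\np k\nq\n"

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "demo.py")
    with open(path, "w") as fh:
        fh.write(SOURCE)
    completed = subprocess.run(
        [sys.executable, "-m", "pdb", path],
        input=COMMANDS, capture_output=True, text=True, timeout=60,
    )

transcript = completed.stdout
print(transcript)
```

Because the commands are piped in rather than typed, they are not echoed in the transcript; only the prompts and pdb's responses appear. The breakpoint is hit once per call to step(), so each c brings the debugger back to line 2 with a new value of k.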

(Pdb) b 40, r1 > 0.5
Breakpoint 1 at /Users/mlm/pso.py:40
(Pdb) c
> /Users/mlm/pso.py(40)update()
-> obj = f(X[0], X[1])
(Pdb) p r1, r2
(0.8054505373292797, 0.7543489945823536)
(Pdb) c
> /Users/mlm/pso.py(40)update()
-> obj = f(X[0], X[1])
(Pdb) p r1, r2
(0.5404045753007164, 0.2967937508800147)
(Pdb)
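A conditional breakpoint can be tried on a small loop in the same spirit. This sketch, again using a made-up demo.py written to a temporary directory, sets the condition in two ways: inline with the b command, and afterwards with the condition command, which replaces the condition of an existing breakpoint number.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical program to debug; line 3 is the loop body.
SOURCE = textwrap.dedent("""\
    total = 0
    for i in range(6):
        total += i
    print(total)
""")

COMMANDS = "\n".join([
    "b 3, i == 3",         # stop at line 3 only when i equals 3
    "c",
    "p i, total",
    "condition 1 i == 4",  # change the condition of breakpoint 1
    "c",
    "p i, total",
    "q",
]) + "\n"

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "demo.py")
    with open(path, "w") as fh:
        fh.write(SOURCE)
    completed = subprocess.run(
        [sys.executable, "-m", "pdb", path],
        input=COMMANDS, capture_output=True, text=True, timeout=60,
    )

transcript = completed.stdout
print(transcript)
```

The debugger skips the iterations where the condition is false, so the first p command reports the state at i == 3 and the second at i == 4, in both cases before line 3 has added i to total.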

We can also manipulate variables while we are debugging:

(Pdb) l
 35         global V, X, pbest, pbest_obj, gbest, gbest_obj
 36         # Update params
 37         r1, r2 = np.random.rand(2)
 38         V = w * V + c1*r1*(pbest - X) + c2*r2*(gbest.reshape(-1,1)-X)
 39         X = X + V
 40 B->     obj = f(X[0], X[1])
 41         pbest[:, (pbest_obj >= obj)] = X[:, (pbest_obj >= obj)]
 42         pbest_obj = np.array([pbest_obj, obj]).min(axis=0)
 43         gbest = pbest[:, pbest_obj.argmin()]
 44         gbest_obj = pbest_obj.min()
 45
(Pdb) p V
array([[ 0.03742722, 0.20930531, 0.06273426, -0.1710678 , 0.33629384,
0.19506555, -0.10238065, -0.12707257, 0.28042122, -0.03250191,
-0.14004886, 0.13224399, 0.16083673, 0.21198813, 0.17530208,
-0.27665503, -0.15344393, 0.20079061, -0.10057509, 0.09128536],
[-0.05034548, -0.27986224, -0.30725954, 0.11214169, 0.0934514 ,
0.00335978, 0.20517519, 0.06308483, -0.22007053, 0.26176423,
-0.12617228, -0.05676629, 0.18296986, -0.01669114, 0.18934933,
-0.27623121, -0.32482898, 0.213894 , -0.34427909, -0.12058168]])
(Pdb) p r1, r2
(0.5404045753007164, 0.2967937508800147)
(Pdb) r1 = 0.2
(Pdb) p r1, r2
(0.2, 0.2967937508800147)
(Pdb) j 38
> /Users/mlm/pso.py(38)update()
-> V = w * V + c1*r1*(pbest - X) + c2*r2*(gbest.reshape(-1,1)-X)
(Pdb) n
> /Users/mlm/pso.py(39)update()
-> X = X + V
(Pdb) p V
array([[ 0.02680837, 0.16594979, 0.06350735, -0.15577623, 0.30737655,
0.19911613, -0.08242418, -0.12513798, 0.24939995, -0.02217463,
-0.13474876, 0.14466204, 0.16661846, 0.21194543, 0.16952298,
-0.24462505, -0.138997 , 0.19377154, -0.10699911, 0.10631063],
[-0.03606147, -0.25128615, -0.26362411, 0.08163408, 0.09842085,
0.00765688, 0.19771385, 0.06597805, -0.20564599, 0.23113388,
-0.0956787 , -0.07044121, 0.16637064, -0.00639259, 0.18245734,
-0.25698717, -0.30336147, 0.19354112, -0.29904698, -0.08810355]])
(Pdb)

In the above, we use the l command to list the code around the current statement (identified by the arrow ->). In the listing, we can also see that the breakpoint (marked with B) is set at line 40. Having seen the current values of V and r1, we modify r1 from 0.54 to 0.2 and run the statement on V again by using j (jump) to go to line 38. After we execute the statement with the n command, the value of V has changed.

If we hit a breakpoint and find something unexpected, chances are that it was caused by an issue at a different level of the call stack. Debuggers allow you to navigate to different levels:
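The effect of changing a variable and jumping back is easier to verify on a function whose result depends on that variable alone. The sketch below is a scripted pdb session on a hypothetical function compute(); the names rate and v_line are made up for illustration. Note that a variable named r would not work here, because r = 0.5 typed at the pdb prompt would be parsed as the r (return) command.

```python
import io
import pdb

def compute(rate):
    v = 10 * rate   # first body line, one line after the def
    x = v + 1
    return x

# Line number of "v = 10 * rate", derived from the code object so that the
# sketch does not depend on where in a file it is pasted.
v_line = compute.__code__.co_firstlineno + 1

commands = "\n".join([
    "n",            # run "v = 10 * rate" with the original rate
    "p v",
    f"j {v_line}",  # jump back so that the line will run again
    "rate = 0.5",   # overwrite the argument from the prompt
    "n",
    "p v",
    "c",            # let the function finish
]) + "\n"

transcript = io.StringIO()
debugger = pdb.Pdb(stdin=io.StringIO(commands), stdout=transcript,
                   nosigint=True, readrc=False)
result = debugger.runcall(compute, 2)

print(transcript.getvalue())
print("returned:", result)
```

The function is called with rate=2, yet it returns the value computed from the modified rate, because the assignment to v was executed a second time after the jump.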

(Pdb) bt
/usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9/bdb.py(580)run()
-> exec(cmd, globals, locals)
<string>(1)<module>()
/Users/mlm/pso.py(76)<module>()
-> anim.save("PSO.gif", dpi=120, writer="imagemagick")
/usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1091)save()
-> anim._draw_next_frame(d, blit=False)
/usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1126)_draw_next_frame()
-> self._draw_frame(framedata)
/usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1720)_draw_frame()
-> self._drawn_artists = self._func(framedata, *self._args)
/Users/mlm/pso.py(65)animate()
-> update()
> /Users/mlm/pso.py(39)update()
-> X = X + V
(Pdb) up
> /Users/mlm/pso.py(65)animate()
-> update()
(Pdb) bt
/usr/local/Cellar/python@3.9/3.9.9/Frameworks/Python.framework/Versions/3.9/lib/python3.9/bdb.py(580)run()
-> exec(cmd, globals, locals)
<string>(1)<module>()
/Users/mlm/pso.py(76)<module>()
-> anim.save("PSO.gif", dpi=120, writer="imagemagick")
/usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1091)save()
-> anim._draw_next_frame(d, blit=False)
/usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1126)_draw_next_frame()
-> self._draw_frame(framedata)
/usr/local/lib/python3.9/site-packages/matplotlib/animation.py(1720)_draw_frame()
-> self._drawn_artists = self._func(framedata, *self._args)
> /Users/mlm/pso.py(65)animate()
-> update()
/Users/mlm/pso.py(39)update()
-> X = X + V
(Pdb) l
 60
 61     def animate(i):
 62         "Steps of PSO: algorithm update and show in plot"
 63         title = 'Iteration {:02d}'.format(i)
 64         # Update params
 65  ->     update()
 66         # Set picture
 67         ax.set_title(title)
 68         pbest_plot.set_offsets(pbest.T)
 69         p_plot.set_offsets(X.T)
 70         p_arrow.set_offsets(X.T)
(Pdb) p title
'Iteration 02'
(Pdb)

In the above, the first bt command gives the call stack when we are at the bottom frame, i.e., the deepest level of the call stack. We can see that we are about to execute the statement X = X + V. Then the up command moves our focus one level up the call stack, to the line calling the update() function (as we see at the line preceded with >). Since our focus has changed, the list command l prints a different fragment of code, and the p command can examine a variable in a different scope.

The above covers most of the useful commands in the debugger. If we want to terminate the debugger (which also terminates the program), we can use the q command to quit, or hit Ctrl-D if your terminal supports it.
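How the scope follows the selected frame can be seen in a short scripted session. In this sketch, the functions inner() and outer() and the variable label are hypothetical. The variable label exists only in outer(), so p label fails while the focus is on inner() and succeeds after up.

```python
import io
import pdb

def inner(v):
    return v * 2

def outer():
    label = "outer-frame"
    return inner(21)

commands = "\n".join([
    "n",        # run the assignment to label
    "s",        # step into inner(): pdb reports --Call--
    "s",        # move to the first line of inner()
    "p label",  # fails: label is not defined in inner()'s scope
    "up",       # move the focus to the caller, outer()
    "p label",  # succeeds: the focus is now on outer()'s frame
    "down",     # move the focus back to inner()
    "p v",
    "c",
]) + "\n"

transcript = io.StringIO()
debugger = pdb.Pdb(stdin=io.StringIO(commands), stdout=transcript,
                   nosigint=True, readrc=False)
result = debugger.runcall(outer)

print(transcript.getvalue())
```

The up and down commands only change which frame the debugger is looking at. They do not execute any code, so the program continues from inner() when c is issued.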

Debugger in Visual Studio Code

If you are not comfortable running the debugger on the command line, you can rely on the debugger in your IDE. Almost every IDE provides some debugging facility. In Visual Studio Code, for example, you can launch the debugger from the “Run” menu.

The screen below shows Visual Studio Code in a debugging session. The buttons at the top center correspond to the pdb commands continue, next, step, return, restart, and quit, respectively. A breakpoint can be created by clicking next to a line number, where a red dot will appear to mark it. The bonus of using an IDE is that the variables are shown immediately at each debugging step. We can also watch an expression and show the call stack. These are on the left side of the screen below.

Using GDB on a running Python program

The pdb shown here (from Python 3.9) is suitable only for programs that are started under it from scratch. If we have a program that is already running but stuck, we cannot use this pdb to hook into it to check what’s going on. The Python extension for GDB, however, can do this.

To demonstrate, let’s consider a GUI application. It waits for the user’s action before the program can end, so it is a perfect example of how we can use gdb to hook into a running process. The code below is a “hello world” program using PyQt5 that just creates an empty window and waits for the user to close it:

import sys
from PyQt5.QtWidgets import QApplication, QWidget, QMainWindow

class Frame(QMainWindow):
    def __init__(self):
        super().__init__()
        self.initUI()
    def initUI(self):
        self.setWindowTitle("Simple title")
        self.resize(800,600)

def main():
    app = QApplication(sys.argv)
    frame = Frame()
    frame.show()
    sys.exit(app.exec_())

if __name__ == '__main__':
    main()

Let’s save this program as simpleqt.py and run it with the following command on Linux under an X Window environment:

python simpleqt.py &

The final & makes it run in the background. Now we can find its process ID using the ps command:

ps a | grep python


3997 pts/1 Sl 0:00 python simpleqt.py

The ps command shows the process ID in the first column. If you have gdb installed with the Python extension, you can run

gdb python 3997

and it will bring you to the GDB prompt:

GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"…
Reading symbols from python…
Reading symbols from /usr/lib/debug/.build-id/f9/02f8a561c3abdb9c8d8c859d4243bd8c3f928f.debug…
Attaching to program: /usr/local/bin/python, process 3997
[New LWP 3998]
[New LWP 3999]
[New LWP 4001]
[New LWP 4002]
[New LWP 4003]
[New LWP 4004]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fb11b1c93ff in __GI___poll (fds=0x7fb110007220, nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
29 ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) py-bt
Traceback (most recent call first):
  <built-in method exec_ of QApplication object at remote 0x7fb115f64c10>
  File "/mnt/data/simpleqt.py", line 16, in main
    sys.exit(app.exec_())
  File "/mnt/data/simpleqt.py", line 19, in <module>
    main()
(gdb) py-list
  11
  12    def main():
  13        app = QApplication(sys.argv)
  14        frame = Frame()
  15        frame.show()
 >16        sys.exit(app.exec_())
  17
  18    if __name__ == '__main__':
  19        main()
(gdb)

GDB is meant to be a debugger for compiled programs (usually written in C or C++). The Python extension allows you to inspect the code (written in Python) being run by the Python interpreter (which is written in C). It is less feature-rich than Python’s pdb in terms of handling Python code, but it is useful when you need to hook into a running process.

The commands supported under GDB are py-list, py-bt, py-up, py-down, and py-print. They are comparable to the pdb commands of the same names without the py- prefix.

GDB is useful if your Python code uses a library that is compiled from C (such as numpy) and you want to investigate how it runs. It is also useful for finding out why your program is frozen, by checking the call stack at run time. However, you will rarely need to use GDB to debug your machine learning project.

Further Readings

The documentation of Python’s pdb module is at

https://docs.python.org/3/library/pdb.html

But pdb is not the only debugger available. Some third-party tools are listed in:

Python Debugging Tools wiki page

GDB with the Python extension is most mature on Linux. Please see the following for more details on its usage:

Easier Python Debugging
Debugging with GDB

The command interface of pdb is influenced by that of GDB. Hence we can learn general debugging techniques from the latter. A good primer on how to use a debugger is

The Art of Debugging with GDB, DDD, and Eclipse, by Norman Matloff (2008)

Summary

In this tutorial, you discovered the features of Python’s pdb.

Specifically, you learned:

What pdb can do and how to use it
The limitations of pdb and its alternatives

In the next post, we will see that pdb is also a Python function that can be called inside a Python program.



The post Python debugging tools appeared first on Machine Learning Mastery.
