Setting the Baseline

We are now ready for some example code and experiments. Let’s get to work!

Our test function

We first define a function that we can use to benchmark our different options. All the following examples use the same function, called heavy:

def heavy(n, myid):
    for x in range(1, n):
        for y in range(1, n):
            x**y
    print(myid, "is done")

The heavy function is a nested loop that raises x to the power y (x**y) for every pair of values and throws the result away. It is a CPU-bound function: if you observe your system while it runs, you'll see CPU usage close to 100% on one core. You can replace it with anything you want, but beware of race conditions: don't use shared objects or variables.
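If you do want to swap in your own workload, here is a minimal sketch of another CPU-bound function that keeps all of its state in local variables, so nothing is shared between callers. The name heavy_local and the returned total are assumptions made for illustration; the rest of the examples keep using heavy.

```python
# Hypothetical alternative to heavy(): every variable is local to the call,
# so concurrent callers cannot interfere with each other.
def heavy_local(n, myid):
    total = 0  # local accumulator, not a shared or global object
    for x in range(1, n):
        for y in range(1, n):
            total += x * y
    print(myid, "is done")
    return total

if __name__ == "__main__":
    heavy_local(500, 0)
```

Returning the result instead of writing it to a global keeps the function safe to call from several threads or processes.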

We’ll run this function in different ways and explore the differences between a regular, single-threaded Python program, multithreading, and multiprocessing.

Option 1: The baseline

Each Python program has at least one thread: the main thread. Below you’ll find the single-threaded version, which serves as our baseline in terms of speed. It runs our heavy function 80 times, sequentially:

import time

# A CPU heavy calculation, just
# as an example. This can be
# anything you like
def heavy(n, myid):
    for x in range(1, n):
        for y in range(1, n):
            x**y
    print(myid, "is done")

def sequential(n):
    for i in range(n):
        heavy(500, i)

if __name__ == "__main__":
    start = time.time()
    sequential(80)
    end = time.time()
    print("Took: ", end - start)

On my system, this takes about 46 seconds to run to completion.

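Note that the if __name__ == "__main__": guard is required on Windows once we start using multiprocessing, because child processes import the main module again. This single-threaded version runs without it, but it's good form to always use it.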
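Timings like the one above vary a lot between machines, so it can help to time a single call first and extrapolate before committing to the full run. The following is a rough sketch of that idea; the helper name time_call is an assumption made for illustration, and it uses time.perf_counter, which is better suited to measuring short intervals than time.time.

```python
import time

# Same CPU-bound test function as above
def heavy(n, myid):
    for x in range(1, n):
        for y in range(1, n):
            x**y
    print(myid, "is done")

# Hypothetical helper: runs func once and returns the elapsed seconds
def time_call(func, *args):
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start

if __name__ == "__main__":
    single = time_call(heavy, 500, 0)
    print(f"One call took {single:.2f} s")
    print(f"Estimate for 80 sequential calls: {single * 80:.0f} s")
```

The estimate is only approximate, since other load on your system affects each individual run.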

