The Python GIL

Python has one peculiarity that makes concurrent programming harder: the GIL, short for Global Interpreter Lock. The GIL ensures that, at any given time, only one thread is executing Python code. Because of this, threads cannot spread CPU-bound Python code across multiple processors. But don’t worry, there’s a way around this.
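To see the effect for yourself, you can time a CPU-bound function run twice in a row and then run in two threads. The following is only a rough sketch: the function count_down and the loop size N are made up for illustration, and the exact timings depend on your machine and Python build. On a standard CPython build with the GIL, the two-thread version is typically no faster than the sequential one.

```python
import threading
import time


def count_down(n):
    # Pure-Python, CPU-bound loop; it never waits on I/O.
    while n > 0:
        n -= 1


N = 2_000_000  # arbitrary amount of work, chosen for illustration

# Run the work twice, one call after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same amount of work in two threads.
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, two threads: {threaded:.2f}s")
```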

The GIL was introduced because CPython’s memory management is not thread-safe. With only one thread running at a time, CPython can be sure that its internal bookkeeping is never corrupted by a race condition.

Race conditions? Thread-safety?

These terms might be new to you, so let's define them:

Race condition
A race condition occurs when multiple threads can access and change shared data at the same time, so that the result depends on the timing of those threads.

As mentioned already, threads share the same memory. With multiple threads running at the same time, we don’t know the order in which the threads access shared data. Therefore, the result of accessing shared data is dependent on the scheduling algorithm. This algorithm decides which thread runs when. Threads are “racing” to access/change the data.

Thread safety
Thread-safe code manipulates shared data only in ways that do not interfere with other threads.

As an example, let’s create a shared variable a, with a value of 2:

a = 2

Now suppose we have two threads, thread_one and thread_two. They perform the following operations:

  • thread_one: a = a + 2
  • thread_two: a = a * 3

If thread_one is able to access a first and thread_two second, the result will be:

  • a = 2 + 2, a is now 4.
  • a = 4 * 3, a is now 12.

However, if it so happens that thread_two runs first, and then thread_one, we get a different output:

  • a = 2 * 3, a is now 6
  • a = 6 + 2, a is now 8
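The two orderings above can be reproduced with real threads. This is a minimal sketch: the helper run_in_order is a made-up name, and it deliberately waits for each thread to finish before starting the next, so that the order is fixed rather than left to the scheduler.

```python
import threading

a = 2  # the shared variable


def thread_one():
    global a
    a = a + 2


def thread_two():
    global a
    a = a * 3


def run_in_order(first, second):
    """Reset a to 2, then run both functions in threads, one after the other."""
    global a
    a = 2
    for func in (first, second):
        t = threading.Thread(target=func)
        t.start()
        t.join()  # wait for this thread before starting the next one
    return a


print(run_in_order(thread_one, thread_two))  # 12
print(run_in_order(thread_two, thread_one))  # 8
```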

So the order of execution clearly matters for the output. There’s an even worse possible outcome, though! What if both threads read a at the same time, do their calculation, and then assign the new value? Both will see that a = 2, so thread_one computes 4 and thread_two computes 6. Whichever thread writes last overwrites the other’s result, so a ends up as either 4 or 6. Not what we expected! This is what we call a race condition.
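In practice this unlucky timing is rare and hard to reproduce, so the sketch below forces it. It assumes a threading.Barrier (named both_have_read here, purely for illustration) that makes each thread wait until the other has also read a before either one writes.

```python
import threading

a = 2  # the shared variable
both_have_read = threading.Barrier(2)  # forces the unlucky interleaving


def thread_one():
    global a
    seen = a               # read: sees 2
    both_have_read.wait()  # pause until the other thread has read too
    a = seen + 2           # write 4


def thread_two():
    global a
    seen = a               # read: also sees 2
    both_have_read.wait()
    a = seen * 3           # write 6


threads = [
    threading.Thread(target=thread_one),
    threading.Thread(target=thread_two),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(a)  # 4 or 6, depending on which thread wrote last
```

Neither 12 nor 8 can come out of this run, because each thread based its calculation on the stale value 2.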

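Race conditions are difficult to spot, especially for software engineers who are unfamiliar with these issues. They also tend to occur seemingly at random, causing erratic and unpredictable behavior, which makes these bugs notoriously difficult to find and debug. This is exactly why CPython has a GIL: it rules out this kind of bug inside the interpreter itself, which makes life easier for the majority of Python users. Keep in mind that the GIL does not make your own code thread-safe: an update such as a = a + 2 can still be interrupted between the read and the write.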
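A read-modify-write such as a = a + 2 is not a single atomic step, so application code usually protects shared data explicitly. One common approach is a lock. The following is a minimal sketch, assuming a threading.Lock (named a_lock here for illustration) that every thread acquires before touching a.

```python
import threading

a = 2  # the shared variable
a_lock = threading.Lock()  # guards every read-modify-write of a


def thread_one():
    global a
    with a_lock:  # only one thread at a time may be inside this block
        a = a + 2


def thread_two():
    global a
    with a_lock:
        a = a * 3


threads = [
    threading.Thread(target=thread_one),
    threading.Thread(target=thread_two),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(a)  # 12 or 8, depending on which thread got the lock first
```

The lock removes the lost update, so 4 and 6 can no longer occur. It does not fix the order of the two operations; if the order matters, the threads need additional coordination.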

But if the GIL holds us back in terms of concurrency, shouldn’t we get rid of it or be able to turn it off? It’s not that easy. Other features, libraries, and packages have come to rely on the GIL, so something must replace it or else the entire ecosystem will break. This turns out to be a difficult problem to solve. Since version 3.13, CPython does offer an experimental free-threaded build without the GIL (described in PEP 703), but it is optional and not the default. If it interests you, you can read more about this on the Python wiki.

