Multitasking can be accomplished by multiple processes or by multiple threads within a single process.
As mentioned earlier, a process consists of one or more threads; every process has at least one thread.
Since threads are execution units directly supported by the operating system, high-level programming languages typically have built-in support for multithreading, and Python is no exception. Moreover, Python threads are real operating-system threads (POSIX threads on Unix-like systems), not emulated ones.
Python’s standard library provides two modules for threading: _thread and threading. _thread is a low-level module, while threading is a high-level module that encapsulates _thread. In most cases, we only need to use the high-level threading module.
To start a thread, pass a function to create a Thread instance and then call start() to execute it:
import time, threading

# Code executed by the new thread:
def loop():
    print('thread %s is running...' % threading.current_thread().name)
    n = 0
    while n < 5:
        n = n + 1
        print('thread %s >>> %s' % (threading.current_thread().name, n))
        time.sleep(1)
    print('thread %s ended.' % threading.current_thread().name)

print('thread %s is running...' % threading.current_thread().name)
t = threading.Thread(target=loop, name='LoopThread')
t.start()
t.join()
print('thread %s ended.' % threading.current_thread().name)
Execution result:
thread MainThread is running...
thread LoopThread is running...
thread LoopThread >>> 1
thread LoopThread >>> 2
thread LoopThread >>> 3
thread LoopThread >>> 4
thread LoopThread >>> 5
thread LoopThread ended.
thread MainThread ended.
Every process starts with one thread by default, which we call the main thread, and the main thread can in turn start new threads. Python's threading module provides a current_thread() function that always returns the instance of the current thread. The main thread's instance is named MainThread, while child threads get their names at creation time (we named ours LoopThread). The name is used only for display and has no other meaning; if you don't specify one, Python automatically assigns names like Thread-1, Thread-2, and so on.
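A quick sketch of the default naming (note that recent CPython versions may append the target function's name, e.g. 'Thread-1 (work)'):

```python
import threading

def work():
    pass

# No name is passed, so Python assigns one automatically:
t = threading.Thread(target=work)
print(t.name)  # e.g. 'Thread-1' (newer Pythons may append '(work)')
print(threading.current_thread().name)  # 'MainThread' when run as a script
```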
The biggest difference between multithreading and multiprocessing is: in multiprocessing, each process has its own copy of the same variable (mutually independent), while in multithreading, all variables are shared by all threads. Therefore, any variable can be modified by any thread. The greatest danger of shared data between threads is that multiple threads modifying the same variable simultaneously can corrupt its value.
Let’s see how concurrent modification of a variable by multiple threads can corrupt its value:
# multithread
import time, threading

# Assume this is your bank balance:
balance = 0

def change_it(n):
    # Deposit first, then withdraw—result should be 0:
    global balance
    balance = balance + n
    balance = balance - n

def run_thread(n):
    for i in range(10000000):
        change_it(n)

t1 = threading.Thread(target=run_thread, args=(5,))
t2 = threading.Thread(target=run_thread, args=(8,))
t1.start()
t2.start()
t1.join()
t2.join()
print(balance)
We define a shared variable balance with an initial value of 0 and start two threads that deposit and withdraw money. Theoretically, the result should be 0. However, since thread scheduling is determined by the operating system, if t1 and t2 execute alternately and the loop runs enough times, the result of balance may not be 0.
The reason is that a single statement in a high-level language translates to several instructions when executed by the CPU. Even a simple calculation:
balance = balance + n
is split into at least two steps:

1. Compute balance + n and store the result in a temporary variable;
2. Assign the temporary variable's value back to balance.

This can be visualized as:
x = balance + n
balance = x
Since x is a local variable, each thread has its own x. When the code executes normally:
Initial value: balance = 0
t1: x1 = balance + 5 # x1 = 0 + 5 = 5
t1: balance = x1 # balance = 5
t1: x1 = balance - 5 # x1 = 5 - 5 = 0
t1: balance = x1 # balance = 0
t2: x2 = balance + 8 # x2 = 0 + 8 = 8
t2: balance = x2 # balance = 8
t2: x2 = balance - 8 # x2 = 8 - 8 = 0
t2: balance = x2 # balance = 0
Result: balance = 0
But if t1 and t2 run alternately, and the operating system executes them in the following order:
Initial value: balance = 0
t1: x1 = balance + 5 # x1 = 0 + 5 = 5
t2: x2 = balance + 8 # x2 = 0 + 8 = 8
t2: balance = x2 # balance = 8
t1: balance = x1 # balance = 5
t1: x1 = balance - 5 # x1 = 5 - 5 = 0
t1: balance = x1 # balance = 0
t2: x2 = balance - 8 # x2 = 0 - 8 = -8
t2: balance = x2 # balance = -8
Result: balance = -8
The root cause is that modifying balance requires multiple statements, and a thread may be interrupted while executing these statements—causing multiple threads to corrupt the same variable’s value.
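You can verify the "multiple instructions" claim with the standard dis module, which disassembles a function into CPython bytecode (the exact opcode names vary across Python versions, but the load, add, and store are always separate instructions):

```python
import dis

balance = 0

def change_it(n):
    global balance
    balance = balance + n

# Prints one line per bytecode instruction: a LOAD of balance,
# a LOAD of n, an add, and a separate STORE back to balance.
# A thread can be preempted between any two of these.
dis.dis(change_it)
```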
Concurrent deposit and withdrawal operations by two threads can lead to incorrect balances. You certainly wouldn’t want your bank balance to inexplicably become negative, so we must ensure that when one thread modifies balance, no other thread can modify it simultaneously.
To ensure correct calculation of balance, we need to add a lock to change_it(). When a thread starts executing change_it(), it acquires the lock, preventing other threads from executing change_it() concurrently—they must wait until the lock is released and they acquire it themselves. Since there is only one lock, at most one thread can hold it at any time, eliminating modification conflicts. A lock is created using threading.Lock():
balance = 0
lock = threading.Lock()

def run_thread(n):
    for i in range(100000):
        # Acquire the lock first:
        lock.acquire()
        try:
            # Modify safely:
            change_it(n)
        finally:
            # Always release the lock after modification:
            lock.release()
When multiple threads execute lock.acquire() simultaneously, only one thread can successfully acquire the lock and continue execution—other threads wait until the lock becomes available.
A thread that acquires the lock must release it when done; otherwise, the threads waiting for it will block forever. We use try...finally to ensure the lock is always released.
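Since Lock objects are context managers, the acquire/try/finally/release pattern can be written more concisely with a with statement, which guarantees the release even if change_it() raises an exception. Here is the same example in that style:

```python
import threading

balance = 0
lock = threading.Lock()

def change_it(n):
    global balance
    balance = balance + n
    balance = balance - n

def run_thread(n):
    for i in range(100000):
        # 'with' acquires the lock on entry and releases it on exit:
        with lock:
            change_it(n)

t1 = threading.Thread(target=run_thread, args=(5,))
t2 = threading.Thread(target=run_thread, args=(8,))
t1.start()
t2.start()
t1.join()
t2.join()
print(balance)  # always 0 now
```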
The advantage of locks is the guarantee that a critical section of code is executed in full by only one thread from start to finish. However, locks also have significant disadvantages: they prevent the locked code from executing concurrently, so a section that could run in parallel effectively runs single-threaded and performance drops. Also, since a program can contain multiple locks, threads that acquire several locks in different orders may end up waiting for each other forever, causing a deadlock that hangs all of them and can only be resolved by the operating system forcibly terminating the process.
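The deadlock risk can be sketched with two locks acquired in opposite orders. A real deadlocked program would simply hang on second.acquire(); this demo uses a timeout so it can report the problem and exit instead (the Barrier just ensures both threads hold their first lock before trying the second):

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
barrier = threading.Barrier(2)  # makes both threads hold their first lock
report = []

def worker(first, second, name):
    with first:              # grab one lock...
        barrier.wait()       # ...wait until the other thread has its lock...
        # ...then try to grab the lock the other thread is holding.
        # A real program would block here forever; the timeout lets
        # this demo detect and report the deadlock instead.
        if second.acquire(timeout=0.3):
            second.release()
        else:
            report.append('%s would deadlock' % name)

t1 = threading.Thread(target=worker, args=(lock_a, lock_b, 'thread-1'))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a, 'thread-2'))
t1.start()
t2.start()
t1.join()
t2.join()
print(report)
```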
If you have a multi-core CPU, you may wonder whether multiple cores can execute multiple threads simultaneously. What happens if we write an infinite loop? Open Activity Monitor (macOS) or Task Manager (Windows) and watch the process's CPU usage: a single infinite-loop thread pins one core at 100%, two infinite-loop threads on a multi-core CPU show 200% (two cores), and to fully utilize an N-core CPU you would need N infinite-loop threads.
Try writing an infinite loop in Python:
import threading, multiprocessing

def loop():
    x = 0
    while True:
        x = x ^ 1

for i in range(multiprocessing.cpu_count()):
    t = threading.Thread(target=loop)
    t.start()
Starting N threads (matching the number of CPU cores) results in only ~102% CPU usage on a 4-core CPU (utilizing just one core).
However, rewriting the same infinite loop in C, C++, or Java fully utilizes all cores—400% on 4 cores, 800% on 8 cores. Why doesn’t Python do this?
Although Python threads are real threads, the CPython interpreter enforces a GIL (Global Interpreter Lock): before any Python thread executes bytecode, it must first acquire the GIL. The interpreter periodically releases the GIL to let other threads run (old CPython versions switched after every 100 bytecode instructions; modern Python 3 switches after a short time interval, 5 ms by default). This global lock effectively serializes the execution of Python bytecode across all threads, so even 100 threads on a 100-core CPU can only utilize one core.
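In modern CPython you can inspect (and tune) the time-based switch interval through the sys module; it replaced the old 100-bytecode check counter:

```python
import sys

# Default in CPython 3 is 0.005 seconds (5 ms):
print(sys.getswitchinterval())

# It can be changed, e.g. to switch between threads less often:
sys.setswitchinterval(0.01)
print(sys.getswitchinterval())  # e.g. 0.01
```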
The GIL is a historical design flaw in the official CPython interpreter. To truly leverage multi-core processing, you would need to rewrite the interpreter without the GIL.
Therefore, while Python supports multithreading, it cannot effectively utilize multi-core CPUs. If you must use multithreading for multi-core processing, you would need to implement it via C extensions—sacrificing Python’s simplicity and ease of use.
Fortunately, Python can achieve multi-core processing through multiprocessing (not multithreading). Multiple Python processes have independent GIL locks, avoiding mutual interference.
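A minimal sketch of that multiprocessing approach, using a Pool and a finite busy loop (instead of an infinite one, so it terminates): each worker is a separate process with its own interpreter and its own GIL, so the loops really do run on separate cores.

```python
from multiprocessing import Pool, cpu_count

def busy(n):
    # CPU-bound work: n XOR operations, like the loop above but finite
    x = 0
    for _ in range(n):
        x = x ^ 1
    return x

def main():
    # One worker process per core; each has its own interpreter and GIL
    with Pool(cpu_count()) as pool:
        return pool.map(busy, [1_000_000] * cpu_count())

if __name__ == '__main__':
    print(main())  # a list of zeros, one per core
```

Watching CPU usage while this runs should show all cores busy, unlike the threaded version.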
Here is the complete, runnable version of the lock example:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import time, threading
balance = 0
lock = threading.Lock()
def change_it(n):
    global balance
    balance = balance + n
    balance = balance - n

def run_thread(n):
    for i in range(100000):
        lock.acquire()
        try:
            change_it(n)
        finally:
            lock.release()
t1 = threading.Thread(target=run_thread, args=(5,))
t2 = threading.Thread(target=run_thread, args=(8,))
t1.start()
t2.start()
t1.join()
t2.join()
print(balance)