Multithreading and Multiprocessing in Python
This video visually explains how both work
Why can't threads run in parallel, but processes can?
This is Python-specific due to the Global Interpreter Lock (GIL):
Threads + GIL:
Python has a lock that allows only ONE thread to execute Python code at a time
Even with multiple threads, only one can run Python bytecode at any moment
They take turns very rapidly, which helps with I/O (while one waits, another runs)
But for CPU work (calculations), they can't truly work in parallel
Processes:
Each process has its own Python interpreter and its own GIL
They can run truly in parallel on different CPU cores
No shared GIL = no bottleneck
Note: Other languages (like Java, C++) don't have this limitation—their threads can run truly in parallel.
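To see the GIL in action, here is a minimal sketch (the count_down function and the iteration count are made up for this demo) that times the same CPU-bound loop run once sequentially and then split across two threads. Under CPython, both versions take roughly the same wall-clock time:

```python
import threading
import time

def count_down(n):
    # Pure-Python CPU work; the GIL is held the whole time
    while n > 0:
        n -= 1

N = 20_000_000

# Sequential baseline
start = time.perf_counter()
count_down(N)
print(f"sequential:  {time.perf_counter() - start:.2f}s")

# Two threads, half the work each -- still serialized by the GIL
t1 = threading.Thread(target=count_down, args=(N // 2,))
t2 = threading.Thread(target=count_down, args=(N // 2,))
start = time.perf_counter()
t1.start(); t2.start()
t1.join(); t2.join()
print(f"two threads: {time.perf_counter() - start:.2f}s  (about the same)")
```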
Hardware Level
This is roughly what happens when your OS manages processes and threads.
Your computer has:
CPU cores: Physical processing units (e.g., 4 cores, 8 cores)
RAM: Memory for storing data
Process (Managed by Operating System)
Think of it as an instance of a running program with its own resources (isolated memory space, file handles, threads, etc.)
The OS allocates RAM to it
Has at least one thread
Example: Opening Chrome creates a process, opening VS Code creates another process
Thread (Within a Process)
A sequence of instructions that can be scheduled by the OS; the smallest unit of CPU execution
Multiple threads share the same process's memory
The OS can run different threads on different CPU cores
Example: Chrome's main process might have one thread for the UI, one for network requests, one for rendering
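A minimal sketch of the memory-sharing difference (the results list and worker function are invented for illustration): a thread's write to module-level state is visible to the parent, while a process's write happens in its own copy of memory:

```python
import threading
import multiprocessing

results = []  # shared module-level state

def worker(label):
    results.append(label)

if __name__ == "__main__":
    # A thread shares the parent's memory: its append is visible here
    t = threading.Thread(target=worker, args=("from-thread",))
    t.start(); t.join()

    # A process gets its own interpreter and memory: its append
    # happens in the child's copy and is NOT visible here
    p = multiprocessing.Process(target=worker, args=("from-process",))
    p.start(); p.join()

    print(results)  # ['from-thread'] -- the process's write stayed in the child
```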
How Python's GIL Limits Multithreading
Only one Python thread executes at a time, even on multi-core CPUs.
When are threads faster/slower than processes?
Threads are FASTER when:
Startup time: Creating a thread takes microseconds; a process takes milliseconds
Memory usage: Threads share memory, processes duplicate it
Communication: Threads can share variables directly; processes need IPC (inter-process communication)
I/O-bound work: In Python, threads are perfect for waiting on I/O since the GIL releases during I/O operations
Processes are FASTER when:
CPU-bound work: Heavy calculations that need true parallelism
You have multiple CPU cores: Processes can use all cores simultaneously
Long-running tasks: The startup overhead becomes negligible
Example Timings (rough estimates; the sketches below print comparable numbers on your machine):
I/O-bound (network requests):
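A hedged sketch using time.sleep as a stand-in for a real network call (sleeping releases the GIL just as waiting on a socket does), so it runs without network access:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(i):
    # Stand-in for a network request; the GIL is released while sleeping,
    # just like it is while waiting on a socket
    time.sleep(0.5)
    return i

start = time.perf_counter()
for i in range(10):
    fake_request(i)
print(f"sequential: {time.perf_counter() - start:.2f}s")  # ~5s

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as pool:
    list(pool.map(fake_request, range(10)))
print(f"10 threads: {time.perf_counter() - start:.2f}s")   # ~0.5s
```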
CPU-bound (heavy calculations):
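And the CPU-bound counterpart: a sketch comparing a thread pool against a process pool on the same pure-Python workload (the heavy function is invented for this demo):

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def heavy(n):
    # Pure-Python arithmetic; holds the GIL for its whole run
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    work = [5_000_000] * 4

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(heavy, work))
    print(f"4 threads:   {time.perf_counter() - start:.2f}s  (GIL-bound)")

    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=4) as pool:
        list(pool.map(heavy, work))
    print(f"4 processes: {time.perf_counter() - start:.2f}s  (true parallelism)")
```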
What is thread-safe code?
Thread-safe code means code that works correctly when accessed by multiple threads simultaneously. Mutexes are one of the primary tools for achieving thread-safety.
Code becomes thread-safe when you properly protect shared mutable state. For example:
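Here is a minimal sketch, assuming a shared counter as the mutable state (SafeCounter is a name made up for this example):

```python
import threading

class SafeCounter:
    """A counter whose increment is protected by a mutex (threading.Lock)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # only one thread can be in here at a time
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value

counter = SafeCounter()
threads = [
    threading.Thread(target=lambda: [counter.increment() for _ in range(100_000)])
    for _ in range(4)
]
for t in threads: t.start()
for t in threads: t.join()
print(counter.value)  # always 400000
```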
A mutex (short for "mutual exclusion") is a synchronization primitive used in concurrent programming to protect shared resources from being accessed by multiple threads simultaneously.
The Problem Mutexes Solve
When multiple threads try to read and modify the same data at the same time, you can get race conditions. For example, imagine two threads both trying to increment a counter:
Thread A reads counter (value: 5)
Thread B reads counter (value: 5)
Thread A increments and writes (counter = 6)
Thread B increments and writes (counter = 6)
You expected 7, but got 6! The operations interleaved in a problematic way.
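A sketch that makes this race reproducible: the read and write are deliberately split, with time.sleep(0) in between to force a thread switch. Real races have the same shape but depend on unlucky timing:

```python
import threading
import time

counter = 0

def unsafe_increment():
    global counter
    for _ in range(100):
        current = counter        # 1. read
        time.sleep(0)            # yield, inviting another thread to interleave
        counter = current + 1    # 2. write back -- may clobber another thread's update

threads = [threading.Thread(target=unsafe_increment) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)  # expected 200, but usually far less: increments were lost
```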
How mutexes work
A mutex acts like a lock. Before accessing shared data, a thread must "acquire" or "lock" the mutex. When done, it "releases" or "unlocks" it. Key properties:
Only one thread can hold the lock at a time
Other threads trying to acquire a locked mutex will block (wait) until it's released
This ensures only one thread accesses the protected resource at a time
Basic Pattern
While one thread holds the mutex, all other threads attempting to acquire it must wait, preventing simultaneous access to the critical section of code.
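In Python the mutex is threading.Lock. A minimal sketch of the pattern, in both the explicit and the idiomatic form (shared_data stands in for whatever resource you are protecting):

```python
import threading

lock = threading.Lock()
shared_data = []

# Explicit form: acquire, then always release (even on error)
lock.acquire()
try:
    shared_data.append("explicit")   # critical section
finally:
    lock.release()

# Idiomatic form: the with-statement acquires and releases automatically
with lock:
    shared_data.append("idiomatic")  # critical section
```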
However, mutexes aren't the only way to achieve thread-safety. Other approaches include:
Immutable data - if data never changes, there's nothing to protect
Thread-local storage - each thread has its own copy (see the sketch after this list)
Atomic operations - special CPU instructions that are inherently thread-safe
Lock-free data structures - clever algorithms that avoid locks entirely
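Here is the thread-local storage approach from the list above, sketched with Python's threading.local (the worker function and names are invented for illustration):

```python
import threading

local = threading.local()  # each thread sees its own attributes on this object

def worker(name):
    local.name = name      # no lock needed: this attribute is per-thread
    print(f"{threading.current_thread().name} sees {local.name}")

threads = [threading.Thread(target=worker, args=(f"task-{i}",)) for i in range(3)]
for t in threads: t.start()
for t in threads: t.join()
```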
Why do many developers shy away from multithreading?
Many experienced developers are cautious about multithreading because it's genuinely difficult:
1. Hard to Debug - Race conditions are non-deterministic. A bug might appear randomly, be impossible to reproduce, and disappear when you add logging (because logging changes timing).
2. Deadlocks - When threads wait for each other's locks in a cycle, everything freezes. These can be subtle and occur rarely in production (see the sketch after this list).
3. Performance Pitfalls - Multithreading doesn't automatically make things faster. Lock contention, context switching, and coordination overhead can actually slow things down. Sometimes single-threaded code is faster.
4. Complexity - Reasoning about all possible interleavings of operations is mentally exhausting. The code becomes harder to understand and maintain.
5. Alternatives Exist - Async/await, event loops, and message-passing models (like Go's goroutines or Erlang's actors) often provide easier concurrency without shared memory.
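To make point 2 concrete, a sketch of the classic two-lock deadlock (the worker names and the 0.1s sleeps are contrived to force the bad interleaving reliably):

```python
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def worker_1():
    with lock_a:
        time.sleep(0.1)   # give worker_2 time to grab lock_b
        with lock_b:      # blocks forever: worker_2 holds lock_b
            pass

def worker_2():
    with lock_b:
        time.sleep(0.1)   # give worker_1 time to grab lock_a
        with lock_a:      # blocks forever: worker_1 holds lock_a
            pass

t1 = threading.Thread(target=worker_1, daemon=True)
t2 = threading.Thread(target=worker_2, daemon=True)
t1.start(); t2.start()
t1.join(timeout=2); t2.join(timeout=2)
print("deadlocked:", t1.is_alive() and t2.is_alive())  # True: a cyclic wait
# The standard fix: always acquire locks in one global order (lock_a before lock_b)
```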
The general wisdom is: avoid shared mutable state when possible, and only use multithreading when you genuinely need parallel execution and the benefits outweigh the complexity costs.