Python Threading: Concurrent Execution
Key Insights
- Python threading excels at I/O-bound tasks like network requests or file operations, but the Global Interpreter Lock (GIL) prevents true parallel execution of CPU-bound code across multiple threads.
- Race conditions arise whenever multiple threads mutate shared state without synchronization—always protect mutable state with locks or use thread-safe data structures like Queue.
- ThreadPoolExecutor provides a cleaner, more maintainable interface than raw threads for most concurrent workloads, with built-in resource management and future-based result handling.
Introduction to Threading Basics
Threading enables concurrent execution within a single process, allowing your Python programs to handle multiple operations simultaneously. Understanding when to use threading requires distinguishing between concurrency and parallelism.
Concurrency means managing multiple tasks that make progress during overlapping time periods. Parallelism means executing multiple tasks simultaneously on different CPU cores. Python’s Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, which means threading provides concurrency but not true parallelism for CPU-bound work.
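To see the GIL's effect on CPU-bound work, here is a minimal sketch (timings are illustrative and assume the standard CPython interpreter; the countdown loop is just stand-in CPU work):

```python
import threading
import time

def count_down(n):
    # Pure-Python CPU work: no I/O, so threads gain nothing under the GIL
    while n > 0:
        n -= 1

N = 2_000_000

# Sequential: two calls, one after the other
start = time.time()
count_down(N)
count_down(N)
sequential = time.time() - start

# Threaded: the same work split across two threads
start = time.time()
t1 = threading.Thread(target=count_down, args=(N,))
t2 = threading.Thread(target=count_down, args=(N,))
t1.start(); t2.start()
t1.join(); t2.join()
threaded = time.time() - start

print(f"Sequential: {sequential:.2f}s")
print(f"Threaded:   {threaded:.2f}s")  # Typically no faster -- the GIL serializes the bytecode
```

The threaded version is usually no faster (and sometimes slower, due to switching overhead), which is exactly the opposite of the I/O-bound result shown next.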
Threading shines for I/O-bound tasks—operations that spend time waiting for external resources like network responses, disk reads, or database queries. While one thread waits for I/O, other threads can execute, dramatically improving throughput.
Here’s a concrete comparison:
import time
import threading
import requests

def fetch_url(url):
    response = requests.get(url)
    return len(response.content)

urls = [
    'https://www.python.org',
    'https://www.github.com',
    'https://www.stackoverflow.com',
    'https://www.reddit.com'
]

# Single-threaded approach
start = time.time()
for url in urls:
    fetch_url(url)
single_threaded_time = time.time() - start

# Multi-threaded approach
start = time.time()
threads = []
for url in urls:
    thread = threading.Thread(target=fetch_url, args=(url,))
    thread.start()
    threads.append(thread)
for thread in threads:
    thread.join()
multi_threaded_time = time.time() - start

print(f"Single-threaded: {single_threaded_time:.2f}s")
print(f"Multi-threaded: {multi_threaded_time:.2f}s")
The multi-threaded version typically completes 3-4x faster because threads can wait for network I/O concurrently.
Creating and Starting Threads
Python’s threading module provides the Thread class for creating threads. You can create threads in two ways: by passing a target function or by subclassing Thread.
The target function approach is straightforward and preferred for simple cases:
import threading
import time

def worker(name, delay):
    print(f"Thread {name} starting")
    time.sleep(delay)
    print(f"Thread {name} finished after {delay}s")

# Create threads with target function
t1 = threading.Thread(target=worker, args=("A", 2))
t2 = threading.Thread(target=worker, args=("B", 1))

# Start threads
t1.start()
t2.start()

# Wait for completion
t1.join()
t2.join()

print("All threads completed")
For more complex behavior, subclass Thread and override the run() method:
import threading
import time

class WorkerThread(threading.Thread):
    def __init__(self, name, delay):
        super().__init__()
        self.worker_name = name
        self.delay = delay
        self.result = None

    def run(self):
        print(f"Worker {self.worker_name} starting")
        time.sleep(self.delay)
        self.result = f"Processed by {self.worker_name}"
        print(f"Worker {self.worker_name} finished")

# Create and start threads
threads = [WorkerThread("Worker-1", 1), WorkerThread("Worker-2", 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
    print(f"Result: {t.result}")
The start() method initiates the thread, which calls run() in a separate execution context. The join() method blocks until the thread completes, ensuring proper cleanup.
Thread Synchronization with Locks
When multiple threads access shared mutable state, race conditions occur. Consider this broken counter:
import threading

counter = 0

def increment():
    global counter
    for _ in range(100000):
        counter += 1

threads = [threading.Thread(target=increment) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"Counter: {counter}")  # Expected: 500000, Actual: varies!
The counter can end up less than 500,000 because counter += 1 isn’t atomic—it compiles to separate read, increment, and write steps. If a thread switch lands between the read and the write, two threads read the same value and one update is lost. (How often this actually happens varies with the CPython version and timing, but the race is real.)
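The non-atomicity is visible in the bytecode. A quick way to inspect it (the exact opcode names vary across Python versions, but there are always separate load, add, and store instructions):

```python
import dis

counter = 0

def increment_once():
    global counter
    counter += 1

# Show the individual bytecode steps behind `counter += 1`
dis.dis(increment_once)
```

A thread switch between any two of those instructions is what loses an update.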
Fix this with a Lock:
import threading

counter = 0
counter_lock = threading.Lock()

def increment():
    global counter
    for _ in range(100000):
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=increment) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"Counter: {counter}")  # Always 500000
The with counter_lock statement acquires the lock before entering the block and releases it afterward, ensuring only one thread modifies the counter at a time.
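The with statement is shorthand for an explicit acquire/release pair wrapped in try/finally; a minimal sketch of the equivalent:

```python
import threading

counter = 0
counter_lock = threading.Lock()

def increment():
    global counter
    # Equivalent to `with counter_lock:` -- the context manager
    # is shorthand for this acquire/try/finally/release pattern
    counter_lock.acquire()
    try:
        counter += 1
    finally:
        counter_lock.release()

increment()
print(counter)  # 1
```

The try/finally guarantees the lock is released even if the critical section raises, which is why the with form is preferred.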
Use RLock (reentrant lock) when a thread needs to acquire the same lock multiple times:
import threading

class BankAccount:
    def __init__(self):
        self.balance = 0
        self.lock = threading.RLock()

    def deposit(self, amount):
        with self.lock:
            self.balance += amount

    def transfer_from(self, other_account, amount):
        with self.lock:                   # Acquires this account's lock
            with other_account.lock:      # Acquires the source account's lock
                other_account.balance -= amount
                self.deposit(amount)      # deposit() re-acquires self.lock -- needs RLock
Thread Communication and Coordination
Threads often need to coordinate actions or exchange data. Python provides several synchronization primitives.
The Queue class provides thread-safe FIFO data exchange, perfect for producer-consumer patterns:
import threading
import queue
import time
import random

def producer(q, producer_id):
    for i in range(5):
        item = f"Item-{producer_id}-{i}"
        time.sleep(random.uniform(0.1, 0.5))
        q.put(item)
        print(f"Producer {producer_id} produced {item}")

def consumer(q, consumer_id):
    while True:
        item = q.get()
        if item is None:  # Sentinel: no more work
            break
        time.sleep(random.uniform(0.1, 0.3))
        print(f"Consumer {consumer_id} consumed {item}")

# Create queue
work_queue = queue.Queue()

# Start producers and consumers
producers = [threading.Thread(target=producer, args=(work_queue, i))
             for i in range(2)]
consumers = [threading.Thread(target=consumer, args=(work_queue, i))
             for i in range(3)]

for t in producers + consumers:
    t.start()

# Wait for all producers to finish, then send one sentinel per consumer.
# (Sending sentinels from inside a producer is fragile: a consumer could
# shut down while another producer is still adding items.)
for t in producers:
    t.join()
for _ in consumers:
    work_queue.put(None)
for t in consumers:
    t.join()
Event objects allow threads to signal each other:
import threading
import time

event = threading.Event()

def waiter():
    print("Waiting for event...")
    event.wait()  # Blocks until event is set
    print("Event received, proceeding!")

def setter():
    time.sleep(2)
    print("Setting event")
    event.set()

threading.Thread(target=waiter).start()
threading.Thread(target=setter).start()
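Beyond Queue and Event, the module also provides Semaphore for capping how many threads run a section at once. A small sketch (the peak-tracking counters here are just instrumentation to make the limit visible):

```python
import threading
import time

# Allow at most 2 threads inside the guarded section at a time
semaphore = threading.Semaphore(2)
active = 0
peak = 0
active_lock = threading.Lock()

def limited_worker(worker_id):
    global active, peak
    with semaphore:  # Blocks if 2 workers are already inside
        with active_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.2)  # Simulate work
        with active_lock:
            active -= 1

threads = [threading.Thread(target=limited_worker, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"Peak concurrent workers: {peak}")  # Never more than 2
```

This pattern is handy for rate-limiting, e.g. capping simultaneous connections to a server.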
Thread Pools with concurrent.futures
Managing individual threads becomes unwieldy for large workloads. ThreadPoolExecutor provides a high-level interface for thread pools:
from concurrent.futures import ThreadPoolExecutor, as_completed
import requests
import time

def fetch_url(url):
    start = time.time()
    response = requests.get(url)
    duration = time.time() - start
    return {
        'url': url,
        'status': response.status_code,
        'size': len(response.content),
        'duration': duration
    }

urls = [
    'https://www.python.org',
    'https://www.github.com',
    'https://www.stackoverflow.com',
    'https://www.reddit.com',
    'https://www.wikipedia.org'
]

# Using map() for ordered results
with ThreadPoolExecutor(max_workers=3) as executor:
    results = executor.map(fetch_url, urls)
    for result in results:
        print(f"{result['url']}: {result['status']} ({result['size']} bytes)")

# Using submit() for more control
with ThreadPoolExecutor(max_workers=3) as executor:
    futures = {executor.submit(fetch_url, url): url for url in urls}
    for future in as_completed(futures):
        url = futures[future]
        try:
            result = future.result()
            print(f"Completed {url}: {result['duration']:.2f}s")
        except Exception as e:
            print(f"Failed {url}: {e}")
The context manager ensures proper cleanup. map() maintains input order while submit() with as_completed() processes results as they finish.
Best Practices and Common Pitfalls
Always use context managers with ThreadPoolExecutor to ensure threads are properly cleaned up:
from concurrent.futures import ThreadPoolExecutor

# Good: Context manager handles cleanup
with ThreadPoolExecutor(max_workers=4) as executor:
    results = executor.map(process_item, items)

# Avoid: Manual cleanup required
executor = ThreadPoolExecutor(max_workers=4)
results = executor.map(process_item, items)
executor.shutdown(wait=True)  # Easy to forget
Handle exceptions inside thread targets explicitly—an unhandled exception kills only its own thread. CPython reports it via threading.excepthook (by default printing a traceback to stderr), but it never propagates to the main thread, so your program carries on as if nothing failed:
import threading
import traceback

def risky_operation():
    try:
        # Your code here
        raise ValueError("Something went wrong")
    except Exception:
        traceback.print_exc()  # Print the exception

thread = threading.Thread(target=risky_operation)
thread.start()
thread.join()
Understand daemon threads—they terminate when the main program exits:
import threading
import time

def background_task():
    while True:
        print("Working...")
        time.sleep(1)

# Daemon thread won't prevent program exit
t = threading.Thread(target=background_task, daemon=True)
t.start()
time.sleep(3)  # Program exits after 3 seconds
Choose the right concurrency model. Use threading for I/O-bound tasks, multiprocessing for CPU-bound work, and asyncio for high-concurrency I/O with less overhead. Threading adds memory overhead (each thread needs its own stack) and introduces complexity through shared state.
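For the CPU-bound case, swapping ThreadPoolExecutor for ProcessPoolExecutor gives each worker its own interpreter and GIL. A sketch (the worker must be a picklable, top-level function, and the pool must be created under the __main__ guard on platforms that spawn processes):

```python
from concurrent.futures import ProcessPoolExecutor
import math

def cpu_heavy(n):
    # CPU-bound work: benefits from real parallelism across processes
    return sum(math.sqrt(i) for i in range(n))

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(cpu_heavy, [200_000] * 4))
    print(f"Computed {len(results)} results in parallel")
```

The API mirrors ThreadPoolExecutor, so switching between the two is often a one-line change once the picklability constraint is met.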
Avoid deadlocks by always acquiring locks in the same order and using timeouts:
import threading

lock1 = threading.Lock()
lock2 = threading.Lock()

def safe_operation():
    # Always acquire locks in the same order
    with lock1:
        with lock2:
            # Critical section
            pass
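For the timeout half of that advice, Lock.acquire accepts a timeout argument, so a thread can back off instead of blocking forever; a minimal sketch:

```python
import threading

lock = threading.Lock()

def cautious_operation():
    # acquire() with a timeout returns False instead of blocking forever
    if lock.acquire(timeout=1.0):
        try:
            pass  # Critical section
        finally:
            lock.release()
        return True
    return False  # Could not get the lock; back off, retry, or report

print(cautious_operation())  # True -- the lock is uncontended here
```

A False return is a signal to release any locks already held and retry, which breaks the hold-and-wait condition that deadlocks require.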
Threading is powerful for I/O-bound concurrency, but requires careful attention to synchronization and resource management. Start with ThreadPoolExecutor for most use cases, and only drop down to raw threads when you need fine-grained control.