Condition Variables: Thread Synchronization
Key Insights
- Condition variables allow threads to sleep efficiently until a specific condition becomes true, eliminating wasteful busy-waiting that burns CPU cycles
- Always check conditions in a while loop, never an if statement—spurious wakeups are real and will cause subtle bugs in production
- Pair every condition variable with both a mutex and a predicate; the mutex protects the shared state, the predicate defines when threads should proceed
Introduction to Condition Variables
Condition variables solve a fundamental problem in concurrent programming: how do you make a thread wait for something to happen without burning CPU cycles? The naive approach—spinning in a loop checking a flag—works but wastes enormous resources. A thread checking a boolean a million times per second accomplishes nothing productive while starving other threads of CPU time.
Condition variables provide an efficient alternative. They let a thread say “wake me up when something interesting happens” and then go to sleep. The operating system removes the thread from the scheduler entirely until another thread signals that the condition might have changed. This is the foundation of nearly every higher-level synchronization primitive you’ll encounter.
The Problem: Coordinating Thread Communication
Mutexes protect shared data from concurrent access, but they don’t help threads coordinate. Consider these scenarios:
- A worker thread needs to wait for tasks to appear in a queue
- A producer needs to pause when a buffer is full
- Multiple threads need to wait for an initialization phase to complete
- A thread pool needs to wake exactly one idle worker when work arrives
In each case, one thread needs to wait for a condition that another thread will make true. Mutexes alone can’t express this. You could poll:
// DON'T DO THIS - burns CPU and holds the lock
void bad_consumer(std::queue<Task>& queue, std::mutex& mtx) {
    while (true) {
        std::lock_guard<std::mutex> lock(mtx);
        if (!queue.empty()) {
            Task t = queue.front();
            queue.pop();
            process(t);
        }
        // Lock released, immediately re-acquired - spinning
    }
}
This code has two problems. First, it holds the mutex almost continuously, blocking producers. Second, when the queue is empty, it spins uselessly. On a 4-core machine with 4 consumer threads all spinning on an empty queue, you’ve consumed 100% CPU doing nothing.
Anatomy of a Condition Variable
A condition variable provides three operations:
- Wait: Atomically release the mutex and sleep until signaled
- Signal (notify_one): Wake one waiting thread
- Broadcast (notify_all): Wake all waiting threads
The critical insight is that wait and the mutex work together atomically. When you call wait, the mutex is released and the thread sleeps as a single atomic operation. This prevents a race where a signal could be sent between releasing the lock and going to sleep.
#include <condition_variable>
#include <mutex>
#include <queue>

class TaskQueue {
    std::queue<Task> queue_;
    std::mutex mutex_;
    std::condition_variable not_empty_;

public:
    void push(Task t) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(t));
        }
        not_empty_.notify_one();  // Wake one waiting consumer
    }

    Task pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        // Wait until queue is not empty
        while (queue_.empty()) {
            not_empty_.wait(lock);  // Releases lock, sleeps, reacquires
        }
        Task t = std::move(queue_.front());
        queue_.pop();
        return t;
    }
};
Notice that pop() uses std::unique_lock rather than lock_guard. This is required because wait() needs to release and reacquire the lock, which lock_guard doesn’t support.
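A minimal sketch makes the difference concrete (the `demo` function is just a throwaway illustration): `std::unique_lock` exposes exactly the unlock/relock operations that `wait()` performs internally, while `lock_guard` has neither member.

```cpp
#include <mutex>

std::mutex m;

// Illustrative only: unique_lock can release and reacquire its mutex
// after construction — the same operations wait() performs internally.
bool demo() {
    std::unique_lock<std::mutex> lock(m);
    lock.unlock();            // unique_lock supports manual unlock...
    lock.lock();              // ...and relock; lock_guard has no such members
    return lock.owns_lock();  // we hold the mutex again
}
```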
The Spurious Wakeup Problem
You might wonder why pop() uses a while loop instead of an if statement. The answer is spurious wakeups: threads can wake from wait() without anyone calling notify_one() or notify_all().
This isn’t a bug—it’s a deliberate design choice in most implementations. The POSIX specification explicitly allows spurious wakeups because forbidding them would require expensive synchronization that penalizes the common case. Some hardware architectures make spurious wakeups unavoidable.
// WRONG - will break on spurious wakeup
Task pop_broken() {
    std::unique_lock<std::mutex> lock(mutex_);
    if (queue_.empty()) {  // BUG: should be while
        not_empty_.wait(lock);
    }
    // Spurious wakeup: queue might still be empty!
    Task t = std::move(queue_.front());  // Undefined behavior
    queue_.pop();
    return t;
}
C++ provides a convenient overload that handles the loop for you:
Task pop() {
    std::unique_lock<std::mutex> lock(mutex_);
    not_empty_.wait(lock, [this] { return !queue_.empty(); });
    Task t = std::move(queue_.front());
    queue_.pop();
    return t;
}
This predicate form is equivalent to the while loop but more concise and harder to get wrong. Use it whenever possible.
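The predicate overloads also extend naturally to timed waits via wait_for, which returns the predicate's final value (false means the timeout expired with the condition still unmet). Here is a sketch of a queue whose pop gives up after a deadline; the TimedQueue and pop_for names are illustrative, not a standard API:

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>

// Sketch: a pop that gives up after a timeout instead of blocking forever.
template <typename T>
class TimedQueue {
    std::queue<T> queue_;
    std::mutex mutex_;
    std::condition_variable not_empty_;

public:
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(item));
        }
        not_empty_.notify_one();
    }

    // Returns nullopt if nothing arrives within the timeout.
    std::optional<T> pop_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        // wait_for returns the predicate's value: false means we timed out
        if (!not_empty_.wait_for(lock, timeout,
                                 [this] { return !queue_.empty(); })) {
            return std::nullopt;
        }
        T item = std::move(queue_.front());
        queue_.pop();
        return item;
    }
};
```

The same predicate-plus-loop discipline applies: wait_for rechecks the predicate on every wakeup, spurious or otherwise, so the timeout path is the only way to return without the condition holding.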
Classic Pattern: Producer-Consumer Queue
A bounded queue needs two condition variables: one for “not empty” (consumers wait on this) and one for “not full” (producers wait on this). Here’s a complete implementation:
#include <condition_variable>
#include <mutex>
#include <queue>
#include <optional>

template <typename T>
class BoundedQueue {
    std::queue<T> queue_;
    const std::size_t capacity_;
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::condition_variable not_full_;
    bool shutdown_ = false;

public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    // Returns false if queue is shut down
    bool push(T item) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] {
            return queue_.size() < capacity_ || shutdown_;
        });
        if (shutdown_) return false;
        queue_.push(std::move(item));
        not_empty_.notify_one();
        return true;
    }

    // Returns nullopt if queue is empty and shut down
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] {
            return !queue_.empty() || shutdown_;
        });
        if (queue_.empty()) return std::nullopt;  // Shutdown, no more items
        T item = std::move(queue_.front());
        queue_.pop();
        not_full_.notify_one();
        return item;
    }

    void shutdown() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            shutdown_ = true;
        }
        // Wake ALL waiting threads so they can observe shutdown
        not_empty_.notify_all();
        not_full_.notify_all();
    }
};
Usage with multiple producers and consumers:
#include <thread>
#include <vector>
#include <iostream>

int main() {
    BoundedQueue<int> queue(10);
    std::vector<std::thread> threads;

    // Start 3 producers
    for (int p = 0; p < 3; ++p) {
        threads.emplace_back([&queue, p] {
            for (int i = 0; i < 100; ++i) {
                queue.push(p * 1000 + i);
            }
        });
    }

    // Start 2 consumers
    for (int c = 0; c < 2; ++c) {
        threads.emplace_back([&queue, c] {
            while (auto item = queue.pop()) {
                std::cout << "Consumer " << c << " got " << *item << "\n";
            }
        });
    }

    // Wait for producers to finish
    for (int i = 0; i < 3; ++i) {
        threads[i].join();
    }

    // Signal consumers to exit
    queue.shutdown();

    // Wait for consumers
    for (int i = 3; i < 5; ++i) {
        threads[i].join();
    }
}
Common Pitfalls and Best Practices
Lost wakeups occur when you signal before making the condition true. Notifications are not queued: if no thread is waiting at the instant you call notify, the signal simply vanishes, and a thread that starts waiting a moment later will sleep indefinitely. Always modify the shared state before signaling, and always check the state in the wait predicate:
// WRONG ORDER - potential lost wakeup
void broken_push(T item) {
    not_empty_.notify_one();  // Signal sent...
    std::lock_guard<std::mutex> lock(mutex_);
    queue_.push(std::move(item));  // ...but state changes after
}
Holding locks during notify is legal but often suboptimal. The awakened thread immediately tries to acquire the mutex, so if you’re still holding it, the thread wakes only to block again:
// Suboptimal - notified thread blocks on mutex
void push(T item) {
    std::lock_guard<std::mutex> lock(mutex_);
    queue_.push(std::move(item));
    not_empty_.notify_one();  // Thread wakes, then blocks
}

// Better - notified thread can proceed immediately
void push(T item) {
    {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(std::move(item));
    }
    not_empty_.notify_one();  // Thread wakes and acquires lock
}
Choosing signal vs. broadcast: Use notify_one when only one thread can make progress (one item added to queue). Use notify_all when multiple threads might be able to proceed or when the condition affects all waiters (shutdown, configuration change). When in doubt, notify_all is always correct—just potentially less efficient.
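One scenario from the list at the start where notify_all is clearly right is waiting for an initialization phase to complete: once the flag flips, every waiter can make progress. A sketch under that assumption (the InitGate name is illustrative):

```cpp
#include <condition_variable>
#include <mutex>

// Sketch: a one-shot gate where notify_all is the right choice,
// because every waiter can proceed once initialization completes.
class InitGate {
    std::mutex mutex_;
    std::condition_variable ready_cv_;
    bool ready_ = false;

public:
    void wait_until_ready() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_cv_.wait(lock, [this] { return ready_; });
    }

    void mark_ready() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ready_ = true;
        }
        ready_cv_.notify_all();  // wake every waiter; all can make progress
    }

    bool is_ready() {
        std::lock_guard<std::mutex> lock(mutex_);
        return ready_;
    }
};
```

Using notify_one here would be a bug: only one of the waiting threads would wake, and the rest would sleep until some unrelated spurious wakeup rescued them.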
Consider alternatives for specific patterns. Semaphores express counting constraints more directly. Futures and promises are cleaner for one-shot results. Channels (like Go’s) combine the queue and synchronization into one abstraction. But condition variables remain the fundamental building block underneath all of these.
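To make the "building block" claim concrete, here is a sketch of a counting semaphore implemented with nothing but a mutex and a condition variable. The Semaphore class and try_acquire helper are illustrative, not a standard API; since C++20 the standard library provides std::counting_semaphore for real use.

```cpp
#include <condition_variable>
#include <mutex>

// Sketch: a counting semaphore built on a condition variable, showing
// how CVs underlie higher-level primitives. Illustrative, not std API.
class Semaphore {
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_;

public:
    explicit Semaphore(int initial) : count_(initial) {}

    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();  // exactly one waiter can consume the new count
    }

    // Non-blocking variant: returns false instead of waiting.
    bool try_acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (count_ == 0) return false;
        --count_;
        return true;
    }
};
```

Note that the pattern is identical to the queues above: a predicate over shared state (`count_ > 0`), state changed before notifying, and notify_one because exactly one waiter can proceed per release.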
Conclusion
Condition variables are the primitive that makes efficient thread coordination possible. They transform wasteful spinning into efficient sleeping, letting the OS scheduler do what it does best. Master the pattern—mutex, predicate, while loop—and you can build any synchronization structure you need.
Use condition variables when threads need to wait for application-specific conditions: queues becoming non-empty, resources becoming available, or phases completing. Pair them with mutexes that protect the shared state being tested. Always use predicate loops to handle spurious wakeups. And remember that higher-level abstractions like thread pools, futures, and channels are built on exactly these primitives.