Coroutines: Cooperative Multitasking Primitives

Key Insights

  • Coroutines are functions that can suspend and resume execution at explicit yield points, enabling cooperative multitasking without the complexity of threads or the callback nesting of event-driven code.
  • The stackful vs stackless distinction fundamentally shapes what coroutines can do: stackful coroutines can yield from nested calls, while stackless variants compile to state machines with lower memory overhead.
  • Modern async/await syntax is syntactic sugar over coroutines—understanding the underlying mechanics helps you debug mysterious hangs, avoid blocking the event loop, and choose the right concurrency primitive for your problem.

What Are Coroutines?

Coroutines are functions that can pause their execution and later resume from where they left off. Unlike regular subroutines that run to completion once called, coroutines maintain their state across suspensions, allowing them to yield control back to their caller and pick up exactly where they stopped.

The concept dates back to 1963 when Melvin Conway coined the term while working on COBOL compilers. Simula 67 formalized coroutines as a language feature, and they’ve since appeared in Lua, Python, JavaScript, Kotlin, and virtually every modern language tackling concurrent programming.

Think of a coroutine as a checkpoint-enabled function. When it hits a yield point, it saves its local variables, instruction pointer, and stack frame, then returns control. When resumed, it restores that state and continues. This simple mechanism enables powerful patterns: generators, async I/O, cooperative schedulers, and data pipelines.
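The checkpoint behavior is easy to see with a plain generator; this is a minimal sketch (the name `checkpointed` is illustrative), showing that both the local variable and the resume position survive each suspension:

```python
def checkpointed():
    """Each yield is a checkpoint: locals and position survive suspension."""
    count = 0
    while True:
        count += 1
        received = yield count   # Suspend here; resume continues on this line
        print(f"resumed with {received!r}, count was {count}")

co = checkpointed()
print(next(co))        # 1 - runs to the first yield
print(co.send("hi"))   # prints the resume message, then 2
```

Note that `count` keeps incrementing across suspensions without any explicit save/restore code: the generator object holds the frame for you.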

Cooperative vs Preemptive Multitasking

The critical distinction between coroutines and threads lies in who decides when to switch contexts.

Preemptive multitasking (threads) lets the operating system interrupt your code at any moment. You might be halfway through updating a data structure when the scheduler yanks control away. This necessitates locks, mutexes, and careful synchronization—the source of most concurrency bugs.

Cooperative multitasking (coroutines) requires explicit yield points. Your code runs uninterrupted until it voluntarily gives up control. No locks needed for single-threaded coroutine execution because you know exactly where context switches can occur.

def countdown(name, n):
    """A generator demonstrating explicit yield points."""
    while n > 0:
        print(f"{name}: {n}")
        yield  # Explicit suspension point - control returns to caller
        n -= 1
    print(f"{name}: Done!")

# Create two coroutines
counter_a = countdown("A", 3)
counter_b = countdown("B", 3)

# Manually interleave execution
for _ in range(3):
    next(counter_a)  # Run A until yield
    next(counter_b)  # Run B until yield

Output:

A: 3
B: 3
A: 2
B: 2
A: 1
B: 1

The trade-off is clear: cooperative scheduling is simpler and more predictable, but a misbehaving coroutine that never yields will block everything. This makes coroutines ideal for I/O-bound workloads where natural suspension points exist (waiting for network, disk, user input) and problematic for CPU-bound work that needs preemption.
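The manual interleaving above generalizes to a scheduler. Here is a minimal round-robin sketch (`round_robin` and `worker` are illustrative names, not a standard API); a worker that never yielded would starve the others, which is exactly the hazard just described:

```python
from collections import deque

def worker(name, steps, log):
    for _ in range(steps):
        log.append(name)
        yield                      # Cooperative suspension point

def round_robin(tasks):
    """Run generator-based tasks until all complete."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)             # Run until the task yields...
        except StopIteration:
            continue               # ...or finishes and is dropped
        queue.append(task)         # Re-queue behind the others

log = []
round_robin([worker("A", 2, log), worker("B", 2, log)])
print(log)  # ['A', 'B', 'A', 'B']
```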

Coroutine Mechanics: Stackful vs Stackless

Not all coroutines are created equal. The stackful vs stackless distinction affects what you can do and how much memory you’ll use.

Stackful coroutines (also called fibers or green threads) maintain their own call stack. They can yield from anywhere in the call hierarchy—even from deeply nested function calls. Lua and Go use this approach.

Stackless coroutines compile down to state machines. They can only yield from the coroutine function itself, not from functions it calls. Python generators and JavaScript async functions work this way.

-- Lua: Stackful coroutines can yield from nested calls
function inner()
    print("Inner: about to yield")
    coroutine.yield("from inner")  -- Yield from nested function!
    print("Inner: resumed")
end

function outer()
    print("Outer: calling inner")
    inner()
    print("Outer: inner returned")
end

co = coroutine.create(outer)
print(coroutine.resume(co))  -- Runs until yield in inner()
print(coroutine.resume(co))  -- Resumes inner(), completes

Compare with Python’s stackless approach:

def inner():
    # Can't yield here - inner() is a regular function
    print("Inner: running")
    return "from inner"

def outer():
    print("Outer: calling inner")
    result = inner()
    yield result  # Only outer() can yield
    print("Outer: resumed")

gen = outer()
print(next(gen))  # Prints inner's output, then "from inner"
try:
    next(gen)     # Prints "Outer: resumed", then raises StopIteration
except StopIteration:
    pass

Stackful coroutines are more flexible but consume more memory (typically 2KB-1MB per coroutine stack). Stackless coroutines use only the memory needed for their state variables, enabling millions of concurrent coroutines on modest hardware.
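Python does offer a partial workaround for the stackless limitation: `yield from` (PEP 380) delegates to a sub-generator, forwarding its yields transparently. The catch is that every frame in the chain must itself be a generator and opt in with `yield from`:

```python
def inner():
    yield "from inner"        # Legal: inner is itself a generator

def outer():
    print("Outer: delegating to inner")
    yield from inner()        # Forwards inner's yields to outer's caller
    print("Outer: inner exhausted")

gen = outer()
print(next(gen))  # "from inner"
```

This recovers nested yielding at the cost of annotating the whole call chain, whereas a stackful coroutine gets it for free.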

Building Blocks: Yield, Resume, and State

Coroutines communicate through three fundamental operations:

  1. Yield: Suspend execution and optionally return a value
  2. Resume: Continue a suspended coroutine, optionally passing in a value
  3. State preservation: Local variables persist across suspensions

Python’s send() method demonstrates bidirectional communication:

def accumulator():
    """Coroutine that accumulates values and yields running total."""
    total = 0
    while True:
        value = yield total  # Yield current total, receive next value
        if value is None:
            break
        total += value

# Create and prime the coroutine
acc = accumulator()
next(acc)  # Advance to first yield (priming)

# Send values and receive running totals
print(acc.send(10))  # 10
print(acc.send(20))  # 30
print(acc.send(15))  # 45

Symmetric vs asymmetric coroutines differ in control flow. Asymmetric coroutines (like Python generators) always yield back to their caller. Symmetric coroutines can transfer control directly to any other coroutine—more powerful but harder to reason about. Most modern implementations choose asymmetric coroutines for simplicity.
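To make the contrast concrete, symmetric transfer can be simulated on top of Python's asymmetric generators with a trampoline; this is a hedged sketch (`trampoline`, `ping`, and `pong` are illustrative), where each yield names the coroutine to hand control to:

```python
log = []

def ping(n):
    for i in range(n):
        log.append(f"ping {i}")
        yield "pong"   # "Transfer" by naming the coroutine to run next

def pong(n):
    for i in range(n):
        log.append(f"pong {i}")
        yield "ping"

def trampoline(coros):
    """Simulate symmetric transfer: each yield names the next coroutine."""
    current = next(iter(coros))
    while coros:
        try:
            target = coros[current].send(None)
        except StopIteration:
            del coros[current]               # This coroutine finished
            if coros:
                current = next(iter(coros))
        else:
            if target in coros:
                current = target             # Direct hand-off to the target

trampoline({"ping": ping(2), "pong": pong(2)})
print(log)  # ['ping 0', 'pong 0', 'ping 1', 'pong 1']
```

The trampoline is the asymmetric caller that every yield actually returns to; the "direct" transfer is an illusion it maintains, which is why asymmetric designs are easier to reason about.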

Practical Patterns: Async I/O and Pipelines

Coroutines shine in two scenarios: non-blocking I/O and data pipelines.

Async I/O Without Callback Hell

Before async/await, handling multiple concurrent network requests meant nested callbacks or complex promise chains. Coroutines let you write sequential-looking code that executes concurrently:

import asyncio
import aiohttp

async def fetch_url(session, url):
    """Fetch a URL and return its length."""
    async with session.get(url) as response:
        content = await response.text()  # Suspend while waiting for I/O
        return len(content)

async def fetch_all(urls):
    """Fetch multiple URLs concurrently."""
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        results = await asyncio.gather(*tasks)  # Run all concurrently
        return dict(zip(urls, results))

# Usage
urls = [
    "https://httpbin.org/get",
    "https://httpbin.org/ip",
    "https://httpbin.org/headers",
]
results = asyncio.run(fetch_all(urls))
for url, length in results.items():
    print(f"{url}: {length} bytes")

While one request waits for the server, others execute. No threads, no locks, no callback pyramids.
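The concurrency win is measurable even without a network; as a self-contained sketch, `asyncio.sleep` stands in for the I/O wait (the helper names are illustrative):

```python
import asyncio
import time

async def fake_io(delay):
    # Stand-in for a network call: suspends without blocking the loop
    await asyncio.sleep(delay)
    return delay

async def timed_gather():
    start = time.perf_counter()
    results = await asyncio.gather(fake_io(0.2), fake_io(0.2), fake_io(0.2))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(timed_gather())
print(f"{len(results)} waits of 0.2s finished in {elapsed:.2f}s")  # ~0.2s, not 0.6s
```

Three 0.2-second waits complete in roughly 0.2 seconds total because all three coroutines are suspended on the same event loop at once.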

Data Pipelines

Coroutines create elegant producer-consumer pipelines, similar to Unix pipes:

def read_lines(filename):
    """Producer: yield lines from file."""
    with open(filename) as f:
        for line in f:
            yield line.strip()

def filter_nonempty(lines):
    """Filter: pass through non-empty lines."""
    for line in lines:
        if line:
            yield line

def parse_csv(lines):
    """Transform: split CSV lines into tuples."""
    for line in lines:
        yield tuple(line.split(','))

def filter_by_column(rows, column, value):
    """Filter: keep rows where column matches value."""
    for row in rows:
        if len(row) > column and row[column] == value:
            yield row

# Compose pipeline: file -> filter empty -> parse -> filter by status
pipeline = filter_by_column(
    parse_csv(
        filter_nonempty(
            read_lines("data.csv")
        )
    ),
    column=2,
    value="active"
)

# Process lazily - only reads what's needed
for row in pipeline:
    print(row)

Each stage processes one item at a time, keeping memory usage constant regardless of file size.

Coroutines in Modern Languages

Different languages expose coroutines with varying levels of abstraction:

# Python: async/await syntax
import asyncio

async def greet(name, delay):
    await asyncio.sleep(delay)
    return f"Hello, {name}!"

async def main():
    results = await asyncio.gather(
        greet("Alice", 1),
        greet("Bob", 0.5),
    )
    print(results)

asyncio.run(main())

// Kotlin: Structured concurrency with coroutine scopes
import kotlinx.coroutines.*

suspend fun greet(name: String, delayMs: Long): String {
    delay(delayMs)
    return "Hello, $name!"
}

fun main() = runBlocking {
    val results = listOf(
        async { greet("Alice", 1000) },
        async { greet("Bob", 500) },
    ).awaitAll()
    println(results)
}

// JavaScript: Promises + async/await
async function greet(name, delay) {
    await new Promise(resolve => setTimeout(resolve, delay));
    return `Hello, ${name}!`;
}

async function main() {
    const results = await Promise.all([
        greet("Alice", 1000),
        greet("Bob", 500),
    ]);
    console.log(results);
}

main();

Go’s goroutines are a hybrid: they’re stackful coroutines multiplexed onto OS threads by the Go runtime, giving you the programming model of coroutines with the preemption benefits of threads.

Pitfalls and Best Practices

Forgetting to await: The most common mistake. Calling an async function without await returns a coroutine object that never executes:

async def fetch_data():
    return "data"

async def broken():
    result = fetch_data()  # Bug! Returns coroutine, doesn't execute
    print(result)  # <coroutine object fetch_data at 0x...>

async def fixed():
    result = await fetch_data()  # Correct
    print(result)  # "data"

Blocking the event loop: Calling synchronous I/O or CPU-intensive code inside a coroutine blocks all other coroutines:

import time
import asyncio

async def bad():
    time.sleep(5)  # Blocks everything!

async def good():
    await asyncio.sleep(5)  # Yields control while waiting

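When you can't avoid a blocking call (a legacy driver, a synchronous file API), `asyncio.to_thread` (available since Python 3.9) moves it off the event loop; a minimal sketch, with `legacy_blocking_call` as a hypothetical stand-in:

```python
import asyncio
import time

def legacy_blocking_call():
    time.sleep(0.2)  # e.g. an old API that cannot await
    return "done"

async def main():
    # to_thread runs the blocking call in a worker thread, so the
    # event loop keeps scheduling other coroutines in the meantime
    result = await asyncio.to_thread(legacy_blocking_call)
    return result

result = asyncio.run(main())
print(result)  # done
```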
Coroutine leaks: Creating coroutines without awaiting them leaks resources. Use structured concurrency patterns (like Python’s TaskGroup or Kotlin’s coroutineScope) to ensure all coroutines complete.

When to choose what:

  • Coroutines: I/O-bound work, many concurrent connections, event-driven systems
  • Threads: CPU-bound work with shared memory, legacy blocking APIs
  • Processes: CPU-bound work needing true parallelism, isolation requirements

Coroutines aren’t a replacement for threads—they’re a different tool. Use them where cooperative scheduling’s predictability and efficiency outweigh its limitations.
