Python - Iterators vs Iterables

Key Insights

  • Iterables are objects that implement __iter__() and return an iterator, while iterators implement both __iter__() and __next__() to track iteration state
  • Every iterator is an iterable, but not every iterable is an iterator—this distinction matters for memory efficiency and reusability
  • Understanding the iterator protocol prevents common bugs like iterator exhaustion and enables building custom iteration patterns for domain-specific data structures

The Iterator Protocol Explained

Python’s iteration mechanism relies on two magic methods: __iter__() and __next__(). An iterable is any object that implements __iter__(), which returns an iterator. An iterator is an object that implements both __iter__() (returning itself) and __next__() (returning the next value or raising StopIteration).

# Iterable example: list
numbers = [1, 2, 3]
print(hasattr(numbers, '__iter__'))  # True
print(hasattr(numbers, '__next__'))  # False

# Get an iterator from the iterable
iterator = iter(numbers)
print(hasattr(iterator, '__iter__'))  # True
print(hasattr(iterator, '__next__'))  # True

# Manual iteration
print(next(iterator))  # 1
print(next(iterator))  # 2
print(next(iterator))  # 3
# next(iterator)  # Raises StopIteration
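The same checks can be written with the abstract base classes in `collections.abc`, which is more idiomatic than probing dunder attributes directly:

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
print(isinstance(numbers, Iterable))   # True
print(isinstance(numbers, Iterator))   # False

iterator = iter(numbers)
print(isinstance(iterator, Iterable))  # True - every iterator is also iterable
print(isinstance(iterator, Iterator))  # True
```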

When you use a for loop, Python calls iter() on the object to get an iterator, then repeatedly calls next() until StopIteration is raised:

# What happens behind the scenes
numbers = [1, 2, 3]
iterator = iter(numbers)
while True:
    try:
        item = next(iterator)
        print(item)
    except StopIteration:
        break

Iterator Exhaustion: A Critical Difference

Iterators maintain state and can only be traversed once. Iterables can be iterated multiple times because each call to iter() returns a fresh iterator.

# Iterable: can iterate multiple times
numbers_list = [1, 2, 3]
print(sum(numbers_list))  # 6
print(sum(numbers_list))  # 6 - works fine

# Iterator: single-use
numbers_iter = iter([1, 2, 3])
print(sum(numbers_iter))  # 6
print(sum(numbers_iter))  # 0 - exhausted!

# File objects are iterators
with open('data.txt', 'w') as f:
    f.write('line1\nline2\nline3')

with open('data.txt') as f:
    lines1 = list(f)
    lines2 = list(f)  # Empty! Iterator exhausted
    print(len(lines1))  # 3
    print(len(lines2))  # 0
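If you need to traverse a file twice, you can rewind the underlying stream with `seek(0)` instead of reopening it. The sketch below uses `io.StringIO` as an in-memory stand-in for a real text file:

```python
import io

# io.StringIO behaves like a text file opened for reading
f = io.StringIO('line1\nline2\nline3\n')
lines1 = list(f)   # Consumes the stream
f.seek(0)          # Rewind to the start
lines2 = list(f)   # Works again after rewinding
print(len(lines1), len(lines2))  # 3 3
```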

This behavior catches developers off-guard when passing iterators to multiple functions:

def process_data(data):
    total = sum(data)
    count = len(list(data))  # Problem if data is an iterator
    return total / count

# Works with iterables
result = process_data([1, 2, 3, 4])  # 2.5

# Fails with iterators
iterator = iter([1, 2, 3, 4])
result = process_data(iterator)  # ZeroDivisionError: sum() already consumed it

Building Custom Iterators

Custom iterators enable lazy evaluation and memory-efficient data processing. Implement __iter__() and __next__() to create your own:

class Countdown:
    def __init__(self, start):
        self.current = start
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

counter = Countdown(3)
for num in counter:
    print(num)  # 3, 2, 1

# Iterator is exhausted
for num in counter:
    print(num)  # Nothing prints

For reusable iteration, separate the iterable and iterator classes:

class CountdownIterator:
    def __init__(self, start):
        self.current = start
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

class Countdown:
    def __init__(self, start):
        self.start = start
    
    def __iter__(self):
        return CountdownIterator(self.start)

counter = Countdown(3)
print(list(counter))  # [3, 2, 1]
print(list(counter))  # [3, 2, 1] - works again!
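As a preview of the next section, a generator-based `__iter__()` removes the separate iterator class entirely: each call produces a fresh generator, so the object stays reusable.

```python
class Countdown:
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Each call creates a new generator, i.e. a fresh iterator
        current = self.start
        while current > 0:
            yield current
            current -= 1

counter = Countdown(3)
print(list(counter))  # [3, 2, 1]
print(list(counter))  # [3, 2, 1]
```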

Generator Functions: Iterators Made Simple

Generator functions provide syntactic sugar for creating iterators without boilerplate:

def countdown(start):
    while start > 0:
        yield start
        start -= 1

counter = countdown(3)
print(next(counter))  # 3
print(next(counter))  # 2
print(list(counter))  # [1]
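The object a generator function returns satisfies the full iterator protocol: calling `iter()` on it returns the generator itself, so it is single-use like any other iterator.

```python
def countdown(start):
    while start > 0:
        yield start
        start -= 1

gen = countdown(3)
print(iter(gen) is gen)  # True - the generator is its own iterator
print(list(gen))         # [3, 2, 1]
print(list(gen))         # [] - exhausted, like any iterator
```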

# Practical example: chunked file reading
def read_chunks(file_path, chunk_size=1024):
    with open(file_path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

for chunk in read_chunks('large_file.dat'):
    process(chunk)  # Memory-efficient processing

Generator expressions offer even more concise syntax for simple cases:

# Generator expression (iterator)
squares_gen = (x**2 for x in range(1000000))
print(next(squares_gen))  # 0
print(next(squares_gen))  # 1

# List comprehension (iterable, loads all into memory)
squares_list = [x**2 for x in range(1000000)]

Real-World Application: Database Result Sets

Understanding iterators is crucial when working with database cursors, which return iterators to avoid loading entire result sets into memory:

import sqlite3

conn = sqlite3.connect('database.db')
cursor = conn.cursor()
cursor.execute('SELECT * FROM large_table')

# Wrong: loads everything into memory
all_rows = cursor.fetchall()
for row in all_rows:
    process(row)

# Right: iterate lazily
cursor.execute('SELECT * FROM large_table')
for row in cursor:  # cursor is an iterator
    process(row)
    if should_stop():
        break  # Can stop early without fetching remaining rows

Custom iterator for batched database reads:

class BatchedCursor:
    def __init__(self, cursor, batch_size=1000):
        self.cursor = cursor
        self.batch_size = batch_size
    
    def __iter__(self):
        return self
    
    def __next__(self):
        rows = self.cursor.fetchmany(self.batch_size)
        if not rows:
            raise StopIteration
        return rows

cursor.execute('SELECT * FROM large_table')
batched = BatchedCursor(cursor, batch_size=500)

for batch in batched:
    process_batch(batch)  # Process 500 rows at a time
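A self-contained run of the same idea against an in-memory SQLite database (the table name, column, and row count here are illustrative, not from any real schema):

```python
import sqlite3

class BatchedCursor:
    def __init__(self, cursor, batch_size=1000):
        self.cursor = cursor
        self.batch_size = batch_size

    def __iter__(self):
        return self

    def __next__(self):
        rows = self.cursor.fetchmany(self.batch_size)
        if not rows:
            raise StopIteration
        return rows

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE items (n INTEGER)')
cur.executemany('INSERT INTO items VALUES (?)', [(i,) for i in range(7)])
cur.execute('SELECT n FROM items')

# 7 rows in batches of 3 -> batches of size 3, 3, 1
batch_sizes = [len(batch) for batch in BatchedCursor(cur, batch_size=3)]
print(batch_sizes)  # [3, 3, 1]
```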

Performance Implications

Iterators enable lazy evaluation, computing values on-demand rather than upfront:

import time

def slow_squares(n):
    """Return a fresh iterator over the first n squares, computed lazily"""
    class Iterator:
        def __init__(self):
            self.i = 0
        def __iter__(self):
            return self
        def __next__(self):
            if self.i >= n:
                raise StopIteration
            time.sleep(0.1)  # Simulate expensive computation
            result = self.i ** 2
            self.i += 1
            return result
    return Iterator()

# Only computes what's needed
gen = slow_squares(100)
first_five = []
for i, val in enumerate(gen):
    first_five.append(val)
    if i == 4:
        break  # Only 5 computations, not 100

print(first_five)  # [0, 1, 4, 9, 16]
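`itertools.islice` expresses the take-the-first-n pattern without a manual counter, and works even on infinite iterators:

```python
from itertools import islice

def squares():
    i = 0
    while True:  # Infinite generator - safe because islice stops early
        yield i ** 2
        i += 1

first_five = list(islice(squares(), 5))
print(first_five)  # [0, 1, 4, 9, 16]
```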

Memory comparison:

import sys

# List (iterable): stores all values
numbers_list = [i for i in range(1000000)]
print(sys.getsizeof(numbers_list))  # ~8MB

# Generator (iterator): stores only state
numbers_gen = (i for i in range(1000000))
print(sys.getsizeof(numbers_gen))  # ~200 bytes

Common Pitfalls and Solutions

Pitfall 1: Passing iterators to functions expecting iterables

def analyze(data):
    mean = sum(data) / len(list(data))  # Bug if data is iterator
    return mean

# Solution: convert to list or use itertools.tee
from itertools import tee

def analyze(data):
    data1, data2 = tee(data, 2)
    total = sum(data1)
    count = sum(1 for _ in data2)
    return total / count
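When the data fits in memory, materializing it once is simpler than `tee`:

```python
def analyze(data):
    data = list(data)  # Safe for both iterables and iterators; consumes at most once
    return sum(data) / len(data)

print(analyze(iter([1, 2, 3, 4])))  # 2.5
print(analyze([1, 2, 3, 4]))        # 2.5
```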

Pitfall 2: Modifying collections during iteration

# Wrong
numbers = [1, 2, 3, 4, 5]
for num in numbers:
    if num % 2 == 0:
        numbers.remove(num)  # Skips elements!

# Right: iterate over a copy
numbers = [1, 2, 3, 4, 5]
for num in numbers[:]:
    if num % 2 == 0:
        numbers.remove(num)
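Often the most idiomatic fix is to avoid in-place mutation entirely and build a new list with a comprehension:

```python
numbers = [1, 2, 3, 4, 5]
numbers = [n for n in numbers if n % 2 != 0]  # Keep only the odd values
print(numbers)  # [1, 3, 5]
```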

Pitfall 3: Not handling StopIteration in manual iteration

# Robust manual iteration
iterator = iter([1, 2, 3])
while True:
    try:
        value = next(iterator)
        process(value)
    except StopIteration:
        break

# Or use sentinel value
iterator = iter([1, 2, 3])
sentinel = object()
while (value := next(iterator, sentinel)) is not sentinel:
    process(value)
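`iter()` itself has a two-argument form that calls a function repeatedly until it returns the sentinel, which is handy for reading streams. The sketch below uses `io.StringIO` as an in-memory stand-in for a file:

```python
import io

# Two-argument iter(): call f.readline until it returns the sentinel ''
f = io.StringIO('a\nb\n')
lines = list(iter(f.readline, ''))
print(lines)  # ['a\n', 'b\n']
```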

The iterator protocol forms the foundation of Python’s iteration model. Mastering the distinction between iterators and iterables enables you to write memory-efficient code, build custom iteration patterns, and avoid subtle bugs in production systems.
