Iterator Pattern in Python: __iter__ and __next__
Key Insights
- The iterator pattern decouples traversal logic from data structures, enabling memory-efficient processing of sequences that may be infinite or too large to fit in memory.
- Python’s iterator protocol requires just two methods: `__iter__()` returns the iterator object, and `__next__()` returns successive values until raising `StopIteration`.
- Generators provide a cleaner syntax for most iterator use cases, but class-based iterators remain essential when you need complex state management or reusable iteration logic.
Introduction to the Iterator Pattern
The iterator pattern is one of the most frequently used behavioral design patterns, yet many Python developers use it daily without recognizing it. Every `for` loop, every list comprehension, and every call to `map()` or `filter()` relies on this pattern under the hood.
At its core, the iterator pattern provides a standard way to traverse a collection without exposing its underlying structure. This abstraction matters because it decouples the “how” of traversal from the “what” of the data. Your code doesn’t need to know whether it’s iterating over a list, a database cursor, a file, or an infinite mathematical sequence—the interface remains identical.
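To see that decoupling in action, here is a small sketch (the `total` helper is our own illustrative name, not a library function): the same loop body consumes a list, a generator expression, and a range without knowing which one it received.

```python
def total(iterable):
    """Sums numeric values from any iterable source."""
    result = 0
    for value in iterable:
        result += value
    return result

# The traversal interface is identical for every source:
print(total([1, 2, 3]))                # 6
print(total(x * x for x in range(4)))  # 0 + 1 + 4 + 9 = 14
print(total(range(10)))                # 45
```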
Consider what happens when you write a simple for loop:
```python
numbers = [1, 2, 3, 4, 5]
for num in numbers:
    print(num)
```
Python translates this into something closer to:
```python
numbers = [1, 2, 3, 4, 5]
iterator = iter(numbers)      # Calls numbers.__iter__()
while True:
    try:
        num = next(iterator)  # Calls iterator.__next__()
        print(num)
    except StopIteration:
        break
```
This implicit protocol enables Python’s elegant iteration syntax while providing hooks for custom behavior.
Python’s Iterator Protocol
Python’s iterator protocol consists of two methods that any object can implement to become iterable.
The `__iter__()` method returns an iterator object. For iterables (like lists), this returns a separate iterator instance. For iterators themselves, this typically returns `self`.
The `__next__()` method returns the next value in the sequence. When no more values exist, it raises `StopIteration` to signal completion.
Understanding the distinction between iterables and iterators is crucial:
- Iterable: An object with an `__iter__()` method that returns an iterator. Lists, tuples, and dictionaries are iterables.
- Iterator: An object with both `__iter__()` (returning itself) and `__next__()` methods. Iterators maintain traversal state.
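A quick experiment with the built-in `iter()` makes the distinction concrete: asking an iterator for an iterator returns the same object, while asking a list returns a fresh, independent one.

```python
numbers = [1, 2, 3]

it = iter(numbers)               # ask the iterable for an iterator
print(iter(it) is it)            # True: an iterator's __iter__ returns itself
print(iter(numbers) is numbers)  # False: a list hands back a new iterator

# Two independent iterators over the same list keep separate state.
a, b = iter(numbers), iter(numbers)
print(next(a), next(a), next(b))  # 1 2 1
```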
Here’s a custom iterator that counts down from a starting number:
```python
class CountDown:
    def __init__(self, start: int):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self) -> int:
        if self.current < 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

# Usage
for num in CountDown(5):
    print(num)  # Prints: 5, 4, 3, 2, 1, 0
```
The `StopIteration` exception isn’t an error—it’s the protocol’s signaling mechanism. Python’s `for` loop catches this exception automatically and exits cleanly.
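You can drive the protocol by hand with the built-in `next()` to watch this signaling happen:

```python
it = iter([10, 20])
print(next(it))  # 10
print(next(it))  # 20
try:
    next(it)
except StopIteration:
    print("exhausted")  # a for loop catches this and exits cleanly

# next() also accepts a default to return instead of raising
print(next(it, "done"))  # done
```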
Building Custom Iterators
Real-world iterators often wrap external resources or implement complex traversal logic. Let’s build a practical example: an iterator that fetches paginated data from an API.
```python
from typing import Iterator, Any
import requests

class PaginatedAPIIterator:
    """Iterates over paginated API responses, fetching pages on demand."""

    def __init__(self, base_url: str, page_size: int = 100):
        self.base_url = base_url
        self.page_size = page_size
        self.current_page = 0
        self.current_items: list[Any] = []
        self.item_index = 0
        self.exhausted = False

    def __iter__(self):
        return self

    def __next__(self) -> Any:
        # If we've consumed all items in the current page, fetch the next page
        if self.item_index >= len(self.current_items):
            if self.exhausted:
                raise StopIteration
            self._fetch_next_page()
            if not self.current_items:
                raise StopIteration
        item = self.current_items[self.item_index]
        self.item_index += 1
        return item

    def _fetch_next_page(self) -> None:
        response = requests.get(
            self.base_url,
            params={"page": self.current_page, "limit": self.page_size}
        )
        response.raise_for_status()
        data = response.json()
        self.current_items = data.get("items", [])
        self.item_index = 0
        self.current_page += 1
        # Mark exhausted if we got fewer items than requested
        if len(self.current_items) < self.page_size:
            self.exhausted = True

# Usage
for user in PaginatedAPIIterator("https://api.example.com/users"):
    process_user(user)
```
This iterator manages multiple pieces of state: the current page, position within the page, and whether more pages exist. The calling code remains blissfully unaware of pagination—it simply iterates over users.
Generators as Iterator Shorthand
While class-based iterators offer explicit control, Python’s generators provide a more concise syntax for most use cases. A generator function uses `yield` to produce values and automatically implements the iterator protocol.
Here’s the countdown example rewritten as a generator:
```python
def countdown(start: int) -> Iterator[int]:
    current = start
    while current >= 0:
        yield current
        current -= 1

for num in countdown(5):
    print(num)  # Prints: 5, 4, 3, 2, 1, 0
```
The generator version is significantly shorter. When Python encounters `yield`, it suspends the function’s execution, saves its state, and returns the yielded value. The next call to `__next__()` resumes execution from where it left off.
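A tiny experiment makes the suspension visible (the `gen_demo` name is purely illustrative): calling a generator function runs none of its body; each `next()` executes only up to the following `yield`.

```python
def gen_demo():
    print("body started")  # runs only on the first next() call
    yield 1
    print("resumed")       # runs when iteration continues
    yield 2

g = gen_demo()   # no output yet: the body has not been entered
first = next(g)  # prints "body started", then yields 1
second = next(g) # prints "resumed", then yields 2
```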
The paginated API iterator becomes cleaner as a generator:
```python
def paginated_api_iterator(base_url: str, page_size: int = 100) -> Iterator[Any]:
    page = 0
    while True:
        response = requests.get(
            base_url,
            params={"page": page, "limit": page_size}
        )
        response.raise_for_status()
        items = response.json().get("items", [])
        if not items:
            return
        yield from items  # Yields each item individually
        if len(items) < page_size:
            return
        page += 1
```
The `yield from` statement delegates to another iterable, yielding each of its items. This eliminates the nested loop you’d otherwise need.
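A minimal illustration of the delegation:

```python
def chunks():
    yield from [1, 2]                      # delegates to a list
    yield from (x * 10 for x in range(2))  # ...or to any other iterable

print(list(chunks()))  # [1, 2, 0, 10]
```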
Choose class-based iterators when you need methods beyond `__iter__` and `__next__`, when the iterator must be reusable, or when you’re implementing a protocol that other code will subclass. Use generators for everything else.
Advanced Patterns and Composition
Iterators shine when composed into data processing pipelines. Each stage transforms data lazily, processing one item at a time without materializing intermediate collections.
```python
from typing import Iterator, Callable, TypeVar

T = TypeVar('T')
U = TypeVar('U')

def filter_iter(predicate: Callable[[T], bool], iterable: Iterator[T]) -> Iterator[T]:
    for item in iterable:
        if predicate(item):
            yield item

def map_iter(transform: Callable[[T], U], iterable: Iterator[T]) -> Iterator[U]:
    for item in iterable:
        yield transform(item)

def take(n: int, iterable: Iterator[T]) -> Iterator[T]:
    for i, item in enumerate(iterable):
        if i >= n:
            return
        yield item

# Composable pipeline processing an infinite sequence
def fibonacci() -> Iterator[int]:
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Get the first 10 even Fibonacci numbers, squared
pipeline = take(
    10,
    map_iter(
        lambda x: x ** 2,
        filter_iter(
            lambda x: x % 2 == 0,
            fibonacci()
        )
    )
)

print(list(pipeline))
# [0, 4, 64, 1156, 20736, 372100, 6677056, 119814916, 2149991424, 38580030724]
```
This pipeline processes an infinite sequence without running forever because `take()` limits the output. Each value flows through the entire pipeline before the next is generated—no intermediate lists are created.
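For production code, note that the standard library already covers these helpers: `itertools.islice` plays the role of `take()`, and a generator expression covers the map and filter stages. A roughly equivalent pipeline might look like:

```python
from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Generator expression filters and transforms lazily; islice truncates.
evens_squared = (x ** 2 for x in fibonacci() if x % 2 == 0)
print(list(islice(evens_squared, 5)))  # [0, 4, 64, 1156, 20736]
```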
Common Pitfalls and Best Practices
Iterator Exhaustion: Iterators are single-use by default. Once exhausted, they’re empty forever.
```python
numbers = (x for x in range(5))  # Generator expression
print(list(numbers))  # [0, 1, 2, 3, 4]
print(list(numbers))  # [] - exhausted!
```
If you need multiple passes, either convert to a list first or create an iterable class that returns fresh iterators:
```python
class ReusableRange:
    def __init__(self, stop: int):
        self.stop = stop

    def __iter__(self) -> Iterator[int]:
        # Returns a NEW iterator each time
        return iter(range(self.stop))

numbers = ReusableRange(5)
print(list(numbers))  # [0, 1, 2, 3, 4]
print(list(numbers))  # [0, 1, 2, 3, 4] - works again!
```
Thread Safety: Iterators with mutable state aren’t thread-safe by default. If multiple threads share an iterator, use locks or create separate iterators per thread.
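One possible lock-based approach is a small wrapper that serializes `__next__` calls (a sketch; `LockedIterator` is our own name, not a standard class):

```python
import threading

class LockedIterator:
    """Wraps an iterator so __next__ calls are serialized by a lock."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        with self._lock:  # only one thread may advance at a time
            return next(self._it)

results = []
shared = LockedIterator(range(100))

def worker():
    for value in shared:
        results.append(value)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(results))  # 100 -- every value consumed exactly once
```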
When to Use Iterators vs. Lists: Use iterators when processing large datasets that won’t fit in memory, when you might not need all values, or when values are expensive to compute. Use lists when you need random access, multiple passes, or the length upfront.
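The memory difference is easy to measure with `sys.getsizeof` (exact sizes vary by interpreter version, so the comparison is relative, not absolute):

```python
import sys

as_list = [x * x for x in range(1_000_000)]  # materializes every element
as_gen = (x * x for x in range(1_000_000))   # stores only iteration state

print(sys.getsizeof(as_list))  # megabytes: one pointer per element
print(sys.getsizeof(as_gen))   # a few hundred bytes, regardless of length
```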
Conclusion
The iterator pattern provides a powerful abstraction for sequential data access. Python’s protocol—`__iter__()` and `__next__()`—is minimal yet flexible enough to handle everything from simple ranges to paginated API calls to infinite mathematical sequences.
Implement custom iterators when you need lazy evaluation of expensive computations, memory-efficient processing of large datasets, clean abstraction over complex data sources, or composable data transformation pipelines.
Start with generators for their simplicity. Graduate to class-based iterators when you need reusability, complex state, or additional methods. Either way, you’re leveraging one of Python’s most elegant architectural patterns.