Python Protocols: Structural Subtyping Explained
Python has always embraced duck typing: 'If it walks like a duck and quacks like a duck, it's a duck.' This works beautifully at runtime but leaves static type checkers in the dark. Traditional...
Key Insights
- Protocols enable structural subtyping in Python, allowing type checkers to verify compatibility based on shape rather than inheritance, bridging the gap between duck typing at runtime and static type safety
- Unlike Abstract Base Classes that require explicit inheritance, Protocols work implicitly—any class with matching methods and attributes satisfies the Protocol without modification
- Use Protocols for library interfaces and flexible APIs where you don’t control the implementations; reserve ABCs for when you need explicit opt-in behavior or want to provide shared implementation logic
Introduction to Structural vs. Nominal Typing
Python has always embraced duck typing: “If it walks like a duck and quacks like a duck, it’s a duck.” This works beautifully at runtime but leaves static type checkers in the dark. Traditional inheritance provides nominal typing—a class is compatible only if it explicitly inherits from a base class. Protocols, introduced in PEP 544, bring structural subtyping to Python’s type system, checking compatibility based on structure rather than inheritance hierarchy.
Here’s the difference in practice:
# Nominal typing with inheritance
from abc import ABC, abstractmethod
class DrawableBase(ABC):
@abstractmethod
def draw(self) -> None:
pass
class Circle(DrawableBase): # Must explicitly inherit
def draw(self) -> None:
print("Drawing circle")
# Structural typing with Protocol
from typing import Protocol
class Drawable(Protocol):
def draw(self) -> None:
...
class Square: # No inheritance needed
def draw(self) -> None:
print("Drawing square")
def render(obj: Drawable) -> None:
obj.draw()
render(Square()) # Type checker approves—Square has draw()
The Protocol approach is more flexible. You can use existing classes that happen to have the right shape without modifying them to inherit from anything.
Basic Protocol Definition and Usage
Defining a Protocol is straightforward. Use typing.Protocol as a base class and declare the methods or attributes you expect:
from typing import Protocol
class Drawable(Protocol):
def draw(self) -> None:
"""Draw the object."""
...
def get_area(self) -> float:
"""Return the area."""
...
# These classes satisfy Drawable without inheritance
class Circle:
def __init__(self, radius: float):
self.radius = radius
def draw(self) -> None:
print(f"Circle with radius {self.radius}")
def get_area(self) -> float:
return 3.14159 * self.radius ** 2
class Rectangle:
def __init__(self, width: float, height: float):
self.width = width
self.height = height
def draw(self) -> None:
print(f"Rectangle {self.width}x{self.height}")
def get_area(self) -> float:
return self.width * self.height
def display_shape(shape: Drawable) -> None:
shape.draw()
print(f"Area: {shape.get_area()}")
# Both work perfectly with type checkers
display_shape(Circle(5))
display_shape(Rectangle(4, 6))
At runtime, Python doesn’t enforce Protocols—it’s purely for static analysis. Type checkers like mypy or pyright verify that objects passed to functions have the required methods with compatible signatures.
Protocol Features and Attributes
Protocols support more than just methods. You can specify properties, class variables, and even create generic Protocols:
from typing import Protocol, TypeVar
class DataSource(Protocol):
name: str # Instance attribute
@property
def connection_string(self) -> str:
"""Connection details."""
...
def fetch(self, query: str) -> list[dict]:
"""Execute query and return results."""
...
# Generic Protocol
T = TypeVar('T')
class Container(Protocol[T]):
def add(self, item: T) -> None:
...
def get(self) -> T:
...
class IntBox:
def __init__(self):
self._value: int = 0
def add(self, item: int) -> None:
self._value = item
def get(self) -> int:
return self._value
def process_container(container: Container[int]) -> int:
container.add(42)
return container.get()
process_container(IntBox()) # Type safe
For runtime checking, use the @runtime_checkable decorator:
from typing import Protocol, runtime_checkable
@runtime_checkable
class Closeable(Protocol):
def close(self) -> None:
...
class File:
def close(self) -> None:
print("Closing file")
class Connection:
def close(self) -> None:
print("Closing connection")
obj = File()
if isinstance(obj, Closeable): # Works at runtime
obj.close()
Note that isinstance() checks are shallow—they only verify method names exist, not signatures. It’s primarily useful for runtime validation, not a replacement for static type checking.
Practical Use Cases
Protocols excel in scenarios where you need flexibility without coupling. Consider a file-like interface that works with actual files, in-memory buffers, or network streams:
from typing import Protocol
class SupportsRead(Protocol):
def read(self, size: int = -1) -> bytes:
...
def seek(self, offset: int) -> int:
...
def process_data(source: SupportsRead) -> dict:
"""Process data from any readable source."""
source.seek(0)
header = source.read(512)
# Process header...
return {"header": header}
# Works with files
with open("data.bin", "rb") as f:
result = process_data(f)
# Works with BytesIO
from io import BytesIO
buffer = BytesIO(b"some data")
result = process_data(buffer)
# Works with custom implementations
class NetworkStream:
def read(self, size: int = -1) -> bytes:
# Fetch from network
return b"network data"
def seek(self, offset: int) -> int:
# Seek in stream
return offset
result = process_data(NetworkStream())
The repository pattern benefits significantly from Protocols:
from typing import Protocol, Optional
class UserRepository(Protocol):
def get_by_id(self, user_id: int) -> Optional[dict]:
...
def save(self, user: dict) -> None:
...
def delete(self, user_id: int) -> bool:
...
class PostgresUserRepository:
def __init__(self, connection_string: str):
self.conn = connection_string
def get_by_id(self, user_id: int) -> Optional[dict]:
# PostgreSQL implementation
return {"id": user_id, "name": "John"}
def save(self, user: dict) -> None:
# Save to PostgreSQL
pass
def delete(self, user_id: int) -> bool:
# Delete from PostgreSQL
return True
class InMemoryUserRepository:
def __init__(self):
self.users: dict[int, dict] = {}
def get_by_id(self, user_id: int) -> Optional[dict]:
return self.users.get(user_id)
def save(self, user: dict) -> None:
self.users[user["id"]] = user
def delete(self, user_id: int) -> bool:
return self.users.pop(user_id, None) is not None
class UserService:
def __init__(self, repository: UserRepository):
self.repository = repository
def activate_user(self, user_id: int) -> None:
user = self.repository.get_by_id(user_id)
if user:
user["active"] = True
self.repository.save(user)
# Swap implementations easily for testing or different environments
service = UserService(InMemoryUserRepository())
service = UserService(PostgresUserRepository("postgresql://..."))
Protocols vs. ABCs: When to Use Each
Both Protocols and Abstract Base Classes define interfaces, but they serve different purposes:
Use Protocols when:
- You want to define interfaces for code you don’t control
- You need maximum flexibility and don’t want to force inheritance
- You’re designing library APIs where users bring their own implementations
- You want to leverage existing classes that already have the right shape
Use ABCs when:
- You want explicit opt-in behavior—classes must consciously implement your interface
- You need to provide shared implementation through mixin methods
- You want runtime registration of implementations
- You’re building a framework where inheritance hierarchies make sense
Here’s a refactoring example:
# ABC approach - requires inheritance
from abc import ABC, abstractmethod
class PaymentProcessorABC(ABC):
@abstractmethod
def process_payment(self, amount: float) -> str:
pass
def log_transaction(self, transaction_id: str) -> None:
print(f"Logged: {transaction_id}") # Shared implementation
# Protocol approach - structural
from typing import Protocol
class PaymentProcessor(Protocol):
def process_payment(self, amount: float) -> str:
...
# With Protocol, existing classes work without modification
class StripeAPI: # From third-party library
def process_payment(self, amount: float) -> str:
return f"stripe_{amount}"
class PayPalClient: # Another third-party library
def process_payment(self, amount: float) -> str:
return f"paypal_{amount}"
def charge_customer(processor: PaymentProcessor, amount: float) -> None:
transaction_id = processor.process_payment(amount)
print(f"Charged: {transaction_id}")
# Both work without inheriting from anything
charge_customer(StripeAPI(), 99.99)
charge_customer(PayPalClient(), 149.99)
Common Pitfalls and Best Practices
Pitfall 1: Incorrect method signatures
Protocols require exact signature matches. Variance matters:
from typing import Protocol
class Processor(Protocol):
def process(self, data: str) -> str:
...
# This won't match - return type is incompatible
class BadProcessor:
def process(self, data: str) -> int: # Wrong return type
return 42
# This won't match either - parameter type is too specific
class AlsoBad:
def process(self, data: int) -> str: # Wrong parameter type
return "done"
Pitfall 2: Mixing runtime and static checking
@runtime_checkable only checks for method existence, not signatures:
from typing import Protocol, runtime_checkable
@runtime_checkable
class Adder(Protocol):
def add(self, x: int, y: int) -> int:
...
class WrongAdder:
def add(self, x: str) -> str: # Wrong signature
return x
obj = WrongAdder()
isinstance(obj, Adder) # True! Only checks method name exists
Best Practice: Keep Protocols focused
Define small, cohesive Protocols rather than large interfaces:
# Good - focused Protocols
class Readable(Protocol):
def read(self) -> bytes:
...
class Writable(Protocol):
def write(self, data: bytes) -> int:
...
class Seekable(Protocol):
def seek(self, offset: int) -> int:
...
# Combine when needed
from typing import Union
ReadWriteFile = Readable & Writable # Intersection type (Python 3.10+)
# Or use Union for alternatives
FileOrBuffer = Union[Readable, Writable]
Best Practice: Document your Protocols
Since Protocols define contracts, clear documentation is essential:
class Serializable(Protocol):
"""Objects that can be serialized to and from dictionaries.
Implementations must provide both methods with compatible signatures.
The to_dict method should produce output that from_dict can consume.
"""
def to_dict(self) -> dict:
"""Convert object to dictionary representation."""
...
@classmethod
def from_dict(cls, data: dict) -> 'Serializable':
"""Create instance from dictionary representation."""
...
Protocols represent Python’s pragmatic approach to type safety—embracing duck typing while providing static guarantees. They let you write flexible, decoupled code that still benefits from type checking. Use them liberally in library interfaces, and your users will thank you for not forcing inheritance hierarchies on their code.