Prototype Pattern in Python: copy and deepcopy
Key Insights
- The Prototype pattern creates objects by cloning existing instances, bypassing expensive initialization. Python’s copy module provides the foundation, but understanding shallow vs. deep copy semantics is critical to avoiding subtle bugs.
- Implement the __copy__ and __deepcopy__ magic methods to control exactly how your objects clone themselves, especially when dealing with resources like database connections or file handles that shouldn’t be duplicated.
- Prototype shines for complex object graphs and configuration templates, but don’t reach for it when simple factory functions or dataclasses with replace() would suffice.
Introduction to the Prototype Pattern
The Prototype pattern is a creational design pattern that sidesteps the traditional instantiation process. Instead of calling a constructor and running through potentially expensive initialization logic, you clone an existing object that’s already configured the way you need it.
This pattern earns its keep in specific scenarios: when object creation is computationally expensive (think parsing large configuration files or establishing network connections), when objects require complex multi-step initialization, or when you need many variations of a base configuration without repeating setup code.
Python makes this pattern particularly accessible through its built-in copy module. But the simplicity is deceptive—misunderstanding the difference between shallow and deep copying leads to bugs that are maddeningly difficult to track down.
Python’s Copy Module Fundamentals
Before diving into the Prototype pattern implementation, you need to understand what Python actually does when you “copy” an object. There are three distinct operations that developers often conflate:
Assignment creates another reference to the same object. No copying occurs.
Shallow copy creates a new object but populates it with references to the original’s nested objects.
Deep copy creates a new object and recursively copies all nested objects.
import copy

class Config:
    def __init__(self, name, settings):
        self.name = name
        self.settings = settings  # mutable dict

original = Config("production", {"debug": False, "timeout": 30})

# Assignment - same object
assigned = original
print(f"Assignment: {id(original) == id(assigned)}")  # True

# Shallow copy - new object, shared internals
shallow = copy.copy(original)
print(f"Shallow copy same object: {id(original) == id(shallow)}")  # False
print(f"Shallow copy same settings dict: {id(original.settings) == id(shallow.settings)}")  # True

# Deep copy - completely independent
deep = copy.deepcopy(original)
print(f"Deep copy same settings dict: {id(original.settings) == id(deep.settings)}")  # False
The id() function reveals object identity in memory. Assignment gives you two names for one object. Shallow copy gives you two objects sharing internal state. Deep copy gives you truly independent objects.
Shallow Copy with copy.copy()
Shallow copying creates a new container object but doesn’t recurse into nested structures. The new object gets references to the same nested objects as the original. This behavior is intentional and often desirable—it’s faster and uses less memory.
The danger emerges when you mutate nested mutable objects:
import copy

class GameCharacter:
    def __init__(self, name, stats, inventory):
        self.name = name
        self.stats = stats          # dict
        self.inventory = inventory  # list

warrior = GameCharacter(
    name="Thorin",
    stats={"strength": 18, "dexterity": 12},
    inventory=["sword", "shield"]
)

# Create a "new" character via shallow copy
warrior_clone = copy.copy(warrior)
warrior_clone.name = "Thorin II"  # Safe - rebinds the attribute on the clone only

# These mutate objects shared by BOTH characters
warrior_clone.inventory.append("potion")
warrior_clone.stats["strength"] = 20

print(f"Original inventory: {warrior.inventory}")  # ['sword', 'shield', 'potion']
print(f"Original strength: {warrior.stats['strength']}")  # 20
The clone’s name attribute is independent because assignment rebinds the attribute on the clone alone; the original keeps pointing at its own string. But inventory.append() and stats["strength"] = 20 mutate the shared objects in place, so both characters now have the potion and the boosted strength.
This isn’t a bug in Python—it’s the defined behavior of shallow copy. Use it when you want shared nested state or when all nested objects are immutable.
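As a small illustration of the "all nested objects are immutable" case, the sketch below uses a hypothetical Palette class (not part of the examples above) whose only nested value is a tuple. The shallow copy shares the tuple, but since a tuple cannot be changed in place, the sharing is harmless.

```python
import copy

class Palette:
    """Hypothetical class whose nested state is entirely immutable."""
    def __init__(self, name, colors):
        self.name = name
        self.colors = colors  # tuple of strings: cannot be mutated in place

base = Palette("base", ("red", "green"))
variant = copy.copy(base)

# The tuple itself is shared between the two objects...
print(variant.colors is base.colors)  # True

# ...but the only way to "change" it is to rebind the attribute,
# which affects the variant alone.
variant.colors = variant.colors + ("blue",)

print(base.colors)     # ('red', 'green')
print(variant.colors)  # ('red', 'green', 'blue')
```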
Deep Copy with copy.deepcopy()
Deep copy recursively clones every nested object, creating a completely independent object graph. Modifications to the copy never affect the original:
import copy

class GameCharacter:
    def __init__(self, name, stats, inventory):
        self.name = name
        self.stats = stats
        self.inventory = inventory

warrior = GameCharacter(
    name="Thorin",
    stats={"strength": 18, "dexterity": 12},
    inventory=["sword", "shield"]
)

# Deep copy creates fully independent clone
warrior_clone = copy.deepcopy(warrior)
warrior_clone.name = "Thorin II"
warrior_clone.inventory.append("potion")
warrior_clone.stats["strength"] = 20

print(f"Original inventory: {warrior.inventory}")  # ['sword', 'shield']
print(f"Clone inventory: {warrior_clone.inventory}")  # ['sword', 'shield', 'potion']
print(f"Original strength: {warrior.stats['strength']}")  # 18
print(f"Clone strength: {warrior_clone.stats['strength']}")  # 20
The tradeoff is performance. Deep copy traverses the entire object graph, allocating memory for every nested structure. For objects with large nested collections or deep hierarchies, this cost adds up. Profile before assuming it’s negligible.
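One rough way to measure that cost is the standard library's timeit module. The payload below is a made-up structure chosen only for illustration; the absolute numbers depend on your machine and data, so no expected timings are shown.

```python
import copy
import timeit

# Hypothetical payload: 1,000 small dicts nested in a list.
payload = {"rows": [{"id": i, "tags": ["a", "b"]} for i in range(1000)]}

# Time 100 shallow copies and 100 deep copies of the same object.
shallow_seconds = timeit.timeit(lambda: copy.copy(payload), number=100)
deep_seconds = timeit.timeit(lambda: copy.deepcopy(payload), number=100)

print(f"copy.copy:     {shallow_seconds:.4f}s for 100 runs")
print(f"copy.deepcopy: {deep_seconds:.4f}s for 100 runs")
```

The shallow copy only duplicates the outer dict, while the deep copy visits every row and every tags list, which is where the gap comes from.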
Implementing the Prototype Pattern
Python’s magic methods __copy__ and __deepcopy__ let you customize cloning behavior. This is where the Prototype pattern becomes explicit rather than incidental:
import copy
from abc import ABC, abstractmethod
from typing import TypeVar

T = TypeVar('T', bound='Prototype')

class Prototype(ABC):
    """Abstract prototype defining the cloning interface."""

    @abstractmethod
    def clone(self: T, deep: bool = True) -> T:
        """Create a copy of this object."""
        pass

class DocumentTemplate(Prototype):
    def __init__(self, title: str, sections: list, metadata: dict):
        self.title = title
        self.sections = sections
        self.metadata = metadata
        self._created_at = None  # Set on first use, not cloning

    def clone(self, deep: bool = True) -> 'DocumentTemplate':
        if deep:
            return copy.deepcopy(self)
        return copy.copy(self)

    def __copy__(self):
        # Shallow copy with explicit attribute handling
        cls = self.__class__
        result = cls.__new__(cls)
        result.__dict__.update(self.__dict__)
        return result

    def __deepcopy__(self, memo: dict):
        # Deep copy with full control
        cls = self.__class__
        result = cls.__new__(cls)
        memo[id(self)] = result  # Handle circular references
        for key, value in self.__dict__.items():
            setattr(result, key, copy.deepcopy(value, memo))
        return result

# Usage
report_template = DocumentTemplate(
    title="Quarterly Report",
    sections=["Executive Summary", "Financials", "Outlook"],
    metadata={"author": "template", "version": 1}
)

q1_report = report_template.clone()
q1_report.title = "Q1 2024 Report"
q1_report.metadata["author"] = "Jane Smith"

print(f"Template title: {report_template.title}")  # Quarterly Report
print(f"Q1 title: {q1_report.title}")  # Q1 2024 Report
The memo dictionary in __deepcopy__ is crucial—it tracks already-copied objects to handle circular references. Always register self in the memo before copying attributes.
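To see why the memo matters, consider a structure that refers back to itself. The sketch below uses a hypothetical Node class with a parent back-reference and relies on the default copy.deepcopy machinery, which maintains the memo on its own; a custom __deepcopy__ has to register itself in the memo to get the same result.

```python
import copy

class Node:
    """Hypothetical tree node with a back-reference to its parent."""
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

root = Node("root")
leaf = root.add(Node("leaf"))    # root -> leaf -> root forms a cycle

root_copy = copy.deepcopy(root)  # terminates because the memo tracks visited objects
leaf_copy = root_copy.children[0]

print(leaf_copy.parent is root_copy)  # True: the cycle points at the copy
print(leaf_copy.parent is root)       # False: nothing refers back to the original
```

Without the memo, copying root would copy leaf, which would copy root again, and so on without end.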
Handling Edge Cases
Real-world objects often contain attributes that shouldn’t be copied: file handles, database connections, thread locks, or cached computed values. Custom __deepcopy__ implementations handle these gracefully:
import copy
import sqlite3
from typing import Optional

class DataProcessor:
    def __init__(self, config: dict, db_path: str):
        self.config = config
        self.db_path = db_path
        self._connection: Optional[sqlite3.Connection] = None
        self._cache: dict = {}

    @property
    def connection(self) -> sqlite3.Connection:
        # Open the connection lazily, on first access
        if self._connection is None:
            self._connection = sqlite3.connect(self.db_path)
        return self._connection

    def __deepcopy__(self, memo: dict):
        cls = self.__class__
        result = cls.__new__(cls)
        memo[id(self)] = result
        for key, value in self.__dict__.items():
            if key == '_connection':
                # Don't copy the connection - clone gets its own
                setattr(result, key, None)
            elif key == '_cache':
                # Start with empty cache
                setattr(result, key, {})
            else:
                # Deep copy everything else
                setattr(result, key, copy.deepcopy(value, memo))
        return result

    def __del__(self):
        # getattr guards against instances that were never fully initialized
        connection = getattr(self, '_connection', None)
        if connection is not None:
            connection.close()

# Usage
processor = DataProcessor(
    config={"batch_size": 100, "retry_count": 3},
    db_path=":memory:"
)

# Clone gets independent config but fresh connection/cache
cloned = copy.deepcopy(processor)
cloned.config["batch_size"] = 200

print(f"Original batch size: {processor.config['batch_size']}")  # 100
print(f"Clone batch size: {cloned.config['batch_size']}")  # 200
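The clone inherits the configuration but gets a fresh database connection (lazily initialized) and an empty cache, so no two objects ever close or mutate the same connection. Note that only deep copies get this treatment here: copy.copy() still falls back to the default shallow behavior and would share the connection and cache unless you also define __copy__.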
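The same approach applies to thread locks, which the default deep copy cannot handle at all. The following is a minimal sketch with a hypothetical Counter class: the custom __deepcopy__ copies the plain state and gives the clone a brand-new lock.

```python
import copy
import threading

class Counter:
    """Hypothetical class guarding its state with a lock."""
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self.value += 1

    def __deepcopy__(self, memo):
        cls = self.__class__
        result = cls.__new__(cls)
        memo[id(self)] = result
        result.value = copy.deepcopy(self.value, memo)
        result._lock = threading.Lock()  # fresh lock; never copy the original
        return result

# A bare lock cannot be deep-copied by the default machinery.
try:
    copy.deepcopy(threading.Lock())
except TypeError as exc:
    print(f"Default deepcopy failed: {exc}")

counter = Counter()
counter.increment()
clone = copy.deepcopy(counter)
clone.increment()

print(counter.value)                 # 1
print(clone.value)                   # 2
print(clone._lock is counter._lock)  # False
```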
Practical Use Cases and Alternatives
The Prototype pattern fits naturally into several scenarios:
Configuration templates: Create a base configuration object, then clone and customize for different environments or tenants.
Undo functionality: Before each mutation, clone the current state. Undo by restoring the previous clone.
Object pooling: Pre-create and clone configured objects rather than reinitializing from scratch.
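The undo case can be sketched in a few lines. The Editor class below is hypothetical and keeps things deliberately simple: it deep-copies its state before every mutation and restores the most recent snapshot on undo.

```python
import copy

class Editor:
    """Hypothetical editor that snapshots its state before each change."""
    def __init__(self):
        self.lines = []
        self._history = []

    def _snapshot(self):
        # Clone only the state, not the history itself.
        self._history.append(copy.deepcopy(self.lines))

    def add_line(self, text):
        self._snapshot()
        self.lines.append(text)

    def undo(self):
        # Restore the previous snapshot, if there is one.
        if self._history:
            self.lines = self._history.pop()

editor = Editor()
editor.add_line("first")
editor.add_line("second")
editor.undo()
print(editor.lines)  # ['first']
```

For a flat list of strings a shallow copy would be enough; deepcopy is used here on the assumption that real state usually contains nested mutable objects.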
A prototype registry centralizes template management:
import copy
from typing import Dict, TypeVar, Generic

T = TypeVar('T')

class PrototypeRegistry(Generic[T]):
    def __init__(self):
        self._prototypes: Dict[str, T] = {}

    def register(self, name: str, prototype: T) -> None:
        self._prototypes[name] = prototype

    def unregister(self, name: str) -> None:
        del self._prototypes[name]

    def clone(self, name: str, deep: bool = True) -> T:
        prototype = self._prototypes.get(name)
        if prototype is None:
            raise KeyError(f"No prototype registered with name: {name}")
        if deep:
            return copy.deepcopy(prototype)
        return copy.copy(prototype)

# Usage (reuses DocumentTemplate and report_template from the earlier example)
registry: PrototypeRegistry[DocumentTemplate] = PrototypeRegistry()
registry.register("report", report_template)
registry.register("memo", DocumentTemplate("Memo", ["Body"], {}))

new_report = registry.clone("report")
new_memo = registry.clone("memo")
When to avoid Prototype: If your objects are simple and cheap to construct, the pattern adds unnecessary complexity. Python’s dataclasses.replace() and the attrs library’s evolve() both build a new instance with selected fields changed, and they often provide cleaner solutions for simple cases. Factory functions work better when you need to vary construction logic rather than clone existing state.
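For comparison, here is what the dataclass route looks like. ServerConfig and its fields are hypothetical names used only for illustration. Keep in mind that replace() passes the existing field values to the constructor as they are, so any mutable field would be shared between the two instances, much like a shallow copy.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ServerConfig:
    """Hypothetical configuration object used only for illustration."""
    host: str
    port: int = 8080
    tags: tuple = ()

base = ServerConfig(host="localhost")

# Build a variant by overriding selected fields; base is left untouched.
staging = replace(base, host="staging.example.com", port=9090)

print(base)     # ServerConfig(host='localhost', port=8080, tags=())
print(staging)  # ServerConfig(host='staging.example.com', port=9090, tags=())
```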
The Prototype pattern is a tool, not a mandate. Use it when cloning genuinely simplifies your code and improves performance. Reach for simpler alternatives when they suffice.