Composite Pattern in Python: File System Example

Key Insights

  • The Composite pattern lets you treat individual objects and groups of objects uniformly through a shared interface, eliminating scattered type-checking logic
  • File systems are the canonical example because directories contain both files and other directories, yet both need common operations like calculating size or searching
  • The pattern trades some type safety for flexibility: add() is exposed on every component, so runtime checks or careful design are needed to prevent invalid operations such as adding a child to a file

Introduction to the Composite Pattern

The Composite pattern is a structural design pattern that lets you compose objects into tree structures and then work with those structures as if they were individual objects. The core insight is simple: when you have a hierarchy where containers hold both primitive elements and other containers, give them all the same interface.

This pattern shows up constantly in software. GUI frameworks use it for nested widgets. Document processors use it for paragraphs containing text and images. Organization charts use it for departments containing employees and sub-departments. But the clearest example—and the one we’ll build today—is a file system.

A directory can contain files. It can also contain other directories. Yet when you ask “what’s the total size of this folder?” you don’t want to care about that distinction. You want to call get_size() and get an answer.

The Problem: Managing Hierarchical Data

Let’s start with the naive approach. You have files and directories, and you need to calculate sizes, search for items, and display the structure. Here’s what happens without the Composite pattern:

class File:
    def __init__(self, name: str, size: int):
        self.name = name
        self.size = size

class Directory:
    def __init__(self, name: str):
        self.name = name
        self.contents: list = []
    
    def add(self, item):
        self.contents.append(item)

def calculate_size(item) -> int:
    if isinstance(item, File):
        return item.size
    elif isinstance(item, Directory):
        total = 0
        for child in item.contents:
            if isinstance(child, File):
                total += child.size
            elif isinstance(child, Directory):
                total += calculate_size(child)
        return total
    else:
        raise TypeError(f"Unknown type: {type(item)}")

def search(item, query: str) -> list:
    results = []
    if isinstance(item, File):
        if query in item.name:
            results.append(item)
    elif isinstance(item, Directory):
        if query in item.name:
            results.append(item)
        for child in item.contents:
            results.extend(search(child, query))
    return results

This works, but it’s ugly. Every operation requires type checking. The logic for traversing the hierarchy is duplicated. Adding a new type (say, a symlink) means updating every function. And the operations live outside the classes, violating basic object-oriented principles.

The Composite pattern eliminates this mess by pushing the behavior into the objects themselves behind a common interface.

Core Components of the Pattern

The Composite pattern has three participants:

  1. Component: The abstract base class or interface that declares operations common to both simple and complex objects
  2. Leaf: A primitive object with no children (files in our case)
  3. Composite: A container that holds children and implements operations by delegating to those children (directories)

Here’s our abstract base:

from abc import ABC, abstractmethod
from typing import Iterator, Optional

class FileSystemComponent(ABC):
    def __init__(self, name: str):
        self.name = name
        self.parent: Optional['FileSystemComponent'] = None
    
    @abstractmethod
    def get_size(self) -> int:
        """Return size in bytes."""
        pass
    
    @abstractmethod
    def display(self, indent: int = 0) -> str:
        """Return a string representation with proper indentation."""
        pass
    
    @abstractmethod
    def search(self, query: str) -> Iterator['FileSystemComponent']:
        """Yield all components matching the query."""
        pass
    
    def get_path(self) -> str:
        """Return the full path to this component."""
        if self.parent is None:
            return self.name
        return f"{self.parent.get_path()}/{self.name}"
    
    # These methods only make sense for composites, but we define them here
    # with default implementations that raise errors for leaves
    def add(self, component: 'FileSystemComponent') -> None:
        raise NotImplementedError("Cannot add to a leaf component")
    
    def remove(self, component: 'FileSystemComponent') -> None:
        raise NotImplementedError("Cannot remove from a leaf component")

Notice the add() and remove() methods on the base class. This is a design decision with trade-offs. By putting them here with default implementations that raise errors, we maintain a uniform interface. The alternative—putting them only on Directory—is more type-safe but forces clients to know which type they’re dealing with.
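For comparison, here is a minimal sketch of the more type-safe alternative, with add() declared only on the composite. The names SafeComponent, SafeFile, SafeDirectory, and add_if_possible are hypothetical, chosen so they don't clash with the article's classes, and only get_size() is kept to stay short.

```python
from abc import ABC, abstractmethod


class SafeComponent(ABC):
    """Hypothetical base class declaring only the operations every node supports."""

    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def get_size(self) -> int:
        """Return size in bytes."""


class SafeFile(SafeComponent):
    def __init__(self, name: str, size: int):
        super().__init__(name)
        self._size = size

    def get_size(self) -> int:
        return self._size


class SafeDirectory(SafeComponent):
    def __init__(self, name: str):
        super().__init__(name)
        self._children: list[SafeComponent] = []

    # add() exists only here, so a static type checker can reject SafeFile.add(...)
    def add(self, component: SafeComponent) -> None:
        self._children.append(component)

    def get_size(self) -> int:
        return sum(child.get_size() for child in self._children)


def add_if_possible(target: SafeComponent, item: SafeComponent) -> bool:
    """The cost of this design: clients must narrow the type before mutating."""
    if isinstance(target, SafeDirectory):
        target.add(item)
        return True
    return False


docs = SafeDirectory("docs")
readme = SafeFile("README.md", 4096)
print(add_if_possible(docs, readme))   # True
print(add_if_possible(readme, docs))   # False
print(hasattr(readme, "add"))          # False
```

Reading operations such as get_size() stay uniform in this variant. Only code that modifies the tree has to know it is holding a directory.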

Implementing the File System

Now let’s build the concrete classes. The File class is straightforward—it’s a leaf with no children:

class File(FileSystemComponent):
    def __init__(self, name: str, size: int):
        super().__init__(name)
        self._size = size
    
    def get_size(self) -> int:
        return self._size
    
    def display(self, indent: int = 0) -> str:
        return " " * indent + f"📄 {self.name} ({self._size} bytes)"
    
    def search(self, query: str) -> Iterator[FileSystemComponent]:
        if query.lower() in self.name.lower():
            yield self

The Directory class is where the pattern shines. It implements operations by delegating to its children:

class Directory(FileSystemComponent):
    def __init__(self, name: str):
        super().__init__(name)
        self._children: list[FileSystemComponent] = []
    
    def get_size(self) -> int:
        # Recursive: sum of all children's sizes
        return sum(child.get_size() for child in self._children)
    
    def display(self, indent: int = 0) -> str:
        lines = [" " * indent + f"📁 {self.name}/"]
        for child in self._children:
            lines.append(child.display(indent + 2))
        return "\n".join(lines)
    
    def search(self, query: str) -> Iterator[FileSystemComponent]:
        # Check self first
        if query.lower() in self.name.lower():
            yield self
        # Then delegate to children
        for child in self._children:
            yield from child.search(query)
    
    def add(self, component: FileSystemComponent) -> None:
        self._children.append(component)
        component.parent = self
    
    def remove(self, component: FileSystemComponent) -> None:
        self._children.remove(component)
        component.parent = None
    
    def __iter__(self) -> Iterator[FileSystemComponent]:
        return iter(self._children)
    
    def __len__(self) -> int:
        return len(self._children)

The key insight is in get_size(). A directory’s size is the sum of its children’s sizes. Each child might be a file (returning its own size) or another directory (recursively computing its subtree’s size). The caller doesn’t care. They just call get_size().
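One caveat: the recursion only terminates if the structure really is a tree, and the add() method shown earlier trusts its caller on that point. The sketch below shows one possible way to harden it, assuming you want to reject cycles and keep each node under a single parent. Node and Folder are hypothetical stand-ins reduced to the parent/children bookkeeping, not the article's classes.

```python
from typing import Optional


class Node:
    """Hypothetical stand-in for FileSystemComponent, reduced to name and parent."""

    def __init__(self, name: str):
        self.name = name
        self.parent: Optional["Folder"] = None


class Folder(Node):
    """Hypothetical stand-in for Directory with a defensive add()."""

    def __init__(self, name: str):
        super().__init__(name)
        self._children: list[Node] = []

    def add(self, component: Node) -> None:
        # Walk up from self: meeting `component` means the add would create a cycle.
        ancestor: Optional[Node] = self
        while ancestor is not None:
            if ancestor is component:
                raise ValueError(f"Cannot add {component.name!r} inside itself")
            ancestor = ancestor.parent
        # Detach from the previous parent so the node lives in exactly one place.
        if component.parent is not None:
            component.parent.remove(component)
        self._children.append(component)
        component.parent = self

    def remove(self, component: Node) -> None:
        self._children.remove(component)
        component.parent = None

    def __len__(self) -> int:
        return len(self._children)


root = Folder("project")
src = Folder("src")
root.add(src)

try:
    src.add(root)  # root is an ancestor of src
except ValueError as exc:
    print(exc)  # Cannot add 'project' inside itself

archive = Folder("archive")
archive.add(src)  # moves src out of root
print(len(root), len(archive))  # 0 1
```

Without the cycle check, a directory that ends up inside its own subtree would send get_size() and get_path() into unbounded recursion.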

Putting It Together: Usage Examples

Let’s build a realistic directory structure and see the pattern in action:

def create_sample_filesystem() -> Directory:
    # Create root
    root = Directory("project")
    
    # Source directory
    src = Directory("src")
    src.add(File("main.py", 2048))
    src.add(File("utils.py", 1024))
    
    # Nested module
    models = Directory("models")
    models.add(File("user.py", 512))
    models.add(File("product.py", 768))
    models.add(File("__init__.py", 64))
    src.add(models)
    
    # Tests directory
    tests = Directory("tests")
    tests.add(File("test_main.py", 1536))
    tests.add(File("test_utils.py", 1024))
    
    # Assemble the root: subdirectories first, then top-level files
    root.add(src)
    root.add(tests)
    root.add(File("README.md", 4096))
    root.add(File("pyproject.toml", 512))
    
    return root

# Build the filesystem
fs = create_sample_filesystem()

# Display the entire tree
print(fs.display())
# Output:
# 📁 project/
#   📁 src/
#     📄 main.py (2048 bytes)
#     📄 utils.py (1024 bytes)
#     📁 models/
#       📄 user.py (512 bytes)
#       📄 product.py (768 bytes)
#       📄 __init__.py (64 bytes)
#   📁 tests/
#     📄 test_main.py (1536 bytes)
#     📄 test_utils.py (1024 bytes)
#   📄 README.md (4096 bytes)
#   📄 pyproject.toml (512 bytes)

# Calculate total size - works on any component
print(f"Total project size: {fs.get_size()} bytes")  # 11584 bytes

# Search works uniformly across the tree
print("\nItems matching 'test':")
for item in fs.search("test"):
    print(f"  {item.get_path()}")
# Output:
#   project/tests
#   project/tests/test_main.py
#   project/tests/test_utils.py

# Get size of just the src directory
src_dir = next(child for child in fs if child.name == "src")
print(f"\nSource code size: {src_dir.get_size()} bytes")  # 4416 bytes

The beauty is that get_size(), display(), and search() work identically whether you call them on a single file, a directory, or the entire filesystem. No type checking. No special cases. Just polymorphism doing its job.

Trade-offs and When to Use

The Composite pattern isn’t free. Here are the trade-offs:

Benefits:

  • Uniform treatment of individual objects and compositions
  • Easy to add new component types without changing existing code
  • Client code stays simple—no type checking or special cases
  • Recursive operations become trivial to implement
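To make the second benefit concrete, here is a sketch that adds a symlink as a new leaf type. The base class is a trimmed-down copy of the article's (two operations, no emoji), and Symlink is hypothetical. Reporting a size of 0 for a link is an assumption made here to avoid counting the target twice; a real file system may report something else.

```python
from abc import ABC, abstractmethod


class FileSystemComponent(ABC):
    """Trimmed-down version of the article's base class."""

    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def get_size(self) -> int: ...

    @abstractmethod
    def display(self, indent: int = 0) -> str: ...


class File(FileSystemComponent):
    def __init__(self, name: str, size: int):
        super().__init__(name)
        self._size = size

    def get_size(self) -> int:
        return self._size

    def display(self, indent: int = 0) -> str:
        return " " * indent + f"{self.name} ({self._size} bytes)"


class Directory(FileSystemComponent):
    def __init__(self, name: str):
        super().__init__(name)
        self._children: list[FileSystemComponent] = []

    def add(self, component: FileSystemComponent) -> None:
        self._children.append(component)

    def get_size(self) -> int:
        return sum(child.get_size() for child in self._children)

    def display(self, indent: int = 0) -> str:
        lines = [" " * indent + f"{self.name}/"]
        lines.extend(child.display(indent + 2) for child in self._children)
        return "\n".join(lines)


class Symlink(FileSystemComponent):
    """Hypothetical new leaf type; File and Directory are not modified."""

    def __init__(self, name: str, target: FileSystemComponent):
        super().__init__(name)
        self.target = target

    def get_size(self) -> int:
        # Assumption: a link contributes nothing, so its target is not counted twice.
        return 0

    def display(self, indent: int = 0) -> str:
        return " " * indent + f"{self.name} -> {self.target.name}"


root = Directory("project")
readme = File("README.md", 4096)
root.add(readme)
root.add(Symlink("README", readme))

print(root.get_size())  # 4096
print(root.display())
# project/
#   README.md (4096 bytes)
#   README -> README.md
```

Compare this with the isinstance()-based functions from the naive version, each of which would need a new branch for the symlink.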

Drawbacks:

  • Can make designs overly general when you don’t need that flexibility
  • Type safety suffers: because add() is declared on the base class, a static type checker won't flag an attempt to add a file to another file, and the mistake only surfaces at runtime
  • The shared interface might include methods that don’t make sense for all types (like add() on a file)
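The last two drawbacks can be seen in a few lines. The sketch below keeps only the child-management part of the base class and adds an is_composite() query. That helper is hypothetical, not part of the article's interface; it is one possible way for callers to ask before mutating instead of catching the exception.

```python
class FileSystemComponent:
    """Trimmed-down base class: only child management is shown."""

    def __init__(self, name: str):
        self.name = name

    def is_composite(self) -> bool:
        # Hypothetical helper: leaves answer False by default.
        return False

    def add(self, component: "FileSystemComponent") -> None:
        raise NotImplementedError("Cannot add to a leaf component")


class File(FileSystemComponent):
    pass


class Directory(FileSystemComponent):
    def __init__(self, name: str):
        super().__init__(name)
        self._children: list[FileSystemComponent] = []

    def is_composite(self) -> bool:
        return True

    def add(self, component: FileSystemComponent) -> None:
        self._children.append(component)


readme = File("README.md")
notes = File("notes.txt")

# The call is valid against the base-class signature and fails only when executed.
try:
    readme.add(notes)
except NotImplementedError as exc:
    print(f"Runtime error: {exc}")  # Runtime error: Cannot add to a leaf component

# Asking first avoids the exception without resorting to isinstance().
for target in (Directory("docs"), readme):
    if target.is_composite():
        target.add(notes)
        print(f"added to {target.name}")  # printed once: added to docs
```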

When to use it:

  • You have a tree-like hierarchical structure
  • You want clients to treat leaves and composites uniformly
  • You expect to traverse the structure and apply operations recursively

Alternatives:

  • If you need to add operations without modifying classes, combine Composite with the Visitor pattern
  • If the hierarchy is shallow or operations differ significantly between types, simple inheritance might suffice
  • For immutable structures, consider a functional approach with pattern matching

Conclusion

The Composite pattern solves a specific problem elegantly: treating individual objects and groups of objects uniformly. The file system example demonstrates this perfectly—calculating size, searching, and displaying work identically whether you’re dealing with a single file or a deeply nested directory tree.

The pattern’s power comes from pushing behavior into the objects themselves and letting polymorphism handle the recursion. Your client code calls get_size() and doesn’t care whether it’s asking a file or a directory containing thousands of files.

Beyond file systems, consider the Composite pattern for:

  • UI component trees (panels containing buttons, labels, and other panels)
  • Organization hierarchies (departments containing employees and sub-departments)
  • Menu systems (menus containing items and sub-menus)
  • Document structures (sections containing paragraphs, images, and sub-sections)
  • Mathematical expressions (operations containing operands and sub-expressions)
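To show how directly the pattern carries over, here is a minimal sketch of the last item, with numbers as leaves and operations as composites. All class names are hypothetical and only two operations are included.

```python
from abc import ABC, abstractmethod


class Expression(ABC):
    """Component: everything in the tree can be evaluated."""

    @abstractmethod
    def evaluate(self) -> float: ...


class Number(Expression):
    """Leaf: a plain value with no children."""

    def __init__(self, value: float):
        self.value = value

    def evaluate(self) -> float:
        return self.value


class Add(Expression):
    """Composite: sums the results of its operands."""

    def __init__(self, *operands: Expression):
        self.operands = operands

    def evaluate(self) -> float:
        return sum(operand.evaluate() for operand in self.operands)


class Multiply(Expression):
    """Composite: multiplies the results of its operands."""

    def __init__(self, *operands: Expression):
        self.operands = operands

    def evaluate(self) -> float:
        result = 1.0
        for operand in self.operands:
            result *= operand.evaluate()
        return result


# 2 + (3 * 4)
expr = Add(Number(2), Multiply(Number(3), Number(4)))
print(expr.evaluate())  # 14.0
```

Here evaluate() plays the role that get_size() plays in the file system: the caller never needs to know whether it holds a number or a nested expression.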

The next time you find yourself writing isinstance() checks to handle containers differently from their contents, reach for the Composite pattern. Your future self will thank you.
