Python Copy: Shallow vs Deep Copy Explained

Key Insights

  • Assignment in Python creates references to the same object, not copies—modifying one variable affects all references to that object
  • Shallow copies duplicate the outer container but share references to nested objects, making them fast but potentially dangerous with mutable nested data
  • Deep copies recursively duplicate all nested objects for complete independence, at the cost of performance and memory overhead

The Problem with Assignment

Python’s assignment operator doesn’t copy objects—it creates new references to existing objects. This behavior catches many developers off guard, especially when working with mutable data structures like lists and dictionaries.

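original = [1, 2, 3]
alias = original  # no copy is made; both names refer to one list
alias.append(4)

print(original)  # [1, 2, 3, 4]
print(alias)     # [1, 2, 3, 4]
print(id(original) == id(alias))  # True

Both variables point to the same object in memory. When you modify alias, you’re modifying original because they’re the same object. The id() function confirms they share one identity (in CPython, the object’s memory address). The variable is named alias rather than copy so that it does not shadow the copy module used in later examples.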

This behavior is intentional and efficient—Python avoids unnecessary duplication. But when you actually need independent copies, you need to be explicit about it.
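The same aliasing applies when a list is passed to a function. The sketch below is a minimal illustration; the function names `add_item_in_place` and `add_item_copy` are hypothetical and chosen only for this example.

```python
def add_item_in_place(items, item):
    # `items` is another name for the caller's list, so this mutates it.
    items.append(item)
    return items


def add_item_copy(items, item):
    # Concatenation builds a new list and leaves the caller's list untouched.
    return items + [item]


cart = ["apple"]
same = add_item_in_place(cart, "pear")
fresh = add_item_copy(cart, "plum")

print(cart)           # ['apple', 'pear']
print(same is cart)   # True
print(fresh)          # ['apple', 'pear', 'plum']
print(fresh is cart)  # False
```

Whether a function should mutate its argument or return a new object is a design choice; the point here is only that assignment and argument passing never copy on their own.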

Shallow Copy: Copying the Surface

A shallow copy creates a new object but doesn’t recursively copy nested objects. Instead, it copies references to those nested objects. Think of it as copying the container but sharing the contents.

Python provides several ways to create shallow copies:

import copy

# Method 1: List's copy() method
original = [[1, 2], [3, 4]]
shallow1 = original.copy()

# Method 2: Constructor
shallow2 = list(original)

# Method 3: Slicing
shallow3 = original[:]

# Method 4: copy.copy()
shallow4 = copy.copy(original)

All four methods produce the same result. Here’s where shallow copying gets interesting:

original = [[1, 2], [3, 4]]
shallow = original.copy()

# Modifying the outer list
shallow.append([5, 6])
print(original)  # [[1, 2], [3, 4]]
print(shallow)   # [[1, 2], [3, 4], [5, 6]]

# Modifying a nested list
shallow[0].append(99)
print(original)  # [[1, 2, 99], [3, 4]]
print(shallow)   # [[1, 2, 99], [3, 4], [5, 6]]

# Memory addresses prove it
print(id(original[0]) == id(shallow[0]))  # True

The outer list is copied, but the nested lists are shared. Appending to the outer structure works independently, but modifying nested objects affects both copies.
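One way to see exactly which elements two containers share is to compare them slot by slot with `is`. The helper `shared_indices` below is a hypothetical name for this sketch, not part of the standard library.

```python
def shared_indices(a, b):
    """Return the positions where both sequences hold the very same object."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x is y]


original = [[1, 2], [3, 4]]
shallow = original.copy()

print(shallow is original)                # False
print(shared_indices(original, shallow))  # [0, 1]

# Rebinding a slot replaces the reference in one list only.
shallow[0] = [1, 2]
print(shared_indices(original, shallow))  # [1]
```

Rebinding `shallow[0]` swaps in a new inner list, so that slot is no longer shared even though the two inner lists compare equal.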

Deep Copy: Copying All the Way Down

Deep copy recursively duplicates every object in the structure, creating complete independence. Use copy.deepcopy() from the copy module:

import copy

original = [[1, 2], [3, 4]]
deep = copy.deepcopy(original)

# Modify nested list
deep[0].append(99)
print(original)  # [[1, 2], [3, 4]]
print(deep)      # [[1, 2, 99], [3, 4]]

# Memory addresses are different
print(id(original[0]) == id(deep[0]))  # False

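Now the nested lists are truly independent. Deep copy traverses the entire object graph and creates new copies of every mutable object it finds. Immutable atomic values such as numbers and strings are reused as-is, because sharing them is safe.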
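Independence from the original does not mean every slot in the copy gets its own object. As a small sketch, if the same inner list appears twice in the original, deepcopy() copies it once and keeps the two slots pointing at that single copy:

```python
import copy

row = [0, 0]
grid = [row, row]  # both slots reference the same inner list

deep = copy.deepcopy(grid)

print(deep[0] is row)      # False
print(deep[0] is deep[1])  # True

deep[0][0] = 7
print(deep)  # [[7, 0], [7, 0]]
print(grid)  # [[0, 0], [0, 0]]
```

The copy mirrors the shape of the original object graph, including its internal sharing, while staying detached from the original objects.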

Practical Comparison with Common Data Structures

Let’s compare all three approaches with a complex nested dictionary:

import copy

original = {
    'name': 'John',
    'scores': [85, 90, 78],
    'metadata': {
        'attempts': 3,
        'tags': ['python', 'advanced']
    }
}

# Assignment (reference)
assigned = original

# Shallow copy
shallow = original.copy()

# Deep copy
deep = copy.deepcopy(original)

# Modify nested list
original['scores'].append(95)

# Modify nested dictionary
original['metadata']['attempts'] = 5

print("Assigned:", assigned['scores'])  # [85, 90, 78, 95]
print("Shallow:", shallow['scores'])    # [85, 90, 78, 95]
print("Deep:", deep['scores'])          # [85, 90, 78]

print("Assigned attempts:", assigned['metadata']['attempts'])  # 5
print("Shallow attempts:", shallow['metadata']['attempts'])    # 5
print("Deep attempts:", deep['metadata']['attempts'])          # 3

The assignment shares everything. The shallow copy creates a new outer dictionary but shares nested objects. Only the deep copy is truly independent.

When to Use Each Approach

Use assignment when you want multiple variables to reference the same object. This is common when passing objects to functions or creating aliases:

def process_data(data):
    # data is just a reference, no copying
    return [x * 2 for x in data]

Use shallow copy for flat data structures or when you intentionally want to share nested objects. It’s fast and memory-efficient:

# Configuration with shared references
default_config = {
    'timeout': 30,
    'retry': 3,
    'endpoints': ['api.example.com']  # Shared list
}

# Multiple services share endpoint list
service1_config = default_config.copy()
service2_config = default_config.copy()

# Adding endpoint affects all services (intentional)
default_config['endpoints'].append('backup.example.com')
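With shallow copies, the difference between rebinding a key and mutating a shared value matters. The following sketch reuses the configuration shape from above; the variable names are the same, and the specific values are illustrative.

```python
default_config = {
    'timeout': 30,
    'retry': 3,
    'endpoints': ['api.example.com'],
}

service1_config = default_config.copy()
service2_config = default_config.copy()

# Rebinding a key changes only this dictionary.
service1_config['timeout'] = 60

# Giving service2 its own list opts it out of the shared endpoints.
service2_config['endpoints'] = list(default_config['endpoints'])

# Mutating the shared list is visible to every dict that still references it.
default_config['endpoints'].append('backup.example.com')

print(default_config['timeout'])     # 30
print(service1_config['endpoints'])  # ['api.example.com', 'backup.example.com']
print(service2_config['endpoints'])  # ['api.example.com']
```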

Use deep copy when you need complete independence, especially for complex state management:

import copy

class GameState:
    def __init__(self, player_positions, inventory):
        self.player_positions = player_positions
        self.inventory = inventory

# Save game state for undo functionality
current_state = GameState(
    player_positions={'player1': [10, 20], 'player2': [15, 25]},
    inventory={'player1': ['sword', 'shield']}
)

saved_state = copy.deepcopy(current_state)

# Modify current state
current_state.player_positions['player1'][0] = 50
current_state.inventory['player1'].append('potion')

# Saved state remains unchanged
print(saved_state.player_positions['player1'])  # [10, 20]
print(saved_state.inventory['player1'])  # ['sword', 'shield']
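The undo idea can be generalized into a small snapshot stack. This is a minimal sketch, assuming state is held in plain dictionaries; the `History` class and its `save`/`undo` methods are hypothetical names for illustration.

```python
import copy


class History:
    """Hypothetical undo helper that stores deep-copied snapshots."""

    def __init__(self):
        self._snapshots = []

    def save(self, state):
        # Deep copy so later edits to `state` cannot alter the snapshot.
        self._snapshots.append(copy.deepcopy(state))

    def undo(self):
        # Return the most recent snapshot and drop it from the stack.
        return self._snapshots.pop()


state = {'player1': {'position': [10, 20], 'inventory': ['sword', 'shield']}}
history = History()

history.save(state)
state['player1']['position'][0] = 50
state['player1']['inventory'].append('potion')

state = history.undo()
print(state['player1']['position'])   # [10, 20]
print(state['player1']['inventory'])  # ['sword', 'shield']
```

A shallow copy in `save` would not work here, because the nested lists would still be shared with the live state.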

Common Pitfalls and Edge Cases

Circular references are handled automatically by deepcopy():

import copy

original = {'name': 'Node'}
original['self'] = original  # Circular reference

deep = copy.deepcopy(original)
print(deep['self'] is deep)  # True - maintains structure
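The same holds for cycles that span several objects. In the sketch below, `Node` is a hypothetical class whose instances point at each other; deepcopy() tracks already-copied objects in its memo dictionary, so the cycle is reproduced among the copies instead of recursing forever.

```python
import copy


class Node:
    def __init__(self, name):
        self.name = name
        self.partner = None


a = Node('a')
b = Node('b')
a.partner = b
b.partner = a  # a -> b -> a

a2 = copy.deepcopy(a)

print(a2 is a)                   # False
print(a2.partner is b)           # False
print(a2.partner.partner is a2)  # True
print(a2.partner.name)           # b
```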

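Immutable objects like strings, numbers, and tuples are not duplicated by copy.copy(); it returns the original object, because sharing an immutable value is safe. A tuple is only immutable at its own level, though, so any mutable object inside it is still shared: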

import copy

original = (1, 2, [3, 4])
shallow = copy.copy(original)

print(id(original) == id(shallow))  # True - tuples are immutable
print(id(original[2]) == id(shallow[2]))  # True - nested list is shared
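Deep copy treats tuples differently depending on their contents. The sketch below shows the behavior of the standard library's copy module in CPython: a tuple holding only immutable values comes back as the same object, while a tuple containing a list is rebuilt around a fresh copy of that list.

```python
import copy

all_immutable = (1, 2, 'three')
with_list = (1, 2, [3, 4])

# Nothing inside can change, so the same tuple object is returned.
print(copy.deepcopy(all_immutable) is all_immutable)  # True

# The nested list must be copied, which forces a new tuple as well.
deep = copy.deepcopy(with_list)
print(deep is with_list)        # False
print(deep[2] is with_list[2])  # False

deep[2].append(5)
print(with_list)  # (1, 2, [3, 4])
print(deep)       # (1, 2, [3, 4, 5])
```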

Custom objects need special handling if they have non-standard copying requirements:

import copy

class Connection:
    def __init__(self, host):
        self.host = host
        self.socket = self._connect()
    
    def _connect(self):
        return f"socket_to_{self.host}"
    
    def __copy__(self):
        # Shallow copy: new object, same socket
        new_obj = Connection.__new__(Connection)
        new_obj.host = self.host
        new_obj.socket = self.socket
        return new_obj
    
    def __deepcopy__(self, memo):
        # Deep copy: new object, new connection
        new_obj = Connection.__new__(Connection)
        new_obj.host = copy.deepcopy(self.host, memo)
        new_obj.socket = new_obj._connect()
        return new_obj

conn = Connection('localhost')
shallow = copy.copy(conn)
deep = copy.deepcopy(conn)

print(id(conn.socket) == id(shallow.socket))  # True
print(id(conn.socket) == id(deep.socket))     # False
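Another common reason to customize copying is to skip derived data such as a cache. The sketch below is one possible approach, not the only one; the `Report` class and its `_cache` attribute are hypothetical. It also registers the new object in memo before copying attributes, which is what lets deepcopy() resolve references back to the object being copied.

```python
import copy


class Report:
    """Hypothetical class with a cache that should not be duplicated."""

    def __init__(self, rows):
        self.rows = rows
        self._cache = {}

    def __deepcopy__(self, memo):
        new_obj = Report.__new__(Report)
        # Register the copy first so any reference back to self reuses it.
        memo[id(self)] = new_obj
        new_obj.rows = copy.deepcopy(self.rows, memo)
        new_obj._cache = {}  # start empty rather than copying cached results
        return new_obj


report = Report([[1, 2], [3, 4]])
report._cache['total'] = 10

clone = copy.deepcopy(report)

print(clone.rows == report.rows)        # True
print(clone.rows[0] is report.rows[0])  # False
print(clone._cache)                     # {}
print(report._cache)                    # {'total': 10}
```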

Performance Benchmarks and Best Practices

Deep copying is significantly slower than shallow copying, especially with large nested structures:

import copy
import timeit

# Flat list
flat_data = list(range(10000))

# Nested list
nested_data = [[i] * 100 for i in range(100)]

# Benchmark
shallow_flat = timeit.timeit(lambda: flat_data.copy(), number=10000)
deep_flat = timeit.timeit(lambda: copy.deepcopy(flat_data), number=10000)

shallow_nested = timeit.timeit(lambda: nested_data.copy(), number=10000)
deep_nested = timeit.timeit(lambda: copy.deepcopy(nested_data), number=10000)

print(f"Flat - Shallow: {shallow_flat:.4f}s, Deep: {deep_flat:.4f}s")
print(f"Nested - Shallow: {shallow_nested:.4f}s, Deep: {deep_nested:.4f}s")

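# Sample output (illustrative only; absolute times and ratios
# depend on hardware and Python version, so run it yourself):
# Flat - Shallow: 0.0234s, Deep: 0.0456s
# Nested - Shallow: 0.0012s, Deep: 0.2341s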
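When the shape of the data is known in advance, a hand-written copy can stand in for deepcopy(). The sketch below assumes exactly one level of nesting with immutable elements inside, as in nested_data above. It is typically faster than deepcopy() because it skips the generic per-object dispatch, but that is worth measuring on your own data.

```python
import copy

nested_data = [[i] * 100 for i in range(100)]

# Assumption: one level of nesting, inner elements are immutable ints,
# so slicing each row is enough to make the copy independent.
manual = [row[:] for row in nested_data]
deep = copy.deepcopy(nested_data)

print(manual == deep)               # True
print(manual[0] is nested_data[0])  # False

manual[0][0] = -1
print(nested_data[0][0])  # 0
```

This shortcut stops being safe as soon as the rows contain mutable objects of their own, at which point deepcopy() or a deeper manual copy is needed.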

Best practices:

  1. Default to assignment unless you specifically need a copy
  2. Use shallow copy for simple data structures without nested mutables
  3. Reserve deep copy for complex state management, undo systems, or when complete independence is required
  4. Profile your code if copying becomes a bottleneck—sometimes restructuring data is better than copying
  5. Document your intent when using shallow copies with nested structures to avoid confusion

Choose the right copying strategy based on your data structure and requirements. Assignment is fastest but shares references. Shallow copy provides a middle ground. Deep copy guarantees independence at the cost of performance. Understanding these trade-offs makes you a more effective Python developer.
