Python - Merge Two Dictionaries

Key Insights

  • Python 3.9+ offers the merge operator (|) as the most concise way to combine dictionaries, while ** unpacking inside a dict literal works on Python 3.5 and later
  • Dictionary merge operations handle key conflicts by prioritizing values from the rightmost dictionary, making order critical when merging data with overlapping keys
  • For layered lookups without copying, the ChainMap class keeps each source dictionary separate; for custom conflict resolution or deep merging of nested structures, a short recursive function is usually enough

Basic Dictionary Merging Techniques

Python provides multiple approaches to merge dictionaries, each with distinct performance characteristics and use cases. The most straightforward method uses the update() method, which modifies the original dictionary in place:

user_defaults = {'theme': 'dark', 'language': 'en', 'notifications': True}
user_preferences = {'theme': 'light', 'timezone': 'UTC'}

user_defaults.update(user_preferences)
print(user_defaults)
# Output: {'theme': 'light', 'language': 'en', 'notifications': True, 'timezone': 'UTC'}

The update() method mutates the original dictionary, which may not be desirable when you need to preserve the original data structures. To merge without modifying either input, dictionary unpacking with the ** operator (Python 3.5+) creates a new dictionary:

config_defaults = {'timeout': 30, 'retries': 3, 'debug': False}
config_overrides = {'timeout': 60, 'log_level': 'INFO'}

merged_config = {**config_defaults, **config_overrides}
print(merged_config)
# Output: {'timeout': 60, 'retries': 3, 'debug': False, 'log_level': 'INFO'}
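To make the difference in side effects concrete, the following sketch merges the same pair of dictionaries both ways and checks which inputs changed. The variable names are illustrative only.

```python
defaults = {'timeout': 30, 'retries': 3}
overrides = {'timeout': 60}

# Unpacking builds a new dict; both inputs are left untouched.
merged = {**defaults, **overrides}
print(merged)    # {'timeout': 60, 'retries': 3}
print(defaults)  # {'timeout': 30, 'retries': 3}

# Copy-then-update gives the same result and also works before Python 3.5.
merged_via_copy = defaults.copy()
merged_via_copy.update(overrides)
print(merged_via_copy == merged)  # True

# update() on the original changes it in place and returns None.
returned = defaults.update(overrides)
print(returned)  # None
print(defaults)  # {'timeout': 60, 'retries': 3}
```

Because update() returns None, writing `merged = defaults.update(overrides)` is a common slip that leaves `merged` empty of any data.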

The Merge Operator (Python 3.9+)

Python 3.9 introduced the | operator for dictionary merging, providing cleaner syntax with the same semantics as unpacking:

database_config = {'host': 'localhost', 'port': 5432, 'pool_size': 10}
production_overrides = {'host': 'prod.db.example.com', 'ssl': True}

final_config = database_config | production_overrides
print(final_config)
# Output: {'host': 'prod.db.example.com', 'port': 5432, 'pool_size': 10, 'ssl': True}
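Since the right-hand operand wins on conflicts, swapping the operands changes the result. A small sketch with made-up values (requires Python 3.9+) shows both the value that wins and the resulting key order:

```python
a = {'host': 'localhost', 'port': 5432}
b = {'host': 'prod.db.example.com', 'ssl': True}

# Right operand wins; keys keep the position of their first appearance.
left_then_right = a | b
print(left_then_right)
# {'host': 'prod.db.example.com', 'port': 5432, 'ssl': True}

right_then_left = b | a
print(right_then_left)
# {'host': 'localhost', 'ssl': True, 'port': 5432}
```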

The augmented assignment operator |= provides an in-place alternative:

settings = {'api_version': 'v2', 'cache_enabled': False}
environment_vars = {'cache_enabled': True, 'rate_limit': 1000}

settings |= environment_vars
print(settings)
# Output: {'api_version': 'v2', 'cache_enabled': True, 'rate_limit': 1000}
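The two operators differ slightly in what they accept on the right-hand side. The sketch below (Python 3.9+, illustrative names) shows that | insists on a dict, while |= takes any iterable of key-value pairs, much like update():

```python
settings = {'api_version': 'v2'}
pairs = [('cache_enabled', True), ('rate_limit', 1000)]

# The binary operator only accepts dict operands.
binary_failed = False
try:
    settings | pairs
except TypeError:
    binary_failed = True
print(binary_failed)  # True

# The in-place form accepts an iterable of key-value pairs.
settings |= pairs
print(settings)
# {'api_version': 'v2', 'cache_enabled': True, 'rate_limit': 1000}

# |= mutates the existing object, so other references see the change.
alias = settings
settings |= {'api_version': 'v3'}
print(alias['api_version'])  # v3
```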

Merging Multiple Dictionaries

Real-world applications often require merging more than two dictionaries, such as combining default settings, environment-specific configs, and user preferences:

defaults = {'timeout': 30, 'retries': 3, 'compression': 'gzip'}
environment = {'timeout': 60, 'endpoint': 'https://api.example.com'}
user_settings = {'retries': 5, 'custom_header': 'X-Client-ID'}

# Using unpacking
final = {**defaults, **environment, **user_settings}
print(final)
# Output: {'timeout': 60, 'retries': 5, 'compression': 'gzip', 
#          'endpoint': 'https://api.example.com', 'custom_header': 'X-Client-ID'}

# Using merge operator (Python 3.9+)
final = defaults | environment | user_settings
print(final)
# Same output as above

For dynamic scenarios with an arbitrary number of dictionaries, use reduce() from the functools module. Because operator.or_ applies the | operator, this form also requires Python 3.9+:

from functools import reduce
import operator

config_layers = [
    {'service': 'api', 'version': '1.0'},
    {'version': '2.0', 'auth': 'bearer'},
    {'endpoint': '/v2/data', 'auth': 'oauth2'},
    {'timeout': 45}
]

# The {} start value keeps reduce() from raising TypeError on an empty list
merged = reduce(operator.or_, config_layers, {})
print(merged)
# Output: {'service': 'api', 'version': '2.0', 'auth': 'oauth2', 
#          'endpoint': '/v2/data', 'timeout': 45}
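reduce() with operator.or_ builds a new intermediate dictionary at every step, and without a start value it raises TypeError on an empty sequence. Where that matters, a plain loop is a reasonable alternative. The helper name `merge_all` below is hypothetical, not a standard-library function:

```python
def merge_all(dicts):
    """Merge an iterable of dicts left to right; later dicts win."""
    merged = {}
    for d in dicts:
        merged.update(d)  # one accumulator, no intermediate copies
    return merged

config_layers = [
    {'service': 'api', 'version': '1.0'},
    {'version': '2.0', 'auth': 'bearer'},
    {'timeout': 45},
]

combined = merge_all(config_layers)
print(combined)
# {'service': 'api', 'version': '2.0', 'auth': 'bearer', 'timeout': 45}

empty = merge_all([])
print(empty)  # {}
```

This version also runs on any Python 3 release, since it relies only on update().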

ChainMap for Layered Configuration

The collections.ChainMap class provides a memory-efficient approach for layered dictionaries without creating copies:

from collections import ChainMap

system_defaults = {'log_level': 'WARNING', 'max_connections': 100}
app_config = {'log_level': 'INFO', 'app_name': 'DataProcessor'}
runtime_overrides = {'max_connections': 200}

config = ChainMap(runtime_overrides, app_config, system_defaults)

print(config['log_level'])      # Output: INFO
print(config['max_connections']) # Output: 200
print(config['app_name'])        # Output: DataProcessor

# Convert to regular dict if needed
final_dict = dict(config)
print(final_dict)
# Output: {'log_level': 'INFO', 'max_connections': 200, 'app_name': 'DataProcessor'}
# (key order follows the last mapping first, as in a chain of update() calls)
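"Without creating copies" has a practical consequence: a ChainMap is a live view over its underlying dictionaries, and writes go to the first mapping only. A short sketch with illustrative names:

```python
from collections import ChainMap

defaults = {'log_level': 'WARNING'}
overrides = {}
config = ChainMap(overrides, defaults)

first_read = config['log_level']
print(first_read)  # WARNING

# Changing an underlying dict is visible through the ChainMap immediately.
defaults['log_level'] = 'ERROR'
print(config['log_level'])  # ERROR

# Writes through the ChainMap land in the first mapping only.
config['log_level'] = 'DEBUG'
print(overrides)  # {'log_level': 'DEBUG'}
print(defaults)   # {'log_level': 'ERROR'}

# A dict() snapshot is a copy and does not track later changes.
snapshot = dict(config)
del config['log_level']       # removes the key from overrides
print(config['log_level'])    # ERROR
print(snapshot['log_level'])  # DEBUG
```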

ChainMap searches through the dictionaries in order and returns the first match, making it ideal for configuration hierarchies where you want to maintain separate layers:

from collections import ChainMap

def get_config(user_id):
    global_settings = {'theme': 'light', 'language': 'en', 'timeout': 30}
    user_prefs = load_user_preferences(user_id)  # Hypothetical function
    session_data = {'timeout': 60, 'session_id': 'abc123'}
    
    return ChainMap(session_data, user_prefs, global_settings)

def load_user_preferences(user_id):
    return {'theme': 'dark', 'notifications': True}

config = get_config('user_42')
print(dict(config))
# Output: {'theme': 'dark', 'language': 'en', 'timeout': 60,
#          'notifications': True, 'session_id': 'abc123'}
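When a layer is only needed temporarily, for example per request, ChainMap.new_child() adds one in front without touching the existing layers. The following is a minimal sketch; the dictionary contents are made up for illustration:

```python
from collections import ChainMap

global_settings = {'theme': 'light', 'timeout': 30}
user_prefs = {'theme': 'dark'}
base = ChainMap(user_prefs, global_settings)

# new_child() returns a new ChainMap with an extra dict in front.
request_scope = base.new_child({'timeout': 5})
print(request_scope['timeout'])  # 5
print(request_scope['theme'])    # dark
print(base['timeout'])           # 30, the base chain is unaffected

# .maps lists the layers in lookup order; .parents skips the first one.
print(len(request_scope.maps))           # 3
print(request_scope.parents['timeout'])  # 30
```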

Deep Merging Nested Dictionaries

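Simple merge operations don't handle nested dictionaries recursively: when the same key holds a dictionary in both inputs, the right-hand dictionary replaces the left-hand one wholesale. When merging configuration objects with nested structures, you need a custom deep merge function: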

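def deep_merge(base, override):
    """Return a new dict with override recursively merged over base.

    Neither input is modified. Only the dict levels being merged are copied;
    nested values that are not merged stay shared with the inputs.
    """
    result = base.copy()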
    
    for key, value in override.items():
        if key in result and isinstance(result[key], dict) and isinstance(value, dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = value
    
    return result

base_config = {
    'database': {
        'host': 'localhost',
        'port': 5432,
        'credentials': {
            'user': 'admin',
            'password': 'default'
        }
    },
    'cache': {
        'enabled': True
    }
}

override_config = {
    'database': {
        'host': 'prod.example.com',
        'credentials': {
            'password': 'secure_password'
        }
    },
    'cache': {
        'ttl': 3600
    }
}

merged = deep_merge(base_config, override_config)
print(merged)
# Output: {
#     'database': {
#         'host': 'prod.example.com',
#         'port': 5432,
#         'credentials': {'user': 'admin', 'password': 'secure_password'}
#     },
#     'cache': {'enabled': True, 'ttl': 3600}
# }
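Two details are worth checking in practice: what a shallow merge loses, and whether the merged result shares nested objects with its inputs. Because deep_merge above starts from a shallow base.copy(), a nested dictionary that the override never mentions is the same object in both base and the result. The variant below is a sketch under the assumption that full isolation is wanted; the name `deep_merge_copy` is hypothetical, and copy.deepcopy trades some speed for that isolation:

```python
import copy

def deep_merge_copy(base, override):
    """Deep merge whose result shares no nested objects with its inputs."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(result.get(key), dict) and isinstance(value, dict):
            result[key] = deep_merge_copy(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result

base = {'database': {'host': 'localhost', 'port': 5432}, 'cache': {'enabled': True}}
override = {'database': {'host': 'prod.example.com'}}

# A shallow merge replaces the whole nested dict, so 'port' is lost.
shallow = {**base, **override}
print(shallow['database'])  # {'host': 'prod.example.com'}

# The recursive merge keeps keys that the override does not mention.
merged = deep_merge_copy(base, override)
print(merged['database'])  # {'host': 'prod.example.com', 'port': 5432}

# Mutating the result does not leak back into the inputs.
merged['cache']['enabled'] = False
print(base['cache'])  # {'enabled': True}
```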

Custom Merge Strategies

For production applications requiring specific conflict resolution logic, implement custom merge functions:

def merge_with_strategy(dict1, dict2, strategy='override'):
    """
    Merge dictionaries with configurable conflict resolution.
    
    Strategies:
    - override: dict2 values take precedence (default)
    - keep: dict1 values take precedence
    - combine: combine values into lists
    """
    result = dict1.copy()
    
    for key, value in dict2.items():
        if key not in result:
            result[key] = value
        elif strategy == 'override':
            result[key] = value
        elif strategy == 'keep':
            pass  # Keep existing value
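        elif strategy == 'combine':
            if isinstance(result[key], list):
                # Build a new list so the list stored in dict1 is not mutated
                result[key] = result[key] + [value]
            else:
                result[key] = [result[key], value]
        else:
            raise ValueError(f"Unknown strategy: {strategy!r}")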
    
    return result

metrics_a = {'requests': 1000, 'errors': 5, 'endpoint': '/api/v1'}
metrics_b = {'requests': 1500, 'latency': 250, 'endpoint': '/api/v2'}

print(merge_with_strategy(metrics_a, metrics_b, 'override'))
# Output: {'requests': 1500, 'errors': 5, 'endpoint': '/api/v2', 'latency': 250}

print(merge_with_strategy(metrics_a, metrics_b, 'keep'))
# Output: {'requests': 1000, 'errors': 5, 'endpoint': '/api/v1', 'latency': 250}

print(merge_with_strategy(metrics_a, metrics_b, 'combine'))
# Output: {'requests': [1000, 1500], 'errors': 5, 'endpoint': ['/api/v1', '/api/v2'], 
#          'latency': 250}
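A fixed set of strategy names can become limiting. One alternative, sketched here with a hypothetical helper called `merge_with`, is to pass the conflict resolver in as a function that receives the old and new values:

```python
import operator

def merge_with(resolve, dict1, dict2):
    """Merge two dicts, calling resolve(old, new) for keys present in both."""
    result = dict1.copy()
    for key, value in dict2.items():
        result[key] = resolve(result[key], value) if key in result else value
    return result

metrics_a = {'requests': 1000, 'errors': 5}
metrics_b = {'requests': 1500, 'latency': 250}

# Sum the counters that appear in both inputs.
summed = merge_with(operator.add, metrics_a, metrics_b)
print(summed)
# {'requests': 2500, 'errors': 5, 'latency': 250}

# Keep the larger value instead.
largest = merge_with(max, metrics_a, metrics_b)
print(largest)
# {'requests': 1500, 'errors': 5, 'latency': 250}
```

Any two-argument callable works as the resolver, so the override and keep behaviours reduce to `lambda old, new: new` and `lambda old, new: old`.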

Performance Considerations

For performance-critical applications, benchmark different merge approaches:

import timeit

setup = """
dict1 = {f'key_{i}': i for i in range(1000)}
dict2 = {f'key_{i}': i * 2 for i in range(500, 1500)}
"""

print("update():", timeit.timeit('d = dict1.copy(); d.update(dict2)', setup=setup, number=10000))
print("unpacking:", timeit.timeit('d = {**dict1, **dict2}', setup=setup, number=10000))
print("merge operator:", timeit.timeit('d = dict1 | dict2', setup=setup, number=10000))

# Typical results (times vary by system):
# update(): ~0.15s
# unpacking: ~0.16s  
# merge operator: ~0.16s

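In the figures above, the | operator and ** unpacking perform about the same and update() on a copy is marginally faster, but a gap this small is within normal run-to-run variation and may reverse on another machine or Python version. Choose based on readability and whether you need to preserve the original dictionary.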
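Single timeit runs are sensitive to whatever else the machine is doing. A slightly more robust sketch uses timeit.repeat and keeps the fastest run of each candidate; the repeat and number values here are arbitrary choices, and no expected timings are shown because they depend on the system:

```python
import sys
import timeit

dict1 = {f'key_{i}': i for i in range(1000)}
dict2 = {f'key_{i}': i * 2 for i in range(500, 1500)}

candidates = {
    'update()': 'd = dict1.copy(); d.update(dict2)',
    'unpacking': 'd = {**dict1, **dict2}',
}
# The | operator only exists for dicts on Python 3.9 and later.
if sys.version_info >= (3, 9):
    candidates['merge operator'] = 'd = dict1 | dict2'

number = 1000
results = {}
for name, stmt in candidates.items():
    # repeat() returns one total per run; the minimum is the least noisy.
    runs = timeit.repeat(stmt, globals={'dict1': dict1, 'dict2': dict2},
                         repeat=5, number=number)
    results[name] = min(runs) / number  # seconds per merge

for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f'{name}: {seconds * 1e6:.1f} us per merge')
```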
