Python - Merge/Combine Two Lists
Key Insights
- Python offers multiple methods to merge lists, including the + operator, extend(), unpacking, and itertools.chain(), each with distinct performance characteristics and use cases
- List concatenation creates new objects while extend() modifies in-place, making the choice critical for memory management in large-scale applications
- Understanding shallow vs deep copying behavior when merging nested lists prevents subtle bugs in production code
Basic Concatenation with the + Operator
The plus operator creates a new list by combining elements from both source lists. This approach is intuitive and commonly used for simple merging operations.
list1 = [1, 2, 3]
list2 = [4, 5, 6]
merged = list1 + list2
print(merged) # [1, 2, 3, 4, 5, 6]
# Original lists remain unchanged
print(list1) # [1, 2, 3]
print(list2) # [4, 5, 6]
The + operator works with multiple lists:
list1 = ['a', 'b']
list2 = ['c', 'd']
list3 = ['e', 'f']
result = list1 + list2 + list3
print(result) # ['a', 'b', 'c', 'd', 'e', 'f']
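Performance consideration: Each + operation creates a new list and copies every element from both operands. In an expression such as a + b + c, the intermediate result of a + b is built and then copied again, so accumulating many lists with repeated + does work that grows roughly quadratically with the total number of elements. For merging many lists, extend() or itertools.chain() is the better choice.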
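As a small illustration of the allocation behavior described above, the sketch below contrasts + (which builds a new list and rebinds the name) with += (which extends the existing list in place). The variable names are illustrative only.

```python
a = [1, 2, 3]
b = [4, 5, 6]

# Keep a second reference to the original list so the effect is visible
alias_a = a
a = a + b              # builds a new list and rebinds the name a
print(a is alias_a)    # False
print(alias_a)         # [1, 2, 3]

c = [1, 2, 3]
alias_c = c
c += b                 # extends the existing list in place
print(c is alias_c)    # True
print(alias_c)         # [1, 2, 3, 4, 5, 6]
```

The in-place form avoids the extra allocation, but every other reference to the same list sees the change.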
In-Place Modification with extend()
The extend() method modifies the original list by appending all elements from another iterable. This is more memory-efficient when you don’t need to preserve the original list.
list1 = [1, 2, 3]
list2 = [4, 5, 6]
list1.extend(list2)
print(list1) # [1, 2, 3, 4, 5, 6]
print(list2) # [4, 5, 6] - unchanged
Because extend() returns None, calls cannot be chained in a single expression; merging several lists takes one call per list:
list1 = [1, 2, 3]
list2 = [4, 5, 6]
list3 = [7, 8, 9]
list1.extend(list2)
list1.extend(list3)
print(list1) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
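A common slip is assigning the result of extend() to a variable. The sketch below shows that the call returns None, and uses a loop to extend from a collection of lists; the name others is just a placeholder for this example.

```python
list1 = [1, 2, 3]
others = [[4, 5, 6], [7, 8, 9]]

# extend() mutates list1 and returns None, so the result is not a list
result = list1.extend(others[0])
print(result)  # None

# One call per remaining list, driven by a loop
for other in others[1:]:
    list1.extend(other)

print(list1)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```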
The extend() method accepts any iterable, making it versatile:
numbers = [1, 2, 3]
numbers.extend(range(4, 7))
numbers.extend({7, 8, 9})
numbers.extend((10, 11))
print(numbers) # e.g. [1, 2, 3, 4, 5, 6, 8, 9, 7, 10, 11] - sets are unordered, so the position of 7, 8, 9 is not guaranteed
Unpacking with the * Operator
Python’s unpacking operator provides a clean syntax for merging multiple lists in a single expression. It fits best when the lists to combine are known up front, since each one is written out in the list literal.
list1 = [1, 2, 3]
list2 = [4, 5, 6]
list3 = [7, 8, 9]
merged = [*list1, *list2, *list3]
print(merged) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
You can intersperse additional elements during unpacking:
list1 = ['a', 'b']
list2 = ['e', 'f']
result = [*list1, 'c', 'd', *list2]
print(result) # ['a', 'b', 'c', 'd', 'e', 'f']
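Unpacking inside a list literal is not limited to lists; it accepts any iterable. A minimal sketch with a tuple, a range, and a generator expression (all names here are illustrative):

```python
letters = ('a', 'b')                      # tuple
numbers = range(3)                        # range object
squares = (n * n for n in range(3))       # generator expression

# Each iterable is consumed in order and its items placed in the new list
merged = [*letters, *numbers, *squares]
print(merged)  # ['a', 'b', 0, 1, 2, 0, 1, 4]
```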
When the lists arrive in a collection of arbitrary length, a literal with * cannot name each one, so flatten the collection instead:
lists = [[1, 2], [3, 4], [5, 6]]
merged = [item for sublist in lists for item in sublist]
print(merged) # [1, 2, 3, 4, 5, 6]
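# Or using sum() with an empty list as the start value
# (concise, but it re-copies the accumulated list at every step, so avoid it for many lists)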
merged = sum(lists, [])
print(merged) # [1, 2, 3, 4, 5, 6]
Using itertools.chain() for Efficient Iteration
The itertools.chain() function returns an iterator that yields the elements of each iterable in turn without building an intermediate list. This keeps memory usage low with large datasets, provided you consume the result lazily rather than converting it to a list.
from itertools import chain
list1 = [1, 2, 3]
list2 = [4, 5, 6]
list3 = [7, 8, 9]
# Returns an iterator
chained = chain(list1, list2, list3)
print(list(chained)) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
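One caveat worth illustrating: the object returned by chain() is a one-shot iterator. Once it has been consumed, a second pass yields nothing, as this short sketch shows.

```python
from itertools import chain

chained = chain([1, 2], [3, 4])

first_pass = list(chained)    # consumes the iterator
second_pass = list(chained)   # nothing left to yield

print(first_pass)   # [1, 2, 3, 4]
print(second_pass)  # []
```

If the merged data is needed more than once, convert it to a list a single time and reuse that list, or call chain() again.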
For dynamic list collections, use chain.from_iterable():
from itertools import chain
lists = [[1, 2], [3, 4], [5, 6], [7, 8]]
merged = list(chain.from_iterable(lists))
print(merged) # [1, 2, 3, 4, 5, 6, 7, 8]
This approach shines when processing large datasets where you don’t need all elements in memory simultaneously:
from itertools import chain
def process_large_datasets():
    dataset1 = range(1000000)
    dataset2 = range(1000000, 2000000)
    dataset3 = range(2000000, 3000000)
    # Memory efficient - no intermediate list created
    for item in chain(dataset1, dataset2, dataset3):
        if item % 100000 == 0:
            print(f"Processing: {item}")
process_large_datasets()
Handling Nested Lists and Deep Copying
When merging lists containing mutable objects, understand the difference between shallow and deep copying to avoid unexpected behavior.
list1 = [[1, 2], [3, 4]]
list2 = [[5, 6], [7, 8]]
# + copies only the outer list - the nested lists are shared references
merged = list1 + list2
merged[0][0] = 999
print(list1) # [[999, 2], [3, 4]] - original modified!
For true independence, use copy.deepcopy():
import copy
list1 = [[1, 2], [3, 4]]
list2 = [[5, 6], [7, 8]]
merged = copy.deepcopy(list1) + copy.deepcopy(list2)
merged[0][0] = 999
print(list1) # [[1, 2], [3, 4]] - original unchanged
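When the nesting is exactly one level deep, copying each inner list during the merge is a lighter alternative to copy.deepcopy(). This is a sketch under that assumption; deeper structures would still share references and need deepcopy().

```python
list1 = [[1, 2], [3, 4]]
list2 = [[5, 6], [7, 8]]

# Copy each inner list while merging; sufficient only for one level of nesting
merged = [inner.copy() for inner in list1 + list2]

merged[0][0] = 999
print(merged[0])  # [999, 2]
print(list1)      # [[1, 2], [3, 4]] - original unchanged
```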
Performance Comparison
Here’s a practical benchmark comparing different merging methods:
import timeit
from itertools import chain
# Setup
setup = """
list1 = list(range(10000))
list2 = list(range(10000, 20000))
"""
# Test concatenation
concat_time = timeit.timeit('merged = list1 + list2', setup=setup, number=10000)
print(f"Concatenation: {concat_time:.4f}s")
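# Test extend (list1 is copied first so the setup list does not grow on every run;
# the cost of that copy is included in the measured time)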
extend_time = timeit.timeit('l = list1.copy(); l.extend(list2)', setup=setup, number=10000)
print(f"Extend: {extend_time:.4f}s")
# Test unpacking
unpack_time = timeit.timeit('merged = [*list1, *list2]', setup=setup, number=10000)
print(f"Unpacking: {unpack_time:.4f}s")
# Test chain
chain_time = timeit.timeit('merged = list(chain(list1, list2))',
                           setup=setup + '\nfrom itertools import chain',
                           number=10000)
print(f"Chain: {chain_time:.4f}s")
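Single timing runs can be noisy. One way to get steadier numbers, sketched here, is timeit.repeat() with the minimum of several runs taken as the representative value. The list sizes, repeat counts, and the statements dictionary are arbitrary choices for this illustration, and absolute timings vary by machine and Python version.

```python
import timeit

setup = "list1 = list(range(1000)); list2 = list(range(1000, 2000))"

# Statements to compare; keys are labels used only for printing
statements = {
    "concat": "list1 + list2",
    "unpack": "[*list1, *list2]",
}

results = {}
for name, stmt in statements.items():
    # Five independent runs of 1000 executions each
    timings = timeit.repeat(stmt, setup=setup, repeat=5, number=1000)
    results[name] = min(timings)  # the fastest run is least affected by noise
    print(f"{name}: {results[name]:.4f}s (best of 5)")
```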
Practical Application: Merging Configuration Data
The following example merges the settings lists from multiple configuration sources:
def merge_configs(*config_sources):
    """Merge multiple configuration dictionaries into a single list of settings."""
    all_settings = []
    for config in config_sources:
        # Sources without a 'settings' key contribute nothing
        all_settings.extend(config.get('settings', []))
    return all_settings
# Configuration sources
default_config = {
    'settings': ['debug=false', 'timeout=30']
}
user_config = {
    'settings': ['theme=dark', 'language=en']
}
env_config = {
    'settings': ['api_key=xyz123', 'region=us-east']
}
merged = merge_configs(default_config, user_config, env_config)
print(merged)
# ['debug=false', 'timeout=30', 'theme=dark', 'language=en', 'api_key=xyz123', 'region=us-east']
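The same merge can also be written with chain.from_iterable(). The function name merge_configs_chain below is hypothetical, chosen only to distinguish this sketch from the version above, and the sample dictionaries are shortened for the example.

```python
from itertools import chain

def merge_configs_chain(*config_sources):
    """Hypothetical variant: flatten each source's 'settings' list lazily."""
    return list(chain.from_iterable(
        config.get('settings', []) for config in config_sources
    ))

default_config = {'settings': ['debug=false', 'timeout=30']}
user_config = {}  # a source without 'settings' contributes nothing
env_config = {'settings': ['region=us-east']}

merged = merge_configs_chain(default_config, user_config, env_config)
print(merged)  # ['debug=false', 'timeout=30', 'region=us-east']
```

Dropping the list() call would return the iterator itself, which suits callers that only loop over the settings once.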
Choose + for readability with small lists, extend() for in-place modification, unpacking for multiple lists with clean syntax, and itertools.chain() for memory-efficient processing of large datasets. Each method has its place in production code.