Python - zip() Function with Examples
Key Insights
- The zip() function combines multiple iterables element-wise into tuples, enabling clean parallel iteration without manual index tracking
- It returns a lazy iterator that stops at the shortest iterable by default; use itertools.zip_longest() when you need to preserve all elements
- Combined with the * operator, zip() can unzip data back into separate sequences, making it invaluable for data transformation tasks
Introduction to zip()
Python’s zip() function is one of those built-in tools that seems simple on the surface but becomes indispensable once you understand its power. At its core, zip() takes multiple iterables and combines them element-wise into tuples. The first elements go together, the second elements go together, and so on.
You’ll reach for zip() constantly in real-world code: iterating over multiple lists in parallel, creating dictionaries from separate key and value lists, transposing rows and columns, and restructuring data without writing nested loops. It’s the Pythonic answer to the age-old question of “how do I loop through two lists at the same time?”
If you’ve ever written code like for i in range(len(list1)): just to access list1[i] and list2[i] together, zip() is about to change your life.
Basic Syntax and Parameters
The function signature is straightforward:
zip(*iterables)
The *iterables parameter means zip() accepts any number of iterables—zero, one, two, or more. It returns a zip object, which is an iterator that yields tuples containing elements from each iterable.
Here’s the basic behavior:
names = ['Alice', 'Bob', 'Charlie']
ages = [25, 30, 35]
# zip() returns a zip object (iterator)
zipped = zip(names, ages)
print(zipped) # <zip object at 0x...>
print(type(zipped)) # <class 'zip'>
# Convert to list to see the contents
result = list(zip(names, ages))
print(result) # [('Alice', 25), ('Bob', 30), ('Charlie', 35)]
A few things to note here. The zip object is an iterator, not a list. You can iterate over it once, but it’s exhausted after that. If you need to use the zipped data multiple times, convert it to a list or tuple first.
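As a small illustration of the one-pass behavior described above, the sketch below consumes the same zip object twice. The variable names first_pass and second_pass are made up for this example.

```python
names = ['Alice', 'Bob', 'Charlie']
ages = [25, 30, 35]

pairs = zip(names, ages)

first_pass = list(pairs)   # consumes the iterator
second_pass = list(pairs)  # nothing left to yield

print(first_pass)   # [('Alice', 25), ('Bob', 30), ('Charlie', 35)]
print(second_pass)  # []

# Materialize once if the pairs are needed more than once
reusable = list(zip(names, ages))
for _ in range(2):
    print(len(reusable))  # 3 on both passes
```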
You can also zip more than two iterables:
names = ['Alice', 'Bob', 'Charlie']
ages = [25, 30, 35]
cities = ['NYC', 'LA', 'Chicago']
combined = list(zip(names, ages, cities))
print(combined)
# [('Alice', 25, 'NYC'), ('Bob', 30, 'LA'), ('Charlie', 35, 'Chicago')]
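Since zip() accepts any number of iterables, it may help to see the less common cases of zero and one argument, plus a mix of iterable types. This is only a quick sketch with made-up variable names:

```python
# No arguments: an empty iterator
no_args = list(zip())
print(no_args)  # []

# One iterable: each element is wrapped in a 1-tuple
single = list(zip(['a', 'b', 'c']))
print(single)  # [('a',), ('b',), ('c',)]

# Any iterable works, not only lists (here a string and a range)
mixed = list(zip('abc', range(3)))
print(mixed)  # [('a', 0), ('b', 1), ('c', 2)]
```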
Iterating Over Multiple Sequences
The most common use case for zip() is parallel iteration. Instead of tracking indices manually, you iterate directly over the paired elements:
students = ['Alice', 'Bob', 'Charlie', 'Diana']
scores = [85, 92, 78, 95]
# The clean way
for student, score in zip(students, scores):
    print(f"{student}: {score}")
# Output:
# Alice: 85
# Bob: 92
# Charlie: 78
# Diana: 95
Compare this to the index-based approach:
# The ugly way - don't do this
for i in range(len(students)):
    print(f"{students[i]}: {scores[i]}")
The zip() version is more readable, less error-prone, and follows Python’s philosophy of iterating over items rather than indices. The two approaches also differ when the lists have different lengths: zip() stops at the shorter list, whereas the index-based loop raises an IndexError if the list passed to range(len(...)) is the longer one.
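To make the length-mismatch difference concrete, here is a minimal sketch that runs both approaches on lists of unequal length. The data and the index_error_raised flag are invented for the example.

```python
students = ['Alice', 'Bob', 'Charlie', 'Diana']
scores = [85, 92, 78]  # one score missing

# Index-based loop driven by the longer list: fails on the last index
index_error_raised = False
try:
    for i in range(len(students)):
        _ = (students[i], scores[i])
except IndexError:
    index_error_raised = True

# zip() stops after the shortest input instead of raising
paired = list(zip(students, scores))

print(index_error_raised)  # True
print(paired)  # [('Alice', 85), ('Bob', 92), ('Charlie', 78)]
```

Neither outcome is automatically right: the exception is loud, while the truncation is silent.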
You can zip three or more sequences just as easily:
students = ['Alice', 'Bob', 'Charlie']
midterm_scores = [85, 92, 78]
final_scores = [88, 90, 82]
for student, midterm, final in zip(students, midterm_scores, final_scores):
    average = (midterm + final) / 2
    print(f"{student}: Midterm={midterm}, Final={final}, Average={average}")
Handling Unequal Length Iterables
By default, zip() stops when the shortest iterable is exhausted. This is a deliberate design choice that prevents index errors, but it can silently drop data if you’re not careful:
names = ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve']
scores = [85, 92, 78] # Only 3 scores
result = list(zip(names, scores))
print(result) # [('Alice', 85), ('Bob', 92), ('Charlie', 78)]
# Diana and Eve are silently dropped!
In Python 3.10+, you can use the strict parameter to raise an error if lengths don’t match:
# Python 3.10+
try:
    result = list(zip(names, scores, strict=True))
except ValueError as e:
    print(f"Error: {e}")  # zip() argument 2 is shorter than argument 1
When you need to preserve all elements from unequal iterables, use itertools.zip_longest():
from itertools import zip_longest
names = ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve']
scores = [85, 92, 78]
# Default fill value is None
result = list(zip_longest(names, scores))
print(result)
# [('Alice', 85), ('Bob', 92), ('Charlie', 78), ('Diana', None), ('Eve', None)]
# Custom fill value
result = list(zip_longest(names, scores, fillvalue=0))
print(result)
# [('Alice', 85), ('Bob', 92), ('Charlie', 78), ('Diana', 0), ('Eve', 0)]
Choose the right tool for your situation: zip() when truncation is acceptable or expected, zip(..., strict=True) when lengths must match, and zip_longest() when you need all elements preserved.
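On Python versions older than 3.10, where strict=True is unavailable, one possible way to get a similar check is to combine zip_longest() with a sentinel fill value. The helper name zip_strict and its error message are hypothetical, and this is a sketch rather than a drop-in replacement for the built-in behavior.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel that cannot collide with real data


def zip_strict(*iterables):
    """Yield tuples like zip(), but raise ValueError on a length mismatch.

    Hypothetical helper for illustration; on Python 3.10+ prefer
    zip(..., strict=True).
    """
    for combo in zip_longest(*iterables, fillvalue=_MISSING):
        if any(item is _MISSING for item in combo):
            raise ValueError("iterables have different lengths")
        yield combo


equal = list(zip_strict([1, 2], ['a', 'b']))
print(equal)  # [(1, 'a'), (2, 'b')]

mismatch_detected = False
try:
    list(zip_strict([1, 2, 3], ['a', 'b']))
except ValueError as exc:
    mismatch_detected = True
    print(f"Error: {exc}")  # Error: iterables have different lengths
```

Because the helper is a generator, the mismatch is only detected once iteration reaches the point where one input runs out.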
Practical Applications
Creating Dictionaries from Two Lists
One of the most elegant uses of zip() is constructing dictionaries:
keys = ['name', 'age', 'city', 'occupation']
values = ['Alice', 30, 'New York', 'Engineer']
person = dict(zip(keys, values))
print(person)
# {'name': 'Alice', 'age': 30, 'city': 'New York', 'occupation': 'Engineer'}
This pattern is invaluable when working with CSV data, API responses, or any situation where you have parallel lists of keys and values.
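For the CSV case, a rough sketch might pair a header row with each data row. The in-memory CSV text below is invented for the example, and the standard library’s csv.DictReader already offers this behavior out of the box.

```python
import csv
import io

# Stand-in for a CSV file; the column names and rows are made up
raw = io.StringIO("name,age,city\nAlice,30,New York\nBob,25,Chicago\n")

reader = csv.reader(raw)
header = next(reader)  # ['name', 'age', 'city']

# Pair each row's values with the header to get one dict per row
records = [dict(zip(header, row)) for row in reader]

print(records[0])  # {'name': 'Alice', 'age': '30', 'city': 'New York'}
print(records[1])  # {'name': 'Bob', 'age': '25', 'city': 'Chicago'}
```

Note that csv.reader yields every field as a string, so numeric columns such as age would still need converting.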
Transposing Matrices and 2D Data
The combination of zip() with the unpacking operator * lets you transpose rows and columns:
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# Transpose: rows become columns
transposed = list(zip(*matrix))
print(transposed)
# [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
# Convert inner tuples to lists if needed
transposed_lists = [list(row) for row in zip(*matrix)]
print(transposed_lists)
# [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
The *matrix unpacks the outer list, passing each row as a separate argument to zip(). This is a concise alternative to nested loops.
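The following sketch spells out what the unpacking does by comparing zip(*matrix) with the equivalent explicit call. It also shows one caveat worth keeping in mind: rows of unequal length are truncated, as with any other zip() call. The ragged data is made up for illustration.

```python
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# zip(*matrix) is the same call as passing each row explicitly
explicit = list(zip(matrix[0], matrix[1], matrix[2]))
unpacked = list(zip(*matrix))
print(explicit == unpacked)  # True

# Rows of unequal length are truncated to the shortest row
ragged = [[1, 2, 3], [4, 5], [6, 7, 8]]
ragged_transposed = list(zip(*ragged))
print(ragged_transposed)  # [(1, 4, 6), (2, 5, 7)]
```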
Unzipping with the * Operator
You can reverse the zipping process using the same * unpacking trick:
pairs = [('Alice', 85), ('Bob', 92), ('Charlie', 78)]
# Unzip into separate tuples
names, scores = zip(*pairs)
print(names) # ('Alice', 'Bob', 'Charlie')
print(scores) # (85, 92, 78)
# Convert to lists if needed
names_list = list(names)
scores_list = list(scores)
This pattern is useful when you receive paired data but need to process each dimension separately.
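One edge case to watch for is empty input: zip(*[]) yields nothing, so unpacking into two names raises a ValueError. The helper below, unzip_pairs, is a hypothetical guard that assumes the input is a sequence (such as a list) of 2-tuples.

```python
def unzip_pairs(pairs):
    """Hypothetical helper: split a list of 2-tuples into two tuples."""
    if not pairs:
        return (), ()  # nothing to unzip; avoid the unpacking error
    first, second = zip(*pairs)
    return first, second


names, scores = unzip_pairs([('Alice', 85), ('Bob', 92)])
print(names)   # ('Alice', 'Bob')
print(scores)  # (85, 92)

empty_names, empty_scores = unzip_pairs([])
print(empty_names, empty_scores)  # () ()

# Without the guard, unpacking an empty zip fails
unguarded_failed = False
try:
    a, b = zip(*[])
except ValueError:
    unguarded_failed = True
print(unguarded_failed)  # True
```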
Performance Considerations
The zip() function returns an iterator, which means it generates tuples on demand rather than creating them all upfront. This lazy evaluation makes zip() memory-efficient when working with large datasets:
import sys
# Two source lists of 10,000 integers each
list1 = list(range(10000))
list2 = list(range(10000))
# zip object is tiny - just stores references to iterables
zipped = zip(list1, list2)
print(f"zip object size: {sys.getsizeof(zipped)} bytes")
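# Converting to a list materializes all 10,000 tuples at once.
# Note: sys.getsizeof() reports only the list's own pointer array,
# not the tuples it references, so the real footprint is larger.
zipped_list = list(zip(list1, list2))
print(f"list of tuples size: {sys.getsizeof(zipped_list)} bytes")
# Output (approximate; varies by Python version and platform):
# zip object size: 64 bytes
# list of tuples size: 80056 bytes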
When processing large files or streams, iterate directly over the zip object instead of converting to a list:
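# names_generator, scores_generator, and process() are placeholders
# for your own iterables and per-item handler.

# Memory efficient - processes one pair at a time
for name, score in zip(names_generator, scores_generator):
    process(name, score)

# Memory inefficient - loads everything into memory
# (and leaves nothing for a later pass: the first loop already
# exhausted the generators)
for name, score in list(zip(names_generator, scores_generator)):
    process(name, score)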
Remember that a zip object can only be iterated once. If you need multiple passes through the data, either convert to a list or recreate the zip object.
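Laziness also means zip() can work with inputs that never end. The sketch below pairs two infinite iterators and takes only the first few results with itertools.islice. The squares() generator is invented for the example.

```python
from itertools import count, islice


def squares():
    """Infinite generator of square numbers (illustrative only)."""
    n = 0
    while True:
        yield n * n
        n += 1


# Both inputs are infinite, so list(zip(...)) would never finish.
# Because zip() is lazy, islice() can take just the first few pairs.
first_five = list(islice(zip(count(), squares()), 5))
print(first_five)  # [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16)]

# A finite iterable bounds the whole zip
labels = ['a', 'b', 'c']
numbered = list(zip(count(1), labels))
print(numbered)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```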
Summary
The zip() function is a fundamental tool for working with multiple sequences in Python. Here’s what you need to remember:
- It combines iterables element-wise into tuples
- It returns a lazy iterator, making it memory-efficient
- It stops at the shortest iterable by default
- Use strict=True (Python 3.10+) to enforce equal lengths
- Use itertools.zip_longest() to preserve all elements
- Combined with *, it can transpose data and unzip sequences
Quick reference for common patterns:
# Parallel iteration
for a, b in zip(list1, list2): ...
# Create dictionary
dict(zip(keys, values))
# Transpose matrix
list(zip(*matrix))
# Unzip pairs
names, values = zip(*pairs)
# Handle unequal lengths
from itertools import zip_longest
list(zip_longest(a, b, fillvalue=0))
Master these patterns and you’ll write cleaner, more Pythonic code when working with parallel data structures.