Python - Sort List of Tuples
Key Insights
- Python provides multiple sorting methods for tuple lists: the sorted() function, the list.sort() method, and itemgetter() from the operator module for optimal performance
- Tuples sort lexicographically by default (element-by-element), but you can customize sorting using the key parameter and lambda functions to target specific indices
- For complex sorting scenarios, combine multiple sort keys using tuples in lambda functions or chain multiple sorts in reverse order for stable sort behavior
Default Sorting Behavior
Python sorts lists of tuples lexicographically by default. The comparison starts with the first element of each tuple, then moves to subsequent elements if the first ones are equal.
data = [(3, 'apple'), (1, 'banana'), (2, 'cherry'), (1, 'apricot')]
sorted_data = sorted(data)
print(sorted_data)
# Output: [(1, 'apricot'), (1, 'banana'), (2, 'cherry'), (3, 'apple')]
The sorted() function returns a new list, leaving the original unchanged. For in-place sorting, use the sort() method:
data = [(3, 'apple'), (1, 'banana'), (2, 'cherry')]
data.sort()
print(data)
# Output: [(1, 'banana'), (2, 'cherry'), (3, 'apple')]
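The element-by-element comparison described above can be observed directly, because sorting relies on the same < comparison that tuples already support. The following is a minimal sketch; the variable names are illustrative only.

```python
# Tuples compare element by element, left to right.
first_differs = (1, 'banana') < (2, 'apple')    # decided by 1 < 2
tie_on_first = (1, 'apricot') < (1, 'banana')   # first elements tie, so the strings decide
prefix_shorter = (1,) < (1, 'apple')            # a tuple that is a prefix of another sorts first

print(first_differs, tie_on_first, prefix_shorter)

# Values of unrelated types at the same position cannot be ordered in Python 3.
try:
    sorted([(1, 'a'), ('1', 'b')])
except TypeError as exc:
    print("TypeError:", exc)
```

If the tuples in a list mix types at the same position, supply a key function that normalizes the values before comparison.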
Sorting by Specific Tuple Index
Use lambda functions with the key parameter to sort by a specific tuple element:
# Sort by second element (index 1)
products = [('laptop', 1200), ('mouse', 25), ('keyboard', 80), ('monitor', 300)]
sorted_by_price = sorted(products, key=lambda x: x[1])
print(sorted_by_price)
# Output: [('mouse', 25), ('keyboard', 80), ('monitor', 300), ('laptop', 1200)]
# Sort by first element
sorted_by_name = sorted(products, key=lambda x: x[0])
print(sorted_by_name)
# Output: [('keyboard', 80), ('laptop', 1200), ('monitor', 300), ('mouse', 25)]
For better performance with large datasets, use operator.itemgetter():
from operator import itemgetter
products = [('laptop', 1200), ('mouse', 25), ('keyboard', 80)]
sorted_by_price = sorted(products, key=itemgetter(1))
print(sorted_by_price)
# Output: [('mouse', 25), ('keyboard', 80), ('laptop', 1200)]
The itemgetter() approach is typically faster than an equivalent lambda because it is implemented in C and avoids the overhead of a Python-level function call for each element.
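itemgetter() also accepts several indices, in which case it returns a tuple of the selected elements. That makes it usable as a multi-field key without a lambda. A short sketch, assuming a hypothetical three-field product tuple (name, category, price) that is not part of the examples above:

```python
from operator import itemgetter

# Hypothetical sample data: (name, category, price)
products = [
    ('laptop', 'computers', 1200),
    ('mouse', 'accessories', 25),
    ('keyboard', 'accessories', 80),
    ('monitor', 'computers', 300),
]

# itemgetter(1, 2) returns the tuple (item[1], item[2]) for each item,
# so this sorts by category first and by price within each category.
by_category_then_price = sorted(products, key=itemgetter(1, 2))
print(by_category_then_price)
# Output: [('mouse', 'accessories', 25), ('keyboard', 'accessories', 80),
#          ('monitor', 'computers', 300), ('laptop', 'computers', 1200)]
```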
Reverse Sorting
Add reverse=True to sort in descending order:
sales = [('Q1', 50000), ('Q2', 75000), ('Q3', 60000), ('Q4', 90000)]
sorted_desc = sorted(sales, key=lambda x: x[1], reverse=True)
print(sorted_desc)
# Output: [('Q4', 90000), ('Q2', 75000), ('Q3', 60000), ('Q1', 50000)]
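One detail worth knowing is that reverse=True keeps the sort stable: items with equal keys stay in their original relative order. Reversing an ascending result afterwards does not. The sketch below uses hypothetical sales figures with deliberate ties to show the difference.

```python
# Hypothetical data with tied values
sales = [('Q1', 50000), ('Q2', 75000), ('Q3', 50000), ('Q4', 75000)]

# reverse=True keeps the original order among equal keys (Q2 before Q4, Q1 before Q3).
with_reverse_flag = sorted(sales, key=lambda x: x[1], reverse=True)
print(with_reverse_flag)
# Output: [('Q2', 75000), ('Q4', 75000), ('Q1', 50000), ('Q3', 50000)]

# Reversing an ascending sort also flips the order among equal keys.
reversed_afterwards = list(reversed(sorted(sales, key=lambda x: x[1])))
print(reversed_afterwards)
# Output: [('Q4', 75000), ('Q2', 75000), ('Q3', 50000), ('Q1', 50000)]
```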
Multi-Level Sorting
Sort by multiple criteria by returning a tuple from the key function:
# Sort by department, then by salary (descending)
employees = [
    ('Alice', 'Engineering', 95000),
    ('Bob', 'Sales', 70000),
    ('Charlie', 'Engineering', 85000),
    ('Diana', 'Sales', 75000),
    ('Eve', 'Engineering', 90000)
]
sorted_employees = sorted(employees, key=lambda x: (x[1], -x[2]))
print(sorted_employees)
# Output:
# [('Alice', 'Engineering', 95000),
# ('Eve', 'Engineering', 90000),
# ('Charlie', 'Engineering', 85000),
# ('Diana', 'Sales', 75000),
# ('Bob', 'Sales', 70000)]
Note the negative sign before x[2] to sort salary in descending order while keeping department in ascending order.
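Negation only works for numeric fields. Strings cannot be negated, so a string field that needs descending order requires a separate sort pass (see Stable Sorting for Complex Scenarios). When the descending field is numeric, the same pattern applies to any dataset: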
students = [
    ('Alice', 'Physics', 85),
    ('Bob', 'Math', 92),
    ('Charlie', 'Physics', 90),
    ('Diana', 'Math', 88)
]
# Sort by subject (ascending), then grade (descending)
sorted_students = sorted(students, key=lambda x: (x[1], -x[2]))
print(sorted_students)
# Output:
# [('Bob', 'Math', 92),
# ('Diana', 'Math', 88),
# ('Charlie', 'Physics', 90),
# ('Alice', 'Physics', 85)]
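When the field that must sort in descending order is a string, one option is to sort in two passes: first by the less significant key, then by the more significant one with reverse=True. This is a sketch of that approach, reusing the student data above; the intermediate variable names are illustrative.

```python
students = [
    ('Alice', 'Physics', 85),
    ('Bob', 'Math', 92),
    ('Charlie', 'Physics', 90),
    ('Diana', 'Math', 88),
]

# Pass 1: secondary key (grade, ascending).
by_grade = sorted(students, key=lambda x: x[2])

# Pass 2: primary key (subject, descending). Because the sort is stable,
# the grade order from pass 1 is preserved within each subject.
by_subject_desc = sorted(by_grade, key=lambda x: x[1], reverse=True)
print(by_subject_desc)
# Output:
# [('Alice', 'Physics', 85),
#  ('Charlie', 'Physics', 90),
#  ('Diana', 'Math', 88),
#  ('Bob', 'Math', 92)]
```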
Sorting with Custom Comparison Logic
For complex sorting requirements, implement custom logic in the key function:
# Sort by string length, then alphabetically
words = [('apple', 5), ('pie', 3), ('banana', 6), ('cat', 3), ('dog', 3)]
sorted_words = sorted(words, key=lambda x: (len(x[0]), x[0]))
print(sorted_words)
# Output: [('cat', 3), ('dog', 3), ('pie', 3), ('apple', 5), ('banana', 6)]
# Sort by absolute value of second element
coordinates = [('A', -5), ('B', 3), ('C', -8), ('D', 2)]
sorted_coords = sorted(coordinates, key=lambda x: abs(x[1]))
print(sorted_coords)
# Output: [('D', 2), ('B', 3), ('A', -5), ('C', -8)]
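When the ordering rule is easier to express as a pairwise comparison than as a key, functools.cmp_to_key() can wrap a comparison function so that sorted() accepts it. The sketch below sorts by subject descending and name ascending in a single pass; the compare function and the sample records are hypothetical. A key function is usually faster, so treat this as a fallback.

```python
from functools import cmp_to_key

# Hypothetical records: (name, subject)
records = [('Diana', 'Math'), ('Charlie', 'Physics'), ('Bob', 'Math'), ('Alice', 'Physics')]

def compare(a, b):
    """Return a negative, zero, or positive number, like an old-style cmp function."""
    # Subject in descending order: the larger subject string comes first.
    if a[1] != b[1]:
        return -1 if a[1] > b[1] else 1
    # Name in ascending order within the same subject.
    return (a[0] > b[0]) - (a[0] < b[0])

sorted_records = sorted(records, key=cmp_to_key(compare))
print(sorted_records)
# Output: [('Alice', 'Physics'), ('Charlie', 'Physics'), ('Bob', 'Math'), ('Diana', 'Math')]
```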
Stable Sorting for Complex Scenarios
Python’s sort is stable, meaning elements with equal keys maintain their original relative order. Exploit this for multi-level sorting:
# Sort by priority, then by timestamp
tasks = [
    ('Task A', 'high', '2024-01-15'),
    ('Task B', 'low', '2024-01-10'),
    ('Task C', 'high', '2024-01-12'),
    ('Task D', 'medium', '2024-01-14'),
    ('Task E', 'high', '2024-01-11')
]
# Sort by timestamp first
tasks.sort(key=lambda x: x[2])
# Then by priority (stable sort preserves timestamp order within priorities)
priority_order = {'high': 0, 'medium': 1, 'low': 2}
tasks.sort(key=lambda x: priority_order[x[1]])
print(tasks)
# Output:
# [('Task E', 'high', '2024-01-11'),
# ('Task C', 'high', '2024-01-12'),
# ('Task A', 'high', '2024-01-15'),
# ('Task D', 'medium', '2024-01-14'),
# ('Task B', 'low', '2024-01-10')]
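The chained-sort pattern can be wrapped in a small helper that applies the passes from least to most significant. The multisort function below is a hypothetical helper written for illustration, not a standard library function, and the sample records are assumed.

```python
from operator import itemgetter

def multisort(items, specs):
    """Return a sorted copy of items.

    specs is a list of (index, reverse) pairs, most significant key first.
    """
    result = list(items)
    # Apply the least significant key first so that later passes take precedence.
    for index, reverse in reversed(specs):
        result.sort(key=itemgetter(index), reverse=reverse)
    return result

# Hypothetical records: (name, department, salary)
records = [
    ('Alice', 'Engineering', 95000),
    ('Bob', 'Sales', 70000),
    ('Charlie', 'Engineering', 85000),
    ('Diana', 'Sales', 75000),
]

# Department descending, then name ascending.
result = multisort(records, [(1, True), (0, False)])
print(result)
# Output:
# [('Bob', 'Sales', 70000),
#  ('Diana', 'Sales', 75000),
#  ('Alice', 'Engineering', 95000),
#  ('Charlie', 'Engineering', 85000)]
```

Each pass costs a full sort, so for many keys on large lists a single composite key is usually cheaper when every descending field is numeric.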
Sorting Named Tuples
Named tuples provide cleaner syntax:
from collections import namedtuple
Product = namedtuple('Product', ['name', 'price', 'stock'])
inventory = [
    Product('Laptop', 1200, 15),
    Product('Mouse', 25, 100),
    Product('Keyboard', 80, 45)
]
# Sort by price
sorted_inventory = sorted(inventory, key=lambda p: p.price)
# Or using attrgetter with the attribute name
from operator import attrgetter
sorted_inventory = sorted(inventory, key=attrgetter('price'))
print(sorted_inventory)
# Output:
# [Product(name='Mouse', price=25, stock=100),
# Product(name='Keyboard', price=80, stock=45),
# Product(name='Laptop', price=1200, stock=15)]
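Like itemgetter(), attrgetter() accepts several names and then returns a tuple, which gives a multi-field key for named tuples. A brief sketch, with one hypothetical extra product added to create a price tie:

```python
from collections import namedtuple
from operator import attrgetter

Product = namedtuple('Product', ['name', 'price', 'stock'])
inventory = [
    Product('Laptop', 1200, 15),
    Product('Mouse', 25, 100),
    Product('Keyboard', 80, 45),
    Product('Webcam', 80, 30),  # hypothetical row that ties with Keyboard on price
]

# attrgetter('price', 'stock') yields (p.price, p.stock) for each product,
# so ties on price are broken by stock.
by_price_then_stock = sorted(inventory, key=attrgetter('price', 'stock'))
print([p.name for p in by_price_then_stock])
# Output: ['Mouse', 'Webcam', 'Keyboard', 'Laptop']
```

Named tuples are still tuples, so sorted(inventory) without a key falls back to the default lexicographic order, starting with the name field.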
Performance Considerations
For large datasets, measure the candidate key functions before choosing one:
import timeit
from operator import itemgetter
data = [(i, i*2) for i in range(10000)]
# Lambda function
lambda_time = timeit.timeit(
    lambda: sorted(data, key=lambda x: x[1]),
    number=1000
)
# itemgetter
itemgetter_time = timeit.timeit(
    lambda: sorted(data, key=itemgetter(1)),
    number=1000
)
print(f"Lambda: {lambda_time:.4f}s")
print(f"itemgetter: {itemgetter_time:.4f}s")
# itemgetter is often faster; the exact margin depends on the Python version and the data
Use itemgetter() for simple index-based sorting and lambda functions when you need custom logic or transformations.
Sorting with None Values
Handle None values explicitly to avoid comparison errors:
data = [('A', 5), ('B', None), ('C', 3), ('D', None), ('E', 8)]
# Place None values at the end
sorted_data = sorted(data, key=lambda x: (x[1] is None, x[1] if x[1] is not None else 0))
print(sorted_data)
# Output: [('C', 3), ('A', 5), ('E', 8), ('B', None), ('D', None)]
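# Or use infinity as the placeholder for None
# (avoid `x[1] or float('inf')`, which would also replace a legitimate 0)
sorted_data = sorted(data, key=lambda x: (x[1] is None, x[1] if x[1] is not None else float('inf')))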
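The None-handling key can be factored into a reusable helper. The none_last function below is a hypothetical name used for illustration; the sample data includes a 0 to show that falsy values are kept in their proper place.

```python
data = [('A', 5), ('B', None), ('C', 0), ('D', None), ('E', 8)]

def none_last(index):
    """Build a key function that sorts by item[index], placing None values last."""
    def key(item):
        value = item[index]
        # The flag sorts real values (False) before None (True). The 0 is a
        # placeholder that is only ever compared against other placeholders.
        return (value is None, 0 if value is None else value)
    return key

none_at_end = sorted(data, key=none_last(1))
print(none_at_end)
# Output: [('C', 0), ('A', 5), ('E', 8), ('B', None), ('D', None)]

# Without the helper, comparing None with an int raises TypeError.
try:
    sorted(data, key=lambda x: x[1])
except TypeError as exc:
    print("TypeError:", exc)
```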
These techniques cover the most common tuple sorting requirements in production Python applications. Choose the approach that balances readability with performance for your specific use case.