Python - sorted() Function with Custom Key
Key Insights
- The `key` parameter transforms each element before comparison, enabling custom sort logic without modifying your original data
- Use `operator.itemgetter()` and `operator.attrgetter()` instead of lambdas for cleaner code and better performance when sorting by indices or attributes
- Handle edge cases like `None` values explicitly in your key function to avoid `TypeError` exceptions during sorting
Introduction to sorted() and the key Parameter
Python’s sorted() function returns a new sorted list from any iterable. While basic sorting works fine for simple lists, real-world data rarely cooperates. You’ll need to sort users by registration date, products by price, or log entries by severity. The key parameter makes this possible.
The key parameter accepts a function that transforms each element before comparison. Python applies this function to every item, then sorts based on the transformed values—but returns the original items in the new order.
# Basic sorting - alphabetical order
words = ['banana', 'Apple', 'cherry', 'date']
print(sorted(words))
# ['Apple', 'banana', 'cherry', 'date']
# With key - sort by string length
print(sorted(words, key=len))
# ['date', 'Apple', 'banana', 'cherry']
# The original items are returned, not the lengths
The key function receives one argument (the current element) and returns a value Python can compare. This separation between “what to compare” and “what to return” gives you precise control over sorting behavior.
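Any callable that takes one argument can serve as a key, not only a built-in like len. A minimal sketch, assuming a hypothetical helper named count_vowels that is not part of the original example:

```python
def count_vowels(word):
    """Hypothetical key function: number of vowels (a, e, i, o, u) in a word."""
    return sum(1 for ch in word.lower() if ch in "aeiou")

words = ['banana', 'Apple', 'cherry', 'date']

# sorted() compares the vowel counts but returns the original words
by_vowels = sorted(words, key=count_vowels)
print(by_vowels)
# ['cherry', 'Apple', 'date', 'banana']
```

'Apple' and 'date' both contain two vowels, so they keep their input order relative to each other.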
Using Lambda Functions as Keys
Lambda expressions provide inline, anonymous functions perfect for simple transformations. They’re the most common way to define custom sort keys.
# Sort numbers by absolute value
numbers = [-5, 2, -1, 8, -3]
print(sorted(numbers, key=lambda x: abs(x)))
# [-1, 2, -3, -5, 8]
# Sort strings by their last character
names = ['alice', 'bob', 'charlie', 'david']
print(sorted(names, key=lambda x: x[-1]))
# ['bob', 'david', 'alice', 'charlie']
Tuples deserve special attention. When you sort a list of tuples, Python compares them element by element. But often you want to sort by a specific position:
# List of (name, score) tuples
students = [
('Alice', 85),
('Bob', 92),
('Charlie', 78),
('Diana', 92)
]
# Sort by score (second element)
by_score = sorted(students, key=lambda x: x[1])
print(by_score)
# [('Charlie', 78), ('Alice', 85), ('Bob', 92), ('Diana', 92)]
# Sort by score descending
by_score_desc = sorted(students, key=lambda x: x[1], reverse=True)
print(by_score_desc)
# [('Bob', 92), ('Diana', 92), ('Alice', 85), ('Charlie', 78)]
Sorting Complex Objects
Dictionaries and custom objects require key functions that extract the relevant field or attribute.
# Sorting a list of dictionaries
users = [
{'name': 'Alice', 'age': 30, 'city': 'NYC'},
{'name': 'Bob', 'age': 25, 'city': 'LA'},
{'name': 'Charlie', 'age': 35, 'city': 'Chicago'},
]
# Sort by age
by_age = sorted(users, key=lambda u: u['age'])
print([u['name'] for u in by_age])
# ['Bob', 'Alice', 'Charlie']
# Sort by name
by_name = sorted(users, key=lambda u: u['name'])
print([u['name'] for u in by_name])
# ['Alice', 'Bob', 'Charlie']
For custom classes, access attributes directly:
class Employee:
    def __init__(self, name, salary, department):
        self.name = name
        self.salary = salary
        self.department = department

    def __repr__(self):
        return f"Employee({self.name}, ${self.salary})"
employees = [
Employee('Alice', 75000, 'Engineering'),
Employee('Bob', 65000, 'Marketing'),
Employee('Charlie', 85000, 'Engineering'),
]
# Sort by salary
by_salary = sorted(employees, key=lambda e: e.salary)
print(by_salary)
# [Employee(Bob, $65000), Employee(Alice, $75000), Employee(Charlie, $85000)]
# Sort by name
by_name = sorted(employees, key=lambda e: e.name)
print(by_name)
# [Employee(Alice, $75000), Employee(Bob, $65000), Employee(Charlie, $85000)]
Using operator Module Functions
The operator module provides itemgetter, attrgetter, and methodcaller as alternatives to lambdas. They’re implemented in C, which usually makes them somewhat faster, and they read more clearly for common patterns.
from operator import itemgetter, attrgetter
# itemgetter for sequences and dictionaries
students = [('Alice', 85), ('Bob', 92), ('Charlie', 78)]
by_score = sorted(students, key=itemgetter(1))
print(by_score)
# [('Charlie', 78), ('Alice', 85), ('Bob', 92)]
# Works with dictionaries too
users = [
{'name': 'Alice', 'age': 30},
{'name': 'Bob', 'age': 25},
]
by_age = sorted(users, key=itemgetter('age'))
print(by_age)
# [{'name': 'Bob', 'age': 25}, {'name': 'Alice', 'age': 30}]
For objects with attributes, use attrgetter:
from operator import attrgetter
# Using the Employee class from before
employees = [
Employee('Alice', 75000, 'Engineering'),
Employee('Bob', 65000, 'Marketing'),
Employee('Charlie', 85000, 'Engineering'),
]
by_salary = sorted(employees, key=attrgetter('salary'))
print(by_salary)
# [Employee(Bob, $65000), Employee(Alice, $75000), Employee(Charlie, $85000)]
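methodcaller, the third helper in the operator module, builds a key that calls a named method on each element. A minimal sketch, reusing the word list from the introduction as illustrative data:

```python
from operator import methodcaller

words = ['banana', 'Apple', 'cherry', 'date']

# methodcaller('lower') behaves like lambda s: s.lower()
case_insensitive = sorted(words, key=methodcaller('lower'))
print(case_insensitive)
# ['Apple', 'banana', 'cherry', 'date']

# Extra arguments are forwarded to the method: this key is s.count('a')
by_a_count = sorted(words, key=methodcaller('count', 'a'))
print(by_a_count)
# ['Apple', 'cherry', 'date', 'banana']
```

Note that 'Apple' counts zero lowercase 'a' characters, so it ties with 'cherry' and keeps its earlier position.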
The performance difference matters when sorting large collections:
import timeit
from operator import itemgetter
data = [(i, i * 2) for i in range(10000)]
# Lambda approach
lambda_time = timeit.timeit(
lambda: sorted(data, key=lambda x: x[1]),
number=1000
)
# itemgetter approach
itemgetter_time = timeit.timeit(
lambda: sorted(data, key=itemgetter(1)),
number=1000
)
print(f"Lambda: {lambda_time:.3f}s")
print(f"itemgetter: {itemgetter_time:.3f}s")
# itemgetter is typically somewhat faster (often around 10-20%);
# the exact margin depends on your Python version and hardware
Multi-Level Sorting
Real applications often require sorting by multiple criteria. Python’s sort is stable, meaning equal elements maintain their relative order. You can exploit this with multiple passes, but returning a tuple from your key function is cleaner:
from operator import itemgetter
employees = [
{'name': 'Alice', 'dept': 'Engineering', 'salary': 75000},
{'name': 'Bob', 'dept': 'Marketing', 'salary': 65000},
{'name': 'Charlie', 'dept': 'Engineering', 'salary': 85000},
{'name': 'Diana', 'dept': 'Marketing', 'salary': 70000},
{'name': 'Eve', 'dept': 'Engineering', 'salary': 75000},
]
# Sort by department, then by salary within each department
sorted_emps = sorted(employees, key=lambda e: (e['dept'], e['salary']))
for emp in sorted_emps:
    print(f"{emp['dept']:12} {emp['name']:10} ${emp['salary']}")
# Engineering  Alice      $75000
# Engineering  Eve        $75000
# Engineering  Charlie    $85000
# Marketing    Bob        $65000
# Marketing    Diana      $70000
For mixed ascending/descending sorts, negate numeric values or use multiple passes:
# Sort by department ascending, salary descending
sorted_emps = sorted(employees, key=lambda e: (e['dept'], -e['salary']))
for emp in sorted_emps:
    print(f"{emp['dept']:12} {emp['name']:10} ${emp['salary']}")
# Engineering  Charlie    $85000
# Engineering  Alice      $75000
# Engineering  Eve        $75000
# Marketing    Diana      $70000
# Marketing    Bob        $65000
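Negation only works for numeric values. For a descending sort on strings, the multiple-pass approach relies on stability: sort by the secondary criterion first, then by the primary one. A minimal sketch, where sorting names in descending order is an illustrative choice rather than something from the example above:

```python
from operator import itemgetter

employees = [
    {'name': 'Alice', 'dept': 'Engineering', 'salary': 75000},
    {'name': 'Bob', 'dept': 'Marketing', 'salary': 65000},
    {'name': 'Charlie', 'dept': 'Engineering', 'salary': 85000},
    {'name': 'Diana', 'dept': 'Marketing', 'salary': 70000},
    {'name': 'Eve', 'dept': 'Engineering', 'salary': 75000},
]

# Pass 1: secondary criterion (name, descending)
by_name_desc = sorted(employees, key=itemgetter('name'), reverse=True)

# Pass 2: primary criterion (dept, ascending).
# Stability preserves the pass-1 order inside each department.
result = sorted(by_name_desc, key=itemgetter('dept'))
print([e['name'] for e in result])
# ['Eve', 'Charlie', 'Alice', 'Diana', 'Bob']
```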
With itemgetter, you can select multiple keys at once:
# Sort by department, then by name (both ascending)
sorted_emps = sorted(employees, key=itemgetter('dept', 'name'))
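This works because itemgetter called with several keys returns a tuple of the looked-up values, which is the same shape as the tuple-returning lambda. A short sketch using the same employee records:

```python
from operator import itemgetter

employees = [
    {'name': 'Alice', 'dept': 'Engineering', 'salary': 75000},
    {'name': 'Bob', 'dept': 'Marketing', 'salary': 65000},
    {'name': 'Charlie', 'dept': 'Engineering', 'salary': 85000},
    {'name': 'Diana', 'dept': 'Marketing', 'salary': 70000},
    {'name': 'Eve', 'dept': 'Engineering', 'salary': 75000},
]

get_dept_name = itemgetter('dept', 'name')

# The key for one record is a tuple of both fields
print(get_dept_name(employees[0]))
# ('Engineering', 'Alice')

sorted_emps = sorted(employees, key=get_dept_name)
print([e['name'] for e in sorted_emps])
# ['Alice', 'Charlie', 'Eve', 'Bob', 'Diana']
```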
Case-Insensitive and Locale-Aware Sorting
String sorting is case-sensitive by default, with uppercase letters sorting before lowercase. Use str.lower or str.casefold for case-insensitive sorting:
names = ['alice', 'Bob', 'CHARLIE', 'Diana']
# Default: uppercase first
print(sorted(names))
# ['Bob', 'CHARLIE', 'Diana', 'alice']
# Case-insensitive with str.lower
print(sorted(names, key=str.lower))
# ['alice', 'Bob', 'CHARLIE', 'Diana']
# str.casefold handles more Unicode edge cases
print(sorted(names, key=str.casefold))
# ['alice', 'Bob', 'CHARLIE', 'Diana']
Prefer str.casefold() over str.lower() when handling international text. It applies more aggressive folding, for example mapping the German ß to "ss", so strings that differ only in case compare as equal.
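The difference only shows up on text that str.lower leaves unchanged. A small sketch with illustrative German words (the word list is an assumption, not from the example above):

```python
words = ['Straße', 'Strasse', 'Strand']

# str.lower keeps 'ß', which sorts after every ASCII letter
by_lower = sorted(words, key=str.lower)
print(by_lower)
# ['Strand', 'Strasse', 'Straße']

# str.casefold maps 'ß' to 'ss', so 'Straße' and 'Strasse' tie
# and keep their input order
by_casefold = sorted(words, key=str.casefold)
print(by_casefold)
# ['Strand', 'Straße', 'Strasse']
```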
For locale-aware sorting (respecting language-specific ordering rules), use the locale module:
import locale
# Set locale (system-dependent; raises locale.Error if this locale is not installed)
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
words = ['café', 'apple', 'Banana']
sorted_words = sorted(words, key=locale.strxfrm)
Common Pitfalls and Best Practices
Handling None Values
In Python 3, None doesn’t support ordering comparisons such as < against numbers or strings, so sorting a list that contains it raises TypeError. Handle it explicitly:
data = [3, 1, None, 4, None, 2]
# This raises TypeError
# sorted(data)
# Solution 1: Put None values at the end
sorted_data = sorted(data, key=lambda x: (x is None, x))
print(sorted_data)
# [1, 2, 3, 4, None, None]
# Solution 2: Put None values at the beginning
sorted_data = sorted(data, key=lambda x: (x is not None, x))
print(sorted_data)
# [None, None, 1, 2, 3, 4]
# Solution 3: Replace None with a default
sorted_data = sorted(data, key=lambda x: x if x is not None else float('inf'))
print(sorted_data)
# [1, 2, 3, 4, None, None]
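The same tuple trick carries over to records with missing fields. A minimal sketch, assuming hypothetical user records and a hypothetical helper named age_or_last:

```python
users = [
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob', 'age': None},
    {'name': 'Charlie', 'age': 25},
    {'name': 'Diana'},  # 'age' key missing entirely
]

def age_or_last(user):
    """Hypothetical key: known ages first (ascending), unknown ages last."""
    age = user.get('age')
    # The boolean decides the group; the placeholder 0 keeps the
    # second item comparable when two unknown ages are compared
    return (age is None, age if age is not None else 0)

ordered = sorted(users, key=age_or_last)
print([u['name'] for u in ordered])
# ['Charlie', 'Alice', 'Bob', 'Diana']
```

Using dict.get treats a missing key and an explicit None the same way, which may or may not match what your data means.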
Expensive Key Functions
Python calls the key function once per element, not once per comparison, and reuses the computed keys for the rest of the sort. This is the classic decorate-sort-undecorate pattern, which sorted() performs internally, so even an expensive key function runs only once per item:
# If your key function is expensive, sorted() handles it efficiently
# Each element is transformed exactly once
import re
def extract_number(s):
    """Expensive operation: regex parsing"""
    match = re.search(r'\d+', s)
    return int(match.group()) if match else 0
files = ['file10.txt', 'file2.txt', 'file1.txt']
sorted_files = sorted(files, key=extract_number)
print(sorted_files)
# ['file1.txt', 'file2.txt', 'file10.txt']
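One way to check the once-per-element behavior is to count the calls. The sketch below adds a counter to the same key function purely for illustration; the longer file list is an assumption:

```python
import re

calls = 0

def extract_number(s):
    """Same key as above, with a call counter added for illustration."""
    global calls
    calls += 1
    match = re.search(r'\d+', s)
    return int(match.group()) if match else 0

files = ['file10.txt', 'file2.txt', 'file1.txt', 'file33.txt', 'file4.txt']
sorted_files = sorted(files, key=extract_number)

print(sorted_files)
# ['file1.txt', 'file2.txt', 'file4.txt', 'file10.txt', 'file33.txt']
print(calls)
# 5 (one call per element, regardless of how many comparisons the sort made)
```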
When Not to Use key
If your ordering is easier to express as a pairwise comparison than as a per-element transformation, use functools.cmp_to_key to convert an old-style comparison function (one that returns a negative number, zero, or a positive number) into a key:
from functools import cmp_to_key
def compare_versions(a, b):
    """Compare version strings like '1.2.3'"""
    a_parts = [int(x) for x in a.split('.')]
    b_parts = [int(x) for x in b.split('.')]
    for a_val, b_val in zip(a_parts, b_parts):
        if a_val < b_val:
            return -1
        if a_val > b_val:
            return 1
    return len(a_parts) - len(b_parts)
versions = ['1.2.3', '1.10.1', '1.2.10', '2.0.0']
sorted_versions = sorted(versions, key=cmp_to_key(compare_versions))
print(sorted_versions)
# ['1.2.3', '1.2.10', '1.10.1', '2.0.0']
Prefer key functions when possible—they’re more efficient than comparison functions and fit Python 3’s design philosophy.
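As an illustration of that preference, the version-string ordering can also be written as a plain key. This is a sketch that assumes versions are purely numeric and dot-separated; version_key is a hypothetical helper name:

```python
def version_key(version):
    """Hypothetical key: '1.10.1' -> (1, 10, 1)."""
    return tuple(int(part) for part in version.split('.'))

versions = ['1.2.3', '1.10.1', '1.2.10', '2.0.0']
sorted_versions = sorted(versions, key=version_key)
print(sorted_versions)
# ['1.2.3', '1.2.10', '1.10.1', '2.0.0']
```

Tuples compare element by element and a shorter tuple sorts before a longer one with the same prefix, so this matches the comparison function's ordering for such inputs.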
The key parameter transforms sorted() from a basic utility into a flexible tool for any ordering requirement. Master these patterns, and you’ll handle complex sorting tasks with minimal code.