Python - sorted() Function with Custom Key

Key Insights

  • The key parameter transforms each element before comparison, enabling custom sort logic without modifying your original data
  • Use operator.itemgetter() and operator.attrgetter() instead of lambdas for cleaner code and better performance when sorting by indices or attributes
  • Handle edge cases like None values explicitly in your key function to avoid TypeError exceptions during sorting

Introduction to sorted() and the key Parameter

Python’s sorted() function returns a new sorted list from any iterable. While basic sorting works fine for simple lists, real-world data rarely cooperates. You’ll need to sort users by registration date, products by price, or log entries by severity. The key parameter makes this possible.

The key parameter accepts a function that transforms each element before comparison. Python applies this function to every item, then sorts based on the transformed values—but returns the original items in the new order.

# Basic sorting - alphabetical order
words = ['banana', 'Apple', 'cherry', 'date']
print(sorted(words))
# ['Apple', 'banana', 'cherry', 'date']

# With key - sort by string length
print(sorted(words, key=len))
# ['date', 'Apple', 'banana', 'cherry']

# The original items are returned, not the lengths

The key function receives one argument (the current element) and returns a value Python can compare. This separation between “what to compare” and “what to return” gives you precise control over sorting behavior.
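To make the split between "what to compare" and "what to return" concrete, the short sketch below lists the value each element is compared by, then confirms that sorted() returns a new list and leaves the input alone. The variable names are illustrative only.

```python
words = ('banana', 'Apple', 'cherry', 'date')  # any iterable works; here, a tuple

# The value each element will be compared by when key=len
compared_by = [(word, len(word)) for word in words]
print(compared_by)
# [('banana', 6), ('Apple', 5), ('cherry', 6), ('date', 4)]

result = sorted(words, key=len)
print(result)        # ['date', 'Apple', 'banana', 'cherry']
print(type(result))  # <class 'list'>
print(words)         # ('banana', 'Apple', 'cherry', 'date'), unchanged
```

Note that 'banana' and 'cherry' share the key 6 and keep their original relative order.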

Using Lambda Functions as Keys

Lambda expressions provide inline, anonymous functions perfect for simple transformations. They’re the most common way to define custom sort keys.

# Sort numbers by absolute value
numbers = [-5, 2, -1, 8, -3]
print(sorted(numbers, key=lambda x: abs(x)))  # key=abs is equivalent and shorter
# [-1, 2, -3, -5, 8]

# Sort strings by their last character
names = ['alice', 'bob', 'charlie', 'david']
print(sorted(names, key=lambda x: x[-1]))
# ['bob', 'david', 'alice', 'charlie']

Tuples deserve special attention. When you sort a list of tuples, Python compares them element by element. But often you want to sort by a specific position:

# List of (name, score) tuples
students = [
    ('Alice', 85),
    ('Bob', 92),
    ('Charlie', 78),
    ('Diana', 92)
]

# Sort by score (second element)
by_score = sorted(students, key=lambda x: x[1])
print(by_score)
# [('Charlie', 78), ('Alice', 85), ('Bob', 92), ('Diana', 92)]

# Sort by score descending
by_score_desc = sorted(students, key=lambda x: x[1], reverse=True)
print(by_score_desc)
# [('Bob', 92), ('Diana', 92), ('Alice', 85), ('Charlie', 78)]
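For contrast, here is a small sketch of the default element-by-element comparison when no key is given: tuples are ordered by their first element, and the second element only breaks ties. The sample data is made up for illustration.

```python
pairs = [('Bob', 92), ('Alice', 85), ('Bob', 78), ('Alice', 90)]

# No key: compare first elements, then second elements on ties
default_order = sorted(pairs)
print(default_order)
# [('Alice', 85), ('Alice', 90), ('Bob', 78), ('Bob', 92)]
```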

Sorting Complex Objects

Dictionaries and custom objects require key functions that extract the relevant field or attribute.

# Sorting a list of dictionaries
users = [
    {'name': 'Alice', 'age': 30, 'city': 'NYC'},
    {'name': 'Bob', 'age': 25, 'city': 'LA'},
    {'name': 'Charlie', 'age': 35, 'city': 'Chicago'},
]

# Sort by age
by_age = sorted(users, key=lambda u: u['age'])
print([u['name'] for u in by_age])
# ['Bob', 'Alice', 'Charlie']

# Sort by name
by_name = sorted(users, key=lambda u: u['name'])
print([u['name'] for u in by_name])
# ['Alice', 'Bob', 'Charlie']

For custom classes, access attributes directly:

class Employee:
    def __init__(self, name, salary, department):
        self.name = name
        self.salary = salary
        self.department = department
    
    def __repr__(self):
        return f"Employee({self.name}, ${self.salary})"

employees = [
    Employee('Alice', 75000, 'Engineering'),
    Employee('Bob', 65000, 'Marketing'),
    Employee('Charlie', 85000, 'Engineering'),
]

# Sort by salary
by_salary = sorted(employees, key=lambda e: e.salary)
print(by_salary)
# [Employee(Bob, $65000), Employee(Alice, $75000), Employee(Charlie, $85000)]

# Sort by name
by_name = sorted(employees, key=lambda e: e.name)
print(by_name)
# [Employee(Alice, $75000), Employee(Bob, $65000), Employee(Charlie, $85000)]

Using operator Module Functions

The operator module provides itemgetter, attrgetter, and methodcaller as alternatives to lambdas. In CPython they are implemented in C, which usually makes them somewhat faster than an equivalent lambda, and they read more clearly for common patterns.

from operator import itemgetter, attrgetter

# itemgetter for sequences and dictionaries
students = [('Alice', 85), ('Bob', 92), ('Charlie', 78)]
by_score = sorted(students, key=itemgetter(1))
print(by_score)
# [('Charlie', 78), ('Alice', 85), ('Bob', 92)]

# Works with dictionaries too
users = [
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob', 'age': 25},
]
by_age = sorted(users, key=itemgetter('age'))
print(by_age)
# [{'name': 'Bob', 'age': 25}, {'name': 'Alice', 'age': 30}]

For objects with attributes, use attrgetter:

from operator import attrgetter

# Using the Employee class from before
employees = [
    Employee('Alice', 75000, 'Engineering'),
    Employee('Bob', 65000, 'Marketing'),
    Employee('Charlie', 85000, 'Engineering'),
]

by_salary = sorted(employees, key=attrgetter('salary'))
print(by_salary)
# [Employee(Bob, $65000), Employee(Alice, $75000), Employee(Charlie, $85000)]
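methodcaller is the third helper in the operator module: it builds a callable that invokes a named method on each element, optionally with arguments. A minimal sketch, using made-up sample words, that sorts strings by how many times they contain the letter 'a':

```python
from operator import methodcaller

words = ['banana', 'apple', 'cherry', 'avocado']

# methodcaller('count', 'a') behaves like: lambda s: s.count('a')
by_a_count = sorted(words, key=methodcaller('count', 'a'))
print(by_a_count)
# ['cherry', 'apple', 'avocado', 'banana']
```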

The performance difference matters when sorting large collections:

import timeit
from operator import itemgetter

data = [(i, i * 2) for i in range(10000)]

# Lambda approach
lambda_time = timeit.timeit(
    lambda: sorted(data, key=lambda x: x[1]),
    number=1000
)

# itemgetter approach
itemgetter_time = timeit.timeit(
    lambda: sorted(data, key=itemgetter(1)),
    number=1000
)

print(f"Lambda: {lambda_time:.3f}s")
print(f"itemgetter: {itemgetter_time:.3f}s")
# itemgetter is often around 10-20% faster, but the margin varies with
# Python version and hardware, so measure on your own workload

Multi-Level Sorting

Real applications often require sorting by multiple criteria. Python’s sort is stable, meaning equal elements maintain their relative order. You can exploit this with multiple passes, but returning a tuple from your key function is cleaner:

from operator import itemgetter

employees = [
    {'name': 'Alice', 'dept': 'Engineering', 'salary': 75000},
    {'name': 'Bob', 'dept': 'Marketing', 'salary': 65000},
    {'name': 'Charlie', 'dept': 'Engineering', 'salary': 85000},
    {'name': 'Diana', 'dept': 'Marketing', 'salary': 70000},
    {'name': 'Eve', 'dept': 'Engineering', 'salary': 75000},
]

# Sort by department, then by salary within each department
sorted_emps = sorted(employees, key=lambda e: (e['dept'], e['salary']))
for emp in sorted_emps:
    print(f"{emp['dept']:12} {emp['name']:10} ${emp['salary']}")

# Engineering  Alice      $75000
# Engineering  Eve        $75000
# Engineering  Charlie    $85000
# Marketing    Bob        $65000
# Marketing    Diana      $70000

For mixed ascending/descending sorts, negate numeric values or use multiple passes:

# Sort by department ascending, salary descending
sorted_emps = sorted(employees, key=lambda e: (e['dept'], -e['salary']))
for emp in sorted_emps:
    print(f"{emp['dept']:12} {emp['name']:10} ${emp['salary']}")

# Engineering  Charlie    $85000
# Engineering  Alice      $75000
# Engineering  Eve        $75000
# Marketing    Diana      $70000
# Marketing    Bob        $65000
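Negation only works for numeric fields. When the descending field is a string, the multiple-pass approach can be sketched as follows: sort by the secondary key first, then by the primary key, and let stability preserve the first ordering within each group. The staff list here is illustrative data, not the employees list above.

```python
from operator import itemgetter

staff = [
    {'name': 'Alice', 'dept': 'Engineering'},
    {'name': 'Bob', 'dept': 'Marketing'},
    {'name': 'Charlie', 'dept': 'Engineering'},
    {'name': 'Diana', 'dept': 'Marketing'},
    {'name': 'Eve', 'dept': 'Engineering'},
]

# Pass 1: secondary key first (name, descending)
pass_one = sorted(staff, key=itemgetter('name'), reverse=True)

# Pass 2: primary key last (department, ascending). Because the sort is
# stable, the name order from pass 1 survives within each department.
result = sorted(pass_one, key=itemgetter('dept'))
print([person['name'] for person in result])
# ['Eve', 'Charlie', 'Alice', 'Diana', 'Bob']
```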

With itemgetter, you can select multiple keys at once:

# Sort by department, then by name (both ascending)
sorted_emps = sorted(employees, key=itemgetter('dept', 'name'))
print([emp['name'] for emp in sorted_emps])
# ['Alice', 'Charlie', 'Eve', 'Bob', 'Diana']

Case-Insensitive and Locale-Aware Sorting

String sorting is case-sensitive by default, with uppercase letters sorting before lowercase. Use str.lower or str.casefold for case-insensitive sorting:

names = ['alice', 'Bob', 'CHARLIE', 'Diana']

# Default: uppercase first
print(sorted(names))
# ['Bob', 'CHARLIE', 'Diana', 'alice']

# Case-insensitive with str.lower
print(sorted(names, key=str.lower))
# ['alice', 'Bob', 'CHARLIE', 'Diana']

# str.casefold handles more Unicode edge cases
print(sorted(names, key=str.casefold))
# ['alice', 'Bob', 'CHARLIE', 'Diana']

Use str.casefold() over str.lower() when handling international text. It handles cases like the German ß character correctly.
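A small sketch of where the two differ, using made-up German sample words: str.lower leaves ß unchanged, while str.casefold maps it to "ss", so the two keys can produce different orders.

```python
words = ['Straße', 'Strasse', 'Strato']

print('Straße'.lower())     # straße
print('Straße'.casefold())  # strasse

# With str.lower, 'ß' compares after every ASCII letter
by_lower = sorted(words, key=str.lower)
print(by_lower)
# ['Strasse', 'Strato', 'Straße']

# With str.casefold, 'Straße' and 'Strasse' get the same key
by_casefold = sorted(words, key=str.casefold)
print(by_casefold)
# ['Straße', 'Strasse', 'Strato']
```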

For locale-aware sorting (respecting language-specific ordering rules), use the locale module:

import locale

# Set the locale (names are system-dependent; setlocale raises
# locale.Error if the requested locale is not installed)
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')

words = ['café', 'apple', 'Banana']
sorted_words = sorted(words, key=locale.strxfrm)

Common Pitfalls and Best Practices

Handling None Values

None can’t be compared with other types in Python 3, causing TypeError. Handle it explicitly:

data = [3, 1, None, 4, None, 2]

# This raises TypeError
# sorted(data)

# Solution 1: Put None values at the end
sorted_data = sorted(data, key=lambda x: (x is None, x))
print(sorted_data)
# [1, 2, 3, 4, None, None]

# Solution 2: Put None values at the beginning
sorted_data = sorted(data, key=lambda x: (x is not None, x))
print(sorted_data)
# [None, None, 1, 2, 3, 4]

# Solution 3: Replace None with a default
sorted_data = sorted(data, key=lambda x: x if x is not None else float('inf'))
print(sorted_data)
# [1, 2, 3, 4, None, None]
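The same idea extends to records with optional fields. The sketch below assumes a list of user dictionaries where 'age' may be None or missing entirely; age_key is a hypothetical helper, and it substitutes 0 for the missing value so the second tuple element is always a number.

```python
users = [
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob', 'age': None},
    {'name': 'Charlie', 'age': 25},
    {'name': 'Diana'},  # 'age' key missing entirely
]

def age_key(user):
    """Hypothetical helper: sort by age, with missing or None ages last."""
    age = user.get('age')
    return (age is None, 0 if age is None else age)

by_age = sorted(users, key=age_key)
print([u['name'] for u in by_age])
# ['Charlie', 'Alice', 'Bob', 'Diana']
```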

Expensive Key Functions

Python calls the key function once per element, not once per comparison. In effect, sorted() performs the classic decorate-sort-undecorate pattern internally, so an expensive key needs no manual caching within a single sort:

# If your key function is expensive, sorted() handles it efficiently
# Each element is transformed exactly once

import re

def extract_number(s):
    """Expensive operation: regex parsing"""
    match = re.search(r'\d+', s)
    return int(match.group()) if match else 0

files = ['file10.txt', 'file2.txt', 'file1.txt']
sorted_files = sorted(files, key=extract_number)
print(sorted_files)
# ['file1.txt', 'file2.txt', 'file10.txt']
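One way to check the once-per-element behavior is to record each call to the key. The sketch below does that, then shows functools.lru_cache as one option when the same data is sorted repeatedly with an expensive key. counting_key and cached_length are illustrative names, and len stands in for a genuinely expensive computation.

```python
from functools import lru_cache

calls = []

def counting_key(value):
    """Identity key that records every call it receives."""
    calls.append(value)
    return value

data = [5, 3, 8, 1, 9, 2]
sorted(data, key=counting_key)
print(len(calls))  # 6, one call per element

@lru_cache(maxsize=None)
def cached_length(s):
    """Stand-in for an expensive key; results are memoized across sorts."""
    return len(s)

names = ['file10.txt', 'file2.txt', 'file1.txt']
sorted(names, key=cached_length)
sorted(names, key=cached_length)  # second sort reuses the cached results
print(cached_length.cache_info().hits)  # 3
```

Caching only pays off across separate sorts; within one call to sorted(), each key is already computed exactly once.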

When Not to Use key

If your ordering is easier to express as a pairwise comparison than as a per-element transformation, or you are porting an old-style comparison function, use functools.cmp_to_key to convert it into a key:

from functools import cmp_to_key

def compare_versions(a, b):
    """Compare version strings like '1.2.3'"""
    a_parts = [int(x) for x in a.split('.')]
    b_parts = [int(x) for x in b.split('.')]
    
    for a_val, b_val in zip(a_parts, b_parts):
        if a_val < b_val:
            return -1
        if a_val > b_val:
            return 1
    return len(a_parts) - len(b_parts)

versions = ['1.2.3', '1.10.1', '1.2.10', '2.0.0']
sorted_versions = sorted(versions, key=cmp_to_key(compare_versions))
print(sorted_versions)
# ['1.2.3', '1.2.10', '1.10.1', '2.0.0']

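Prefer plain key functions when possible. A key runs once per element, whereas a comparison function wrapped with cmp_to_key is called for every comparison the sort makes, and key functions fit Python 3’s design philosophy.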
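As an illustration, the ordering produced by compare_versions can also be expressed as a plain key that turns each version string into a tuple of integers. version_key is a hypothetical helper, and it assumes purely numeric, dot-separated versions.

```python
def version_key(version):
    """Hypothetical helper: '1.10.1' -> (1, 10, 1)."""
    return tuple(int(part) for part in version.split('.'))

versions = ['1.2.3', '1.10.1', '1.2.10', '2.0.0']
by_version = sorted(versions, key=version_key)
print(by_version)
# ['1.2.3', '1.2.10', '1.10.1', '2.0.0']
```

Tuple comparison already treats a shorter prefix as smaller, so '1.2' sorts before '1.2.0', matching the length rule in the comparison function.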

The key parameter transforms sorted() from a basic utility into a flexible tool for any ordering requirement. Master these patterns, and you’ll handle complex sorting tasks with minimal code.
