Fisher-Yates Shuffle: Unbiased Random Permutation
Key Insights
- Naive shuffle algorithms that swap each element with a random position produce biased results because they generate n^n possible outcomes instead of the n! permutations needed for uniform distribution.
- The Fisher-Yates algorithm achieves perfect uniformity by progressively shrinking the selection range, ensuring each of the n! permutations has exactly equal probability.
- Most programming languages provide built-in shuffle functions, but understanding Fisher-Yates helps you recognize when implementations are broken and how to properly shuffle partial arrays or streams.
Introduction: Why Shuffling Is Harder Than It Looks
Shuffling an array seems trivial. Loop through, swap things around randomly, done. This intuition has led countless developers to write broken shuffle implementations that look correct but produce subtly biased results.
The consequences range from annoying to catastrophic. A biased shuffle in a card game means certain hands appear more frequently than they should—players will eventually notice. In A/B testing, bias means your “random” user assignments systematically favor certain groups, invalidating your statistical conclusions. In cryptographic applications, predictable “randomness” becomes an attack vector.
The Fisher-Yates shuffle, published in 1938 and modernized by Richard Durstenfeld in 1964, solves this problem elegantly. It runs in O(n) time, uses O(1) extra space, and produces perfectly uniform permutations. Every serious shuffling implementation uses it or a variant of it.
Let’s understand why naive approaches fail and how Fisher-Yates succeeds.
The Naive Shuffle Problem
The most common broken shuffle looks something like this:
```javascript
// DON'T USE THIS - produces biased results
function naiveShuffle(array) {
  const n = array.length;
  for (let i = 0; i < n; i++) {
    const j = Math.floor(Math.random() * n); // random index from 0 to n-1
    [array[i], array[j]] = [array[j], array[i]];
  }
  return array;
}
```
This looks reasonable. For each position, swap it with a randomly chosen position. What’s wrong?
The algorithm makes n independent random choices, each with n possibilities. That’s n^n total possible execution paths. For a 3-element array, that’s 3^3 = 27 paths.
But there are only n! = 6 permutations of 3 elements. Since 27 doesn’t divide evenly by 6, some permutations must be more likely than others. The pigeonhole principle guarantees bias.
Let’s verify this empirically:
```python
import random
from collections import Counter

def naive_shuffle(arr):
    arr = arr.copy()
    n = len(arr)
    for i in range(n):
        j = random.randint(0, n - 1)
        arr[i], arr[j] = arr[j], arr[i]
    return tuple(arr)

def count_permutations(shuffle_func, iterations=100000):
    counts = Counter()
    for _ in range(iterations):
        result = shuffle_func([1, 2, 3])
        counts[result] += 1
    return counts

naive_results = count_permutations(naive_shuffle)
print("Naive shuffle distribution:")
for perm, count in sorted(naive_results.items()):
    expected = 100000 / 6
    deviation = ((count - expected) / expected) * 100
    print(f"  {perm}: {count} ({deviation:+.1f}% from expected)")
```
Running this reveals the bias clearly. For a 3-element array, the overrepresented permutations get 5 of the 27 paths (about 11% more than the expected 1/6) while the others get only 4 of 27 (about 11% less). Over millions of shuffles, this becomes statistically significant and exploitable.
Fisher-Yates Algorithm Explained
The Fisher-Yates shuffle (Durstenfeld variant) fixes the bias by ensuring exactly n! possible execution paths:
```javascript
function fisherYatesShuffle(array) {
  for (let i = array.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1)); // random index from 0 to i
    [array[i], array[j]] = [array[j], array[i]];
  }
  return array;
}
```
The key insight: iterate backward, and at each step, only choose from the unshuffled portion.
Here’s what happens for a 4-element array [A, B, C, D]:
- i = 3: Pick random index 0-3, swap with position 3. Position 3 is now final.
- i = 2: Pick random index 0-2, swap with position 2. Position 2 is now final.
- i = 1: Pick random index 0-1, swap with position 1. Position 1 is now final.
- Position 0 is the only element left—it’s final by default.
The number of choices: 4 × 3 × 2 × 1 = 4! = 24. Exactly the number of permutations. Each permutation maps to exactly one sequence of choices, guaranteeing uniform distribution.
```python
def fisher_yates_shuffle(arr):
    arr = arr.copy()
    for i in range(len(arr) - 1, 0, -1):
        j = random.randint(0, i)
        arr[i], arr[j] = arr[j], arr[i]
    return tuple(arr)

# Compare with naive
fy_results = count_permutations(fisher_yates_shuffle)
print("\nFisher-Yates distribution:")
for perm, count in sorted(fy_results.items()):
    expected = 100000 / 6
    deviation = ((count - expected) / expected) * 100
    print(f"  {perm}: {count} ({deviation:+.1f}% from expected)")
```
The Fisher-Yates results cluster tightly around the expected value, with deviations explained entirely by random sampling variance.
Proving Unbiasedness
For a rigorous proof, consider what probability each element has of ending up in each position.
For element X to end up in position n-1 (the last position), it must be selected in the first swap. Probability: 1/n.
For element X to end up in position n-2, it must NOT be selected in the first swap (probability (n-1)/n), then be selected in the second swap (probability 1/(n-1)). Combined: (n-1)/n × 1/(n-1) = 1/n.
This pattern holds for every position. Each element has exactly 1/n probability of landing in any given position.
```python
def test_position_uniformity(shuffle_func, array_size=5, iterations=50000):
    """Test that each element has equal probability of each position."""
    position_counts = [[0] * array_size for _ in range(array_size)]
    for _ in range(iterations):
        arr = list(range(array_size))
        result = shuffle_func(arr)
        for pos, val in enumerate(result):
            position_counts[val][pos] += 1
    print(f"\nPosition distribution (expected: {iterations/array_size:.0f} each):")
    for val in range(array_size):
        counts = position_counts[val]
        print(f"  Element {val}: {counts}")

test_position_uniformity(fisher_yates_shuffle)
test_position_uniformity(naive_shuffle)  # Shows clear bias
```
Implementation Pitfalls
Even knowing the algorithm, several mistakes can reintroduce bias.
Modulo Bias
If your random number generator produces values 0 to MAX, and you compute rand() % n, you get modulo bias when MAX+1 isn’t divisible by n.
```python
# BAD: Modulo bias when MAX+1 isn't divisible by n
def biased_random(n, max_val=32767):
    return random.randint(0, max_val) % n

# GOOD: Rejection sampling eliminates bias
def unbiased_random(n, max_val=32767):
    limit = max_val - (max_val % n)  # accepted range [0, limit) has size divisible by n
    while True:
        val = random.randint(0, max_val)
        if val < limit:
            return val % n
```
Most modern languages handle this in their standard libraries, but embedded systems and cryptographic code often need manual handling.
Off-by-One Errors
The loop bounds matter critically:
```javascript
// WRONG: j ranges over the whole array - this is the biased naive shuffle
for (let i = array.length - 1; i >= 0; i--) {
  const j = Math.floor(Math.random() * array.length); // Should be (i + 1)
  [array[i], array[j]] = [array[j], array[i]];
}

// WRONG: j can never equal i, so elements can't stay in place
// (this is Sattolo's algorithm, which generates only cyclic permutations)
for (let i = array.length - 1; i > 0; i--) {
  const j = Math.floor(Math.random() * i); // Should be (i + 1)
  [array[i], array[j]] = [array[j], array[i]];
}

// CORRECT
for (let i = array.length - 1; i > 0; i--) {
  const j = Math.floor(Math.random() * (i + 1)); // 0 to i inclusive
  [array[i], array[j]] = [array[j], array[i]];
}
```
An element must be able to “swap with itself” (stay in place). Excluding this possibility changes the probability distribution.
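The effect of forbidding self-swaps is easy to observe. The `Math.random() * i` variant is in fact Sattolo's algorithm, which samples uniformly from cyclic permutations only, so no element ever ends up where it started. A quick empirical check (a Python sketch mirroring the broken JavaScript loop; `sattolo` is just an illustrative name):

```python
import random

def sattolo(arr):
    """The 'no self-swap' variant: uniform over cyclic permutations, not all permutations."""
    arr = arr.copy()
    for i in range(len(arr) - 1, 0, -1):
        j = random.randint(0, i - 1)  # j can never equal i
        arr[i], arr[j] = arr[j], arr[i]
    return arr

trials = [sattolo([0, 1, 2, 3]) for _ in range(10000)]
trials_with_fixed_point = sum(
    any(perm[k] == k for k in range(4)) for perm in trials
)
print(trials_with_fixed_point)  # always 0: no element ever stays in its original slot
```

A correct shuffle would leave at least one element in place in a substantial fraction of runs, so a count of zero over 10,000 trials is unmistakable evidence of a skewed distribution.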
Variations and Applications
Partial Shuffle (Selecting k Random Elements)
Often you only need k random elements from n. Stop early:
```python
def partial_shuffle(arr, k):
    """Return k random elements from arr."""
    arr = arr.copy()
    k = min(k, len(arr))
    for i in range(len(arr) - 1, len(arr) - k - 1, -1):
        j = random.randint(0, i)
        arr[i], arr[j] = arr[j], arr[i]
    return arr[-k:]

# Get 3 random elements from a deck
random_cards = partial_shuffle(list(range(52)), 3)
```
The shuffling itself performs only k swaps, running in O(k) time instead of O(n) (the defensive copy is still O(n); skip it if mutating the input is acceptable).
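As a sanity check on the early-stopping logic, each element should land in the k-element selection with probability k/n. A quick sketch (the function is repeated so the snippet runs standalone, and the seed is arbitrary):

```python
import random
from collections import Counter

def partial_shuffle(arr, k):
    """Return k random elements from arr (partial Fisher-Yates, as above)."""
    arr = arr.copy()
    k = min(k, len(arr))
    for i in range(len(arr) - 1, len(arr) - k - 1, -1):
        j = random.randint(0, i)
        arr[i], arr[j] = arr[j], arr[i]
    return arr[-k:]

random.seed(7)  # arbitrary fixed seed, for a reproducible check
counts = Counter()
for _ in range(20000):
    counts.update(partial_shuffle(list(range(10)), 3))

# Each element should appear about 20000 * 3/10 = 6000 times.
for element in range(10):
    print(element, counts[element])
```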
Inside-Out Variant
When you need a shuffled copy without modifying the original, the inside-out variant builds the result incrementally:
```python
def inside_out_shuffle(source):
    """Create a shuffled copy using inside-out Fisher-Yates."""
    result = []
    for i, item in enumerate(source):
        j = random.randint(0, i)
        if j == len(result):
            result.append(item)
        else:
            result.append(result[j])
            result[j] = item
    return result
```
This is particularly useful for shuffling streams where you don’t know the total size upfront—it’s the foundation of reservoir sampling algorithms.
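To make the streaming connection concrete, here is a minimal sketch of reservoir sampling (Algorithm R), which maintains a uniform k-element sample while consuming a stream of unknown length:

```python
import random

def reservoir_sample(stream, k):
    """Uniform random sample of k items from an iterable of unknown length."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)  # fill the reservoir with the first k items
        else:
            j = random.randint(0, i)  # the new item survives with probability k/(i+1)
            if j < k:
                reservoir[j] = item
    return reservoir

sample = reservoir_sample(iter(range(1000)), 5)
print(sample)  # five distinct values drawn uniformly from 0..999
```

The `randint(0, i)` followed by a swap-or-discard decision is the same shrinking-range idea as Fisher-Yates, applied one arriving element at a time.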
When to Use Built-in vs Roll Your Own
Most languages provide correct shuffle implementations:
- Python: `random.shuffle()` uses Fisher-Yates
- JavaScript: no built-in, and note that `array.sort(() => Math.random() - 0.5)` is WRONG
- Java: `Collections.shuffle()` uses Fisher-Yates
- C++: `std::shuffle()` with a proper random engine
Use the built-in when available. Roll your own only when:
- You need partial shuffles and want O(k) instead of O(n)
- You’re working in an environment without standard libraries
- You need cryptographically secure shuffling with specific RNG requirements
- You’re implementing reservoir sampling or streaming algorithms
Always test shuffle implementations statistically before deploying them in production. The chi-squared test against expected uniform distribution catches most bias issues with sufficient sample sizes.
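Such a check can be sketched as follows; the 11.07 cutoff is the 5% critical value of the chi-squared distribution with 5 degrees of freedom, matching the 6 permutation buckets of a 3-element array (the seed is arbitrary, chosen only to make the run reproducible):

```python
import random
from collections import Counter
from itertools import permutations

def chi_squared_statistic(observed_counts, expected):
    """Chi-squared statistic against a uniform expected count per category."""
    return sum((o - expected) ** 2 / expected for o in observed_counts)

rng = random.Random(42)  # fixed seed so the check is reproducible
counts = Counter()
for _ in range(60000):
    arr = [1, 2, 3]
    rng.shuffle(arr)  # Python's shuffle is Fisher-Yates
    counts[tuple(arr)] += 1

observed = [counts[p] for p in permutations([1, 2, 3])]
stat = chi_squared_statistic(observed, 60000 / 6)
print(f"chi-squared statistic: {stat:.2f}")
```

An unbiased shuffle stays below 11.07 in about 95% of runs; the naive shuffle's roughly ±11% bias drives the statistic far above the cutoff at this sample size.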
The Fisher-Yates shuffle is one of those algorithms every developer should understand. It’s simple, elegant, and solves a problem that’s deceptively easy to get wrong.