Tortoise and Hare: Cycle Detection in Sequences

Key Insights

Floyd’s cycle detection algorithm uses two pointers moving at different speeds to detect cycles in O(n) time with O(1) space—a significant improvement over hash-based approaches that require O(n) memory.
The algorithm works on any sequence defined by a deterministic function, not just linked lists—this makes it applicable to problems ranging from duplicate detection to cryptographic attacks.
Finding the cycle’s starting point requires a second phase where one pointer resets to the beginning, and both move at equal speed until they meet at the cycle entry.

The Problem of Cycles

Cycles lurk in many computational problems. A linked list with a corrupted tail pointer creates an infinite traversal. A web crawler following redirects can get trapped in a loop. A state machine processing events might revisit the same configuration forever.

Detecting these cycles isn’t just about preventing infinite loops—it’s often the key to solving the actual problem. Finding a duplicate number in an array, determining if a number eventually reaches 1 through a specific transformation, or factoring large integers all reduce to cycle detection.

The question isn’t whether you’ll encounter this problem. It’s whether you’ll recognize it when you do.

The Naive Approach (And Why It Falls Short)

The obvious solution is simple: track every element you’ve seen. If you encounter something twice, you’ve found a cycle.

def has_cycle_hashset(head):
    """Detect cycle using a hash set. O(n) time, O(n) space."""
    seen = set()
    current = head
    
    while current is not None:
        if id(current) in seen:
            return True
        seen.add(id(current))
        current = current.next
    
    return False

This works. It’s readable. It’s correct. And for many applications, it’s perfectly fine.

But consider the constraints. If your linked list contains a million nodes, you’re storing a million references. If you’re detecting cycles in a sequence of 64-bit integers, your hash set might consume gigabytes of memory. In embedded systems or memory-constrained environments, this approach becomes impractical or impossible.

There’s also the hidden cost of hashing. Hash set operations are O(1) amortized, but they involve memory allocation, hash computation, and potential collision resolution. When you’re processing billions of elements, these costs add up.

We need something better.

Floyd’s Algorithm Explained

Robert Floyd proposed an elegant solution in 1967 that uses only two pointers and constant memory. The idea is deceptively simple: use two pointers that traverse the sequence at different speeds.

The tortoise moves one step at a time. The hare moves two steps at a time. If there’s no cycle, the hare reaches the end and we’re done. If there is a cycle, both pointers eventually enter it—and here’s the key insight—the hare will eventually catch up to the tortoise from behind.

Think of it like two runners on a circular track. The faster runner will eventually lap the slower one. They must meet.

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def has_cycle(head):
    """Floyd's cycle detection. O(n) time, O(1) space."""
    if head is None or head.next is None:
        return False
    
    slow = head        # Tortoise
    fast = head.next   # Hare starts one ahead
    
    while slow != fast:
        if fast is None or fast.next is None:
            return False  # Reached end, no cycle
        slow = slow.next
        fast = fast.next.next
    
    return True  # Pointers met, cycle exists

The algorithm terminates in one of two ways: either the fast pointer reaches a null reference (no cycle), or the two pointers meet (cycle detected). There’s no third option.

The Math Behind the Magic

Why does this work? Let’s prove it.

Assume the linked list has a “tail” of length T leading into a cycle of length C. When the tortoise enters the cycle, it has traveled T steps. At this moment, the hare has traveled 2T steps, which means it’s at position (2T - T) mod C = T mod C within the cycle.

Let’s say the hare is k steps ahead of the tortoise within the cycle, where k = T mod C. Each subsequent step, the tortoise advances 1 position and the hare advances 2. The gap between them increases by 1 step per iteration.

But we’re on a cycle. The hare isn’t getting further ahead—it’s catching up from behind. After C - k more steps, the hare will have gained C - k positions, putting it exactly k + (C - k) = C positions ahead. On a cycle of length C, that means they’re at the same position.

The tortoise travels at most T + C steps before they meet. The hare travels at most 2(T + C) steps. Both are O(n) where n is the total number of nodes.

Space complexity? Two pointers. O(1). That’s the whole point.

Finding the Cycle Start Point

Detecting a cycle is often just the first step. We frequently need to know where the cycle begins. Floyd’s algorithm has a beautiful second phase that finds exactly this.

Here’s the mathematical insight. When the tortoise and hare meet, they’re at some position within the cycle. Let’s call the meeting point M steps into the cycle. The tortoise has traveled T + M steps total. The hare has traveled 2(T + M) steps.

Since the hare travels exactly twice as far, and they’re at the same position, the extra distance T + M must be a multiple of the cycle length: T + M = kC for some integer k.

Rearranging: T = kC - M. This means if we start a new pointer at the head and move it T steps, it reaches the cycle start. Meanwhile, if we move our meeting-point pointer T steps forward within the cycle, it travels kC - M steps, which brings it to position (M + kC - M) mod C = 0—also the cycle start.

Both pointers reach the cycle start after exactly T steps. We don’t need to know T; we just move both pointers one step at a time until they meet.

def find_cycle_start(head):
    """Find the node where the cycle begins. Returns None if no cycle."""
    if head is None or head.next is None:
        return None
    
    # Phase 1: Detect cycle
    slow = head
    fast = head
    
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow == fast:
            break
    else:
        return None  # No cycle
    
    # Phase 2: Find cycle start
    # Reset one pointer to head, keep other at meeting point
    pointer1 = head
    pointer2 = slow
    
    while pointer1 != pointer2:
        pointer1 = pointer1.next
        pointer2 = pointer2.next
    
    return pointer1  # Cycle start node

Beyond Linked Lists: Functional Cycle Detection

Floyd’s algorithm doesn’t require a linked list. It works on any sequence where the next element is determined by a function applied to the current element. If x_{n+1} = f(x_n) for some function f, we can detect cycles.

This abstraction opens up powerful applications. Consider finding a duplicate in an array where values are in the range [1, n] for an array of length n + 1. We can treat the array as a function: f(i) = nums[i]. Starting from index 0, we follow this function and look for a cycle—the duplicate number is where the cycle begins.

def find_duplicate(nums):
    """
    Find duplicate in array where nums[i] is in [1, n] for array of length n+1.
    Uses the array as an implicit linked list.
    """
    # Phase 1: Find meeting point
    slow = nums[0]
    fast = nums[0]
    
    while True:
        slow = nums[slow]
        fast = nums[nums[fast]]
        if slow == fast:
            break
    
    # Phase 2: Find cycle entrance (the duplicate)
    slow = nums[0]
    while slow != fast:
        slow = nums[slow]
        fast = nums[fast]
    
    return slow

This solves LeetCode’s “Find the Duplicate Number” in O(n) time and O(1) space—constraints that rule out sorting or hash sets.

The same principle powers Pollard’s rho algorithm for integer factorization, where the sequence is defined by f(x) = (x² + c) mod n. Cycle detection helps find non-trivial factors of composite numbers.

Practical Applications and Edge Cases

The “Happy Number” problem demonstrates Floyd’s algorithm on a purely mathematical sequence. A happy number eventually reaches 1 when you repeatedly sum the squares of its digits. An unhappy number enters a cycle that never includes 1.

def is_happy(n):
    """Determine if n is a happy number using Floyd's algorithm."""
    def get_next(num):
        total = 0
        while num > 0:
            digit = num % 10
            total += digit * digit
            num //= 10
        return total
    
    slow = n
    fast = get_next(n)
    
    while fast != 1 and slow != fast:
        slow = get_next(slow)
        fast = get_next(get_next(fast))
    
    return fast == 1

Watch out for edge cases. Empty inputs and single elements need explicit handling—there’s nothing for pointers to traverse. Sequences without cycles terminate when the fast pointer hits a boundary. Always verify your termination conditions handle both cyclic and acyclic inputs.

Performance benchmarks consistently show Floyd’s algorithm matching or beating hash-based approaches in execution time while dramatically reducing memory usage. The constant factors are small: pointer comparisons and assignments are cheap operations.

When to Reach for This Tool

Use Floyd’s algorithm when memory matters, when you’re working with implicit sequences defined by functions, or when the problem structure maps naturally to cycle detection. The “duplicate number” problem doesn’t mention cycles anywhere—recognizing the underlying structure is the real skill.

The tortoise and hare pattern appears in disguise across competitive programming, systems debugging, and algorithm design. Once you internalize it, you’ll spot applications everywhere.