Threaded Binary Tree: In-Order Traversal Without Stack

Key Insights

  • Threaded binary trees repurpose null pointers to store in-order traversal links, enabling O(1) space traversal instead of O(h) stack space
  • Morris traversal temporarily creates threads during traversal and restores the original tree structure, making it both space-efficient and non-destructive
  • Threading adds complexity to insertion and deletion operations, making it most valuable in memory-constrained environments or when traversal frequency far exceeds modification frequency

The Stack Problem in Tree Traversal

Every time you write a recursive in-order traversal, you’re paying a hidden cost. That elegant three-line function consumes O(h) stack space, where h is the tree height. For a balanced tree with a million nodes, that’s roughly 20 stack frames. For an unbalanced tree? You’re looking at potential stack overflow.

The iterative approach with an explicit stack doesn’t solve the problem—it just makes the space consumption visible. You’re still allocating O(h) memory for traversal state.

Threaded binary trees offer a fundamentally different approach. They exploit the observation that a binary tree with n nodes contains exactly n+1 null pointers: the n nodes provide 2n pointer fields, and only n−1 of them point to children, since every node except the root is some node's child. Repurposing this wasted space enables O(1) traversal. It's one of those ideas that feels obvious in retrospect but took decades to formalize.

What is a Threaded Binary Tree?

In a standard binary tree, leaf nodes and nodes with single children contain null pointers. A threaded binary tree converts these null pointers into “threads”—pointers to the node’s in-order predecessor or successor.

Consider a node with no right child. In a standard tree, that right pointer is null. In a right-threaded tree, it points to the node’s in-order successor. This creates a shortcut for traversal: instead of backtracking up the tree, you follow the thread directly to the next node.

Single threading uses only right null pointers to point to in-order successors. Double threading additionally uses left null pointers to point to in-order predecessors, enabling bidirectional traversal.

Standard BST:              Right-Threaded BST:

       4                          4
      / \                        / \
     2   5                      2   5
    / \   \                    / \   \
   1   3   6                  1   3   6
                                \   \
                                 2   4   (threads: 1 → 2, 3 → 4;
                                          6 has no in-order successor,
                                          so its right thread stays null)

The critical implementation detail is distinguishing threads from actual child pointers. We need boolean flags:

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct ThreadedNode {
    int data;
    struct ThreadedNode *left;
    struct ThreadedNode *right;
    bool left_is_thread;   // true if left points to predecessor
    bool right_is_thread;  // true if right points to successor
} ThreadedNode;

ThreadedNode* create_node(int data) {
    ThreadedNode *node = malloc(sizeof(ThreadedNode));
    if (node == NULL) {
        return NULL;  // allocation failure
    }
    node->data = data;
    node->left = NULL;
    node->right = NULL;
    node->left_is_thread = true;   // null pointers are threads by default
    node->right_is_thread = true;
    return node;
}

The boolean flags consume additional memory per node, but this overhead is typically negligible compared to the traversal space savings, especially for large trees.

Building a Threaded Binary Tree

Converting an existing BST to a threaded tree requires establishing the in-order relationships. The most elegant approach uses reverse in-order traversal (right-node-left), maintaining a pointer to the previously visited node:

void convert_to_threaded(ThreadedNode *root, ThreadedNode **prev) {
    if (root == NULL) return;
    
    // Process right subtree first (reverse in-order), so *prev
    // always holds the in-order successor of the current node
    convert_to_threaded(root->right, prev);
    
    // If right child is null, thread it to the successor
    if (root->right == NULL) {
        root->right = *prev;
        root->right_is_thread = true;
    } else {
        root->right_is_thread = false;
    }
    
    // Current node becomes the successor for nodes visited later
    *prev = root;
    
    // Single (right) threading: null left pointers keep their default
    // thread flag; mark actual left children as non-threads. (Double
    // threading would fill predecessor links in a second, forward
    // in-order pass.)
    if (root->left != NULL) {
        root->left_is_thread = false;
    }
    
    // Process left subtree
    convert_to_threaded(root->left, prev);
}

void make_threaded(ThreadedNode *root) {
    ThreadedNode *prev = NULL;
    convert_to_threaded(root, &prev);
}

This conversion runs in O(n) time and uses O(h) stack space for the recursion. Ironic, given our goal is O(1) traversal—but this is a one-time cost.

Morris Traversal: In-Order Without Extra Space

Morris traversal is the crown jewel of threaded tree algorithms. It achieves O(1) space in-order traversal by temporarily threading the tree during traversal, then restoring the original structure. No permanent modification required.

The algorithm’s insight: before moving to a left subtree, create a temporary thread from the rightmost node of that subtree back to the current node. This thread serves as a “breadcrumb” for returning after processing the subtree.

// Operates on an ordinary (unthreaded) tree: only the child pointers
// are read and written; the thread flags are never consulted.
void morris_inorder(ThreadedNode *root) {
    ThreadedNode *current = root;
    
    while (current != NULL) {
        if (current->left == NULL) {
            // No left subtree: visit node, move right
            printf("%d ", current->data);
            current = current->right;
        } else {
            // Find in-order predecessor (rightmost in left subtree)
            ThreadedNode *predecessor = current->left;
            while (predecessor->right != NULL && 
                   predecessor->right != current) {
                predecessor = predecessor->right;
            }
            
            if (predecessor->right == NULL) {
                // First visit: create thread, go left
                predecessor->right = current;
                current = current->left;
            } else {
                // Second visit: remove thread, visit node, go right
                predecessor->right = NULL;
                printf("%d ", current->data);
                current = current->right;
            }
        }
    }
}

The algorithm visits each node at most twice: once when creating the thread, once when following it back. This gives O(n) time complexity with O(1) auxiliary space.

Note that Morris traversal mutates the tree while it runs: a concurrent reader can observe the temporary threads and follow a right pointer back up the tree. Using it alongside other readers or writers requires external synchronization.

Insertion and Deletion in Threaded Trees

Maintaining threads during modifications is where complexity increases significantly. Every insertion or deletion must update not just parent-child relationships, but also thread pointers of affected nodes.

ThreadedNode* threaded_insert(ThreadedNode *root, int data) {
    ThreadedNode *new_node = create_node(data);
    
    if (root == NULL) {
        return new_node;
    }
    
    ThreadedNode *current = root;
    ThreadedNode *parent = NULL;
    
    // Find insertion point
    while (current != NULL) {
        parent = current;
        if (data < current->data) {
            if (current->left_is_thread) {
                break;  // Found insertion point
            }
            current = current->left;
        } else {
            if (current->right_is_thread) {
                break;  // Found insertion point
            }
            current = current->right;
        }
    }
    
    // Insert and establish threads
    if (data < parent->data) {
        // New node inherits parent's left thread as its left thread
        new_node->left = parent->left;
        new_node->left_is_thread = parent->left_is_thread;
        
        // New node's right thread points to parent (its successor)
        new_node->right = parent;
        new_node->right_is_thread = true;
        
        // Parent's left now points to new node (actual child)
        parent->left = new_node;
        parent->left_is_thread = false;
    } else {
        // New node inherits parent's right thread as its right thread
        new_node->right = parent->right;
        new_node->right_is_thread = parent->right_is_thread;
        
        // New node's left thread points to parent (its predecessor)
        new_node->left = parent;
        new_node->left_is_thread = true;
        
        // Parent's right now points to new node (actual child)
        parent->right = new_node;
        parent->right_is_thread = false;
    }
    
    return root;
}

Deletion is considerably more complex, requiring updates to potentially multiple thread pointers. The in-order predecessor of a deleted node must have its right thread updated to point to the deleted node’s successor, and vice versa.

Performance Analysis and Trade-offs

Operation                   Standard BST    Threaded BST
In-order traversal space    O(h)            O(1)
In-order traversal time     O(n)            O(n)
Insertion time              O(h)            O(h)
Insertion complexity        Simple          Moderate
Deletion complexity         Moderate        High
Memory per node             2 pointers      2 pointers + 2 bools

The space savings during traversal come at the cost of implementation complexity and slightly increased per-node memory. Threading makes sense when:

  1. Memory is severely constrained: Embedded systems where stack space is limited
  2. Traversal dominates: Read-heavy workloads with infrequent modifications
  3. Iterator semantics needed: When you need to pause and resume traversal efficiently

Threading doesn’t make sense for write-heavy workloads or when implementation simplicity matters more than traversal efficiency.

Practical Applications

Embedded systems frequently use threaded trees. When your entire system has 64KB of RAM, an O(h) traversal stack is a luxury you can’t afford. Sensor data collection systems often use threaded trees for ordered data storage with minimal memory overhead.

Database cursor implementations benefit from threading. A cursor pointing to a row in a B-tree index can efficiently move to the next row by following thread pointers, without maintaining traversal state.

The Linux kernel’s red-black tree implementation achieves the same effect with its rb_next() and rb_prev() functions, though it walks parent pointers rather than threads, enabling efficient iteration through sorted data structures like the virtual memory area (VMA) tree.

Iterator implementations in memory-constrained environments use threading to provide next() operations in O(1) time without storing traversal state in the iterator object.

The key insight is that threaded trees trade implementation complexity for runtime efficiency. In production systems where traversal is a hot path and memory is constrained, that trade-off often pays off. For typical application code where correctness and maintainability matter more than squeezing out the last byte, stick with the stack-based approach.
