Stack Applications: Expression Evaluation and Parentheses Matching

Stacks solve a specific class of problems elegantly: anything involving nested, hierarchical, or reversible operations. The Last-In-First-Out (LIFO) principle directly maps to how we process paired...

Key Insights

  • Stacks naturally mirror the nested structure of expressions and paired delimiters, making them the ideal data structure for parsing and evaluation tasks
  • The shunting yard algorithm converts human-readable infix notation to machine-friendly postfix in a single pass, handling operator precedence without recursion
  • Building a complete expression evaluator requires just two stacks and roughly 100 lines of code, yet this pattern powers everything from spreadsheet formulas to compiler front-ends

Introduction to Stack-Based Parsing

Stacks solve a specific class of problems elegantly: anything involving nested, hierarchical, or reversible operations. The Last-In-First-Out (LIFO) principle directly maps to how we process paired delimiters and operator precedence.

Consider the expression ((a + b) * c). When you encounter the first (, you need to remember it. When you see the second (, you need to remember that too—but you’ll resolve it first. This is exactly what a stack does. The most recent opening delimiter is always the one that should match the next closing delimiter.

This same principle applies to expression evaluation. When you see 2 + 3 * 4, you can’t immediately compute 2 + 3 because * has higher precedence. You need to “remember” the 2 + while you resolve 3 * 4. A stack holds these deferred operations until you’re ready to apply them.

Parentheses Matching Fundamentals

The balanced parentheses problem is the canonical stack application. Given a string containing brackets, determine if every opener has a corresponding closer in the correct order.

The algorithm is straightforward:

  1. Push opening brackets onto the stack
  2. When you see a closing bracket, pop and verify it matches
  3. If the stack is empty at the end, the string is balanced

Here’s a clean implementation:

def is_balanced(text: str) -> bool:
    pairs = {')': '(', ']': '[', '}': '{'}
    openers = set(pairs.values())
    stack = []
    
    for char in text:
        if char in openers:
            stack.append(char)
        elif char in pairs:
            if not stack or stack.pop() != pairs[char]:
                return False
    
    return len(stack) == 0

Edge cases matter here. An empty string is balanced. A string with only closers fails immediately. A string with unmatched openers fails the final check. The function handles all three correctly.

For production code, you’ll want more than a boolean. You need to know where the error occurred:

def validate_brackets(text: str) -> tuple[bool, str]:
    pairs = {')': '(', ']': '[', '}': '{'}
    openers = set(pairs.values())
    stack = []  # Store (char, position) tuples
    
    for pos, char in enumerate(text):
        if char in openers:
            stack.append((char, pos))
        elif char in pairs:
            if not stack:
                return False, f"Unmatched '{char}' at position {pos}"
            opener, open_pos = stack.pop()
            if opener != pairs[char]:
                return False, f"Mismatched '{char}' at position {pos}, expected match for '{opener}' at {open_pos}"
    
    if stack:
        char, pos = stack[-1]
        return False, f"Unclosed '{char}' at position {pos}"
    
    return True, "Valid"

Extending to Real-World Syntax Validation

The same pattern scales to complex syntax validation. HTML and XML tags are just named parentheses. Code blocks in languages like Python or Ruby follow similar nesting rules.

Here’s an HTML tag validator with proper error reporting:

import re
from dataclasses import dataclass

@dataclass
class TagInfo:
    name: str
    line: int
    column: int

def validate_html_tags(html: str) -> tuple[bool, str]:
    tag_pattern = re.compile(r'<(/?)(\w+)[^>]*>')
    stack: list[TagInfo] = []
    
    line = 1
    line_start = 0
    
    for match in tag_pattern.finditer(html):
        # Track line numbers
        newlines = html.count('\n', line_start, match.start())
        if newlines:
            line += newlines
            line_start = html.rfind('\n', line_start, match.start()) + 1
        
        column = match.start() - line_start + 1
        is_closing = match.group(1) == '/'
        tag_name = match.group(2).lower()
        
        # Skip self-closing tags
        if tag_name in {'br', 'hr', 'img', 'input', 'meta', 'link'}:
            continue
        
        if not is_closing:
            stack.append(TagInfo(tag_name, line, column))
        else:
            if not stack:
                return False, f"Unexpected closing </{tag_name}> at line {line}, column {column}"
            
            opener = stack.pop()
            if opener.name != tag_name:
                return False, (f"Mismatched tags: <{opener.name}> at line {opener.line}, "
                             f"column {opener.column} closed by </{tag_name}> at line {line}, column {column}")
    
    if stack:
        tag = stack[-1]
        return False, f"Unclosed <{tag.name}> at line {tag.line}, column {tag.column}"
    
    return True, "Valid HTML structure"

This validator provides actionable error messages. Instead of “invalid HTML,” users see exactly which tag is problematic and where to find it.

Infix to Postfix Conversion (Shunting Yard Algorithm)

Infix notation (3 + 4 * 2) is human-friendly but machine-hostile. Operator precedence creates ambiguity that requires lookahead to resolve. Postfix notation (3 4 2 * +) eliminates this entirely—operators appear in the exact order they should be applied.

Dijkstra’s shunting yard algorithm converts infix to postfix in a single left-to-right pass. The name comes from its resemblance to a railroad shunting yard where cars are sorted onto different tracks.

The rules:

  1. Numbers go directly to output
  2. Operators go to a stack, but first pop any higher-precedence operators to output
  3. Left parentheses go to the stack
  4. Right parentheses pop operators to output until matching left parenthesis
  5. At the end, pop remaining operators to output
def infix_to_postfix(expression: str) -> list[str]:
    precedence = {'+': 1, '-': 1, '*': 2, '/': 2, '^': 3}
    right_associative = {'^'}
    
    output = []
    operator_stack = []
    
    tokens = tokenize(expression)  # Splits into numbers and operators
    
    for token in tokens:
        if is_number(token):
            output.append(token)
        elif token == '(':
            operator_stack.append(token)
        elif token == ')':
            while operator_stack and operator_stack[-1] != '(':
                output.append(operator_stack.pop())
            if not operator_stack:
                raise ValueError("Mismatched parentheses")
            operator_stack.pop()  # Remove the '('
        elif token in precedence:
            while (operator_stack and 
                   operator_stack[-1] != '(' and
                   operator_stack[-1] in precedence and
                   (precedence[operator_stack[-1]] > precedence[token] or
                    (precedence[operator_stack[-1]] == precedence[token] and 
                     token not in right_associative))):
                output.append(operator_stack.pop())
            operator_stack.append(token)
    
    while operator_stack:
        op = operator_stack.pop()
        if op == '(':
            raise ValueError("Mismatched parentheses")
        output.append(op)
    
    return output

def tokenize(expression: str) -> list[str]:
    tokens = []
    current_number = []
    
    for char in expression:
        if char.isdigit() or char == '.':
            current_number.append(char)
        else:
            if current_number:
                tokens.append(''.join(current_number))
                current_number = []
            if char in '+-*/^()':
                tokens.append(char)
            elif not char.isspace():
                raise ValueError(f"Unknown character: {char}")
    
    if current_number:
        tokens.append(''.join(current_number))
    
    return tokens

def is_number(token: str) -> bool:
    try:
        float(token)
        return True
    except ValueError:
        return False

Postfix Expression Evaluation

Postfix evaluation is remarkably simple. Scan left to right: push numbers, apply operators to the top two stack values, push the result.

def evaluate_postfix(tokens: list[str]) -> float:
    stack = []
    
    operations = {
        '+': lambda a, b: a + b,
        '-': lambda a, b: a - b,
        '*': lambda a, b: a * b,
        '/': lambda a, b: a / b,
        '^': lambda a, b: a ** b,
    }
    
    for token in tokens:
        if token in operations:
            if len(stack) < 2:
                raise ValueError(f"Insufficient operands for '{token}'")
            b = stack.pop()
            a = stack.pop()
            if token == '/' and b == 0:
                raise ValueError("Division by zero")
            stack.append(operations[token](a, b))
        else:
            stack.append(float(token))
    
    if len(stack) != 1:
        raise ValueError("Invalid expression: too many operands")
    
    return stack[0]

The beauty of postfix is that there’s no ambiguity. 3 4 2 * + always means “multiply 4 by 2, then add 3.” No precedence rules needed during evaluation.

Building a Complete Expression Evaluator

Combining these pieces yields a functional calculator:

class Calculator:
    def __init__(self):
        self.precedence = {'+': 1, '-': 1, '*': 2, '/': 2, '^': 3}
        self.right_associative = {'^'}
        self.operations = {
            '+': lambda a, b: a + b,
            '-': lambda a, b: a - b,
            '*': lambda a, b: a * b,
            '/': lambda a, b: a / b,
            '^': lambda a, b: a ** b,
        }
    
    def evaluate(self, expression: str) -> float:
        tokens = self._tokenize(expression)
        postfix = self._to_postfix(tokens)
        return self._evaluate_postfix(postfix)
    
    def _tokenize(self, expression: str) -> list[str]:
        tokens = []
        i = 0
        while i < len(expression):
            if expression[i].isdigit() or expression[i] == '.':
                j = i
                while j < len(expression) and (expression[j].isdigit() or expression[j] == '.'):
                    j += 1
                tokens.append(expression[i:j])
                i = j
            elif expression[i] in '+-*/^()':
                tokens.append(expression[i])
                i += 1
            elif expression[i].isspace():
                i += 1
            else:
                raise ValueError(f"Unexpected character: '{expression[i]}'")
        return tokens
    
    def _to_postfix(self, tokens: list[str]) -> list[str]:
        output = []
        ops = []
        
        for token in tokens:
            if self._is_number(token):
                output.append(token)
            elif token == '(':
                ops.append(token)
            elif token == ')':
                while ops and ops[-1] != '(':
                    output.append(ops.pop())
                if not ops:
                    raise ValueError("Mismatched parentheses")
                ops.pop()
            elif token in self.precedence:
                while (ops and ops[-1] in self.precedence and
                       (self.precedence[ops[-1]] > self.precedence[token] or
                        (self.precedence[ops[-1]] == self.precedence[token] and
                         token not in self.right_associative))):
                    output.append(ops.pop())
                ops.append(token)
        
        while ops:
            if ops[-1] == '(':
                raise ValueError("Mismatched parentheses")
            output.append(ops.pop())
        
        return output
    
    def _evaluate_postfix(self, tokens: list[str]) -> float:
        stack = []
        for token in tokens:
            if token in self.operations:
                b, a = stack.pop(), stack.pop()
                stack.append(self.operations[token](a, b))
            else:
                stack.append(float(token))
        return stack[0]
    
    def _is_number(self, token: str) -> bool:
        try:
            float(token)
            return True
        except ValueError:
            return False

# Usage
calc = Calculator()
print(calc.evaluate("3 + 4 * 2"))        # 11.0
print(calc.evaluate("(3 + 4) * 2"))      # 14.0
print(calc.evaluate("2 ^ 3 ^ 2"))        # 512.0 (right-associative)

Performance and Practical Considerations

Both parentheses matching and expression evaluation run in O(n) time and O(n) space, where n is the input length. Each character or token is processed exactly once, and the stack never exceeds the input size.

These patterns appear everywhere in production systems. Compilers use them for parsing expressions. IDEs use them for bracket matching and syntax highlighting. Spreadsheet applications evaluate cell formulas this way. Linters validate code structure using the same stack-based approach.

When extending this code, consider: unary operators (negation requires context-aware tokenization), function calls (treat function names like left parentheses), and variables (look them up during postfix evaluation). Each extension adds complexity but follows the same fundamental stack-based pattern.

The stack is one of the simplest data structures, yet it solves these parsing problems more elegantly than any alternative. That’s the mark of a good abstraction—simple primitives that compose into powerful solutions.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.