Memoization: Caching Function Results


Key Insights

  • Memoization trades memory for speed by caching function results, but only works correctly with pure functions that have no side effects and return consistent outputs for the same inputs.
  • Unbounded caches are memory leaks waiting to happen—always implement size limits or TTL expiration in production code.
  • Before writing custom memoization, reach for built-in solutions like Python’s @functools.lru_cache or React’s useMemo, which handle edge cases you’ll likely miss.

What is Memoization?

Memoization is an optimization technique that caches the results of expensive function calls and returns the cached result when the same inputs occur again. The term comes from the Latin “memorandum” (to be remembered), and that’s exactly what it does—it remembers previous computations.

Don’t confuse memoization with general caching. While both store data for later retrieval, memoization specifically refers to caching function return values based on their arguments. It’s a subset of caching applied at the function level.

Memoization shines when you have pure functions (same inputs always produce same outputs), expensive computations, and repeated calls with identical arguments. The classic example is recursive Fibonacci:

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# fibonacci(40) makes over 300 million function calls
# fibonacci(50) would take minutes to complete

This naive implementation has O(2^n) time complexity. Computing fibonacci(40) calls fibonacci(1) over 100 million times. Each call recomputes values we’ve already calculated. This is exactly the problem memoization solves.

How Memoization Works

The pattern is straightforward: before computing, check if you’ve seen these arguments before. If yes, return the cached result. If no, compute the result, store it, then return it.

def fibonacci_memoized(n, cache=None):
    if cache is None:
        cache = {}
    
    if n in cache:
        return cache[n]
    
    if n <= 1:
        return n
    
    result = fibonacci_memoized(n - 1, cache) + fibonacci_memoized(n - 2, cache)
    cache[n] = result
    return result

# fibonacci_memoized(40) now makes only 79 function calls
# fibonacci_memoized(100) completes instantly

This drops our time complexity from O(2^n) to O(n). We compute each Fibonacci number exactly once.
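You can verify the 79-call figure by instrumenting the memoized version with a counter. This is the same function as above, renamed `fib_counted` for the sketch, with a global counter that counts every invocation—including cache hits and base cases:

```python
call_count = 0  # counts every invocation, including cache hits and base cases

def fib_counted(n, cache=None):
    global call_count
    call_count += 1
    if cache is None:
        cache = {}
    if n in cache:
        return cache[n]
    if n <= 1:
        return n
    result = fib_counted(n - 1, cache) + fib_counted(n - 2, cache)
    cache[n] = result
    return result

result = fib_counted(40)
print(result, call_count)  # 102334155 79
```

The arithmetic checks out: each value from 2 to 40 is computed exactly once and makes two recursive calls (39 × 2 = 78), plus the initial call, for 79 total.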

The trade-off is memory. We’re storing every computed result. For Fibonacci, this is trivial—storing 100 integers uses negligible memory. For functions returning large objects or receiving many unique argument combinations, memory consumption becomes a real concern.

Choosing cache keys matters. For simple arguments like integers or strings, the argument itself works as a key. For complex objects, you need a serialization strategy—often converting arguments to a string representation or using a hash.
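One common approach for complex arguments is to serialize them deterministically and hash the result. Here's a sketch—`make_key` is an illustrative name, and `default=str` is a lossy fallback for values that aren't natively JSON-serializable:

```python
import hashlib
import json

def make_key(*args, **kwargs):
    # Serialize deterministically: sort_keys makes dict key order irrelevant;
    # default=str is a lossy fallback for non-JSON-serializable values.
    payload = json.dumps([args, kwargs], sort_keys=True, default=str)
    # Hash to a compact, fixed-size key.
    return hashlib.sha256(payload.encode()).hexdigest()

key1 = make_key({"user": "ada", "role": "admin"})
key2 = make_key({"role": "admin", "user": "ada"})
print(key1 == key2)  # True: dict key order doesn't affect the key
```

Hashing keeps keys small when arguments are large, at the cost of a (vanishingly unlikely) collision risk and some serialization overhead per call.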

Implementing a Generic Memoize Function

Rather than adding caching logic to every function, create a reusable wrapper:

function memoize(fn) {
  const cache = new Map();
  
  return function(...args) {
    const key = JSON.stringify(args);
    
    if (cache.has(key)) {
      return cache.get(key);
    }
    
    const result = fn.apply(this, args);
    cache.set(key, result);
    return result;
  };
}

// Usage
const expensiveCalculation = (x, y) => {
  console.log('Computing...');
  return x * y + Math.pow(x, y);
};

const memoizedCalc = memoize(expensiveCalculation);

memoizedCalc(2, 10);  // Logs "Computing...", returns 1044
memoizedCalc(2, 10);  // Returns 1044 immediately, no log
memoizedCalc(3, 5);   // Logs "Computing...", returns 258

Python’s decorator syntax makes this even cleaner:

def memoize(func):
    cache = {}
    
    def wrapper(*args, **kwargs):
        # Create a hashable key from args and kwargs
        key = (args, tuple(sorted(kwargs.items())))
        
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        
        return cache[key]
    
    wrapper.cache = cache  # Expose cache for debugging/clearing
    return wrapper

@memoize
def expensive_operation(x, y):
    print(f"Computing {x}, {y}...")
    return x ** y

expensive_operation(2, 10)  # Computes
expensive_operation(2, 10)  # Returns cached

The JSON.stringify approach for key generation works for primitive arguments but fails with circular references, and it is sensitive to property insertion order—{a: 1, b: 2} and {b: 2, a: 1} produce different keys despite holding the same data. For production code, consider more robust serialization or restrict memoization to functions with simple argument types.

Built-in and Library Solutions

Don’t reinvent the wheel. Most languages provide battle-tested memoization utilities.

Python’s functools module offers two excellent options:

from functools import lru_cache, cache

# @cache is equivalent to @lru_cache(maxsize=None)
# Use when you want unlimited caching
@cache
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# @lru_cache limits cache size using LRU eviction
@lru_cache(maxsize=128)
def expensive_api_parse(url: str, format: str) -> dict:
    # Simulating expensive parsing
    response = fetch_and_parse(url, format)
    return response

# Inspect cache statistics
print(expensive_api_parse.cache_info())
# CacheInfo(hits=45, misses=12, maxsize=128, currsize=12)

# Clear the cache when needed
expensive_api_parse.cache_clear()

In React, useMemo and useCallback provide component-level memoization:

function ExpensiveComponent({ items, filter }) {
  // Only recompute when items or filter change
  const filteredItems = useMemo(() => {
    console.log('Filtering...');
    return items.filter(item => item.category === filter);
  }, [items, filter]);

  // Memoize callback to prevent child re-renders
  const handleClick = useCallback((id) => {
    console.log(`Clicked ${id}`);
  }, []);

  return (
    <ItemList items={filteredItems} onItemClick={handleClick} />
  );
}

These hooks compare dependencies by reference, not value. This is crucial—passing a new array or object reference on every render defeats the purpose entirely.

Cache Invalidation Strategies

Unbounded caches grow forever. In long-running applications, this means memory leaks. You need eviction strategies.

LRU (Least Recently Used) evicts the entries that haven’t been accessed for the longest time once the cache reaches capacity. Python’s @lru_cache(maxsize=128) implements this automatically.

TTL (Time To Live) expires entries after a duration. This is essential when cached data can become stale:

import time
from functools import wraps

def memoize_with_ttl(ttl_seconds):
    def decorator(func):
        cache = {}
        
        @wraps(func)
        def wrapper(*args):
            now = time.time()
            
            if args in cache:
                result, timestamp = cache[args]
                if now - timestamp < ttl_seconds:
                    return result
            
            result = func(*args)
            cache[args] = (result, now)
            return result
        
        def clear_expired():
            now = time.time()
            expired = [k for k, (_, ts) in cache.items() 
                      if now - ts >= ttl_seconds]
            for k in expired:
                del cache[k]
        
        wrapper.clear_expired = clear_expired
        wrapper.cache = cache
        return wrapper
    
    return decorator

@memoize_with_ttl(ttl_seconds=300)  # 5-minute cache
def fetch_user_profile(user_id):
    return api.get_user(user_id)

For production systems, consider combining strategies: LRU for size limits plus TTL for freshness guarantees.
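A combined version might look like this—a sketch built on OrderedDict (which tracks recency of use), with illustrative names and no thread safety:

```python
import time
from collections import OrderedDict
from functools import wraps

def memoize_lru_ttl(maxsize=128, ttl_seconds=300):
    """Sketch: size-bounded LRU eviction plus TTL expiry (not thread-safe)."""
    def decorator(func):
        cache = OrderedDict()  # maps args -> (result, timestamp); order = recency

        @wraps(func)
        def wrapper(*args):
            now = time.time()
            if args in cache:
                result, timestamp = cache[args]
                if now - timestamp < ttl_seconds:
                    cache.move_to_end(args)  # mark as most recently used
                    return result
                del cache[args]  # expired entry
            result = func(*args)
            cache[args] = (result, now)
            if len(cache) > maxsize:
                cache.popitem(last=False)  # evict least recently used
            return result

        wrapper.cache = cache
        return wrapper
    return decorator

@memoize_lru_ttl(maxsize=2, ttl_seconds=300)
def square(x):
    return x * x

square(1); square(2); square(3)   # maxsize=2: the entry for 1 gets evicted
print(list(square.cache.keys()))  # [(2,), (3,)]
```

Deleting and reinserting expired entries keeps the recency order honest; a fresh result should count as recently used, not inherit the old entry's position.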

Common Pitfalls and When Not to Memoize

Memoization breaks spectacularly with impure functions:

counter = 0

@cache
def impure_function(x):
    global counter
    counter += 1  # Side effect!
    return x + counter

impure_function(5)  # Returns 6, counter = 1
impure_function(5)  # Returns 6 (cached), counter still 1
impure_function(5)  # Returns 6, but we expected 8

The cached result doesn’t reflect the side effect that should have occurred. Only memoize pure functions.

Mutable arguments cause subtle bugs:

@cache
def process_list(items: tuple):  # Must use tuple, not list
    return sum(items)

# This fails:
# @cache
# def broken(items: list):  # Lists aren't hashable
#     return sum(items)

Python’s @cache requires hashable arguments, which catches some issues up front. In JavaScript, the failure mode depends on how keys are generated. Many memoizers (lodash’s _.memoize, for instance) use the first argument itself as the Map key, so object identity is what matters:

// Assume a memoizer that keys its Map on the argument itself
const memoizedFn = memoize((obj) => obj.value * 2);

const config = { value: 5 };
memoizedFn(config);  // Computes, returns 10

config.value = 10;
memoizedFn(config);  // Same object reference → returns 10 (cached!), wrong!

The cache key is the object reference, not its contents, so mutating the object later doesn’t invalidate the stale entry. (The JSON.stringify version above fails differently: mutation changes the serialized key, forcing a recompute and leaving the old entry orphaned in the cache.)

Finally, don’t over-memoize. Caching has overhead—key generation, cache lookups, memory allocation. For cheap functions called with many unique arguments, memoization makes performance worse.
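You can see the overhead directly by memoizing a trivial function and calling it with all-unique arguments, so every call is a miss. A rough measurement sketch—exact timings vary by machine, so none are asserted here:

```python
import timeit
from functools import lru_cache

def cheap(x):
    return x + 1

cheap_cached = lru_cache(maxsize=None)(cheap)

n = 100_000
# All-unique arguments: every call is a cache miss, so the cache never
# pays back its key-hashing, lookup, insertion, and memory costs.
plain_time = timeit.timeit(lambda: [cheap(i) for i in range(n)], number=1)
cached_time = timeit.timeit(lambda: [cheap_cached(i) for i in range(n)], number=1)

info = cheap_cached.cache_info()
print(f"plain:  {plain_time:.4f}s")
print(f"cached: {cached_time:.4f}s  ({info.hits} hits, {info.misses} misses)")
```

Zero hits out of 100,000 calls means pure overhead. This is the profile-first argument in miniature: a hit rate near zero is a sign the cache should go.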

Real-World Applications

Dynamic programming problems are natural fits. The coin change problem demonstrates this:

from functools import cache

@cache
def min_coins(amount, coins):
    if amount == 0:
        return 0
    if amount < 0:
        return float('inf')
    
    return 1 + min(
        min_coins(amount - coin, coins) 
        for coin in coins
    )

# Without memoization: exponential time
# With memoization: O(amount * len(coins))
coins = (1, 5, 10, 25)
print(min_coins(67, coins))  # 6 coins: 25+25+10+5+1+1

API response caching reduces load and latency:

import requests
from functools import lru_cache

@lru_cache(maxsize=1000)
def get_product_details(product_id: str) -> dict:
    response = requests.get(f"{API_BASE}/products/{product_id}")
    return response.json()

React optimization prevents expensive re-renders:

const ChartComponent = React.memo(({ data, options }) => {
  const processedData = useMemo(() => 
    transformDataForChart(data),  // Expensive transformation
    [data]
  );
  
  return <Chart data={processedData} options={options} />;
});

Memoization is a powerful tool, but it’s not magic. Use it deliberately for pure functions with expensive computations and repeated calls. Always bound your caches. And when in doubt, profile first—measure whether memoization actually helps before adding complexity to your codebase.
