Design a DNS System: Hierarchical Name Resolution

DNS is the internet's phone book, but calling it that undersells the engineering. It's a globally distributed hierarchical database that handles trillions of queries daily, with no single point of...

Key Insights

  • DNS uses a hierarchical tree structure with delegated authority at each level, enabling distributed management of billions of domain names without centralized coordination
  • The caching layer is what makes DNS practical—without TTL-based caching at every level, root servers would be crushed under query load
  • Recursive resolvers do the heavy lifting for clients, walking the hierarchy and caching results, while authoritative servers only answer for their specific zones

Introduction to DNS Architecture

DNS is the internet’s phone book, but calling it that undersells the engineering. It’s a globally distributed hierarchical database that handles trillions of queries daily, with no single point of failure and no central authority managing it all.

The problem DNS solves is simple: humans remember names, computers need IP addresses. But the solution had to scale to billions of devices and millions of domain names, handle constant updates, and remain resilient to failures. A flat lookup table was never going to work.

The hierarchical design is what makes this possible. Instead of one massive database, DNS splits responsibility across thousands of servers organized in a tree. Each level only needs to know about the next level down, and authority is delegated at each branch. This is the architectural pattern that lets DNS scale.

The DNS Hierarchy Structure

The DNS hierarchy forms an inverted tree with the root at the top. Every domain name is a path through this tree, read right to left: www.example.com means start at root, go to com, then example, then www.

Root Servers sit at the top. There are 13 root server addresses (A through M), but each address maps to hundreds of physical servers worldwide via anycast. Root servers don’t know where www.example.com lives—they only know which servers handle .com.

TLD (Top-Level Domain) Servers manage domains like .com, .org, .io. The .com TLD servers don’t know individual IP addresses either. They know which nameservers are authoritative for example.com.

Authoritative Servers are the final authority for a zone. The nameserver for example.com knows the actual IP address for www.example.com. This is where the buck stops.

Recursive Resolvers (like 8.8.8.8 or your ISP’s resolver) do the work of walking this hierarchy on behalf of clients. Your laptop asks the resolver once; the resolver handles the rest.

Here’s how we’d model these structures:

from dataclasses import dataclass
from enum import Enum
from typing import Optional
import time

class RecordType(Enum):
    A = "A"           # IPv4 address
    AAAA = "AAAA"     # IPv6 address
    CNAME = "CNAME"   # Canonical name (alias)
    NS = "NS"         # Nameserver delegation
    MX = "MX"         # Mail exchange
    TXT = "TXT"       # Text record
    SOA = "SOA"       # Start of authority

@dataclass
class DNSRecord:
    name: str
    record_type: RecordType
    value: str
    ttl: int  # Time to live in seconds
    priority: Optional[int] = None  # For MX records
    
    def is_expired(self, cached_at: float) -> bool:
        return time.time() - cached_at > self.ttl

@dataclass
class Zone:
    name: str  # e.g., "example.com"
    records: dict[str, list[DNSRecord]]
    ns_records: list[str]  # Authoritative nameservers for this zone
    
    def get_records(self, name: str, record_type: RecordType) -> list[DNSRecord]:
        key = f"{name}:{record_type.value}"
        return self.records.get(key, [])
    
    def add_record(self, record: DNSRecord):
        key = f"{record.name}:{record.record_type.value}"
        if key not in self.records:
            self.records[key] = []
        self.records[key].append(record)

Zone delegation is the key concept here. When a TLD server returns NS records for example.com, it’s saying “I don’t have the answer, but these servers do.” Authority is explicitly handed off at each level.

Resolution Process: Recursive vs Iterative Queries

When you type www.example.com in your browser, your operating system sends a query to your configured resolver. What happens next depends on whether the resolver uses recursive or iterative resolution.

Recursive resolution means the resolver does all the work. It queries root servers, follows referrals to TLD servers, then to authoritative servers, and only returns to the client with the final answer. This is what most public resolvers do.

Iterative resolution means each server returns a referral, and the client follows it. The root server says “ask the .com servers,” the .com server says “ask ns1.example.com,” and so on. The client does the walking.

In practice, your laptop uses recursive queries to a resolver, and that resolver uses iterative queries to walk the hierarchy.

class RecursiveResolver:
    def __init__(self, root_servers: list[str]):
        self.root_servers = root_servers
        self.cache = DNSCache(max_size=10000)
    
    def resolve(self, name: str, record_type: RecordType) -> list[DNSRecord]:
        # Check cache first
        cached = self.cache.get(name, record_type)
        if cached:
            return cached
        
        # Start resolution from root
        records = self._resolve_iteratively(name, record_type, self.root_servers)
        
        # Cache the result
        if records:
            self.cache.put(name, record_type, records)
        
        return records
    
    def _resolve_iteratively(
        self, 
        name: str, 
        record_type: RecordType, 
        nameservers: list[str]
    ) -> list[DNSRecord]:
        current_servers = nameservers
        
        while current_servers:
            # Query one of the current nameservers
            response = self._query_server(current_servers[0], name, record_type)
            
            if response.is_authoritative and response.answers:
                # Got the final answer
                return response.answers
            
            if response.is_authoritative and not response.answers:
                # Authoritative NXDOMAIN - name doesn't exist
                return []
            
            if response.referrals:
                # Got a referral - follow it
                # First resolve the NS record IPs if needed
                current_servers = self._resolve_ns_addresses(response.referrals)
            else:
                # No answer and no referral - something's wrong
                break
        
        return []
    
    def _query_server(self, server: str, name: str, record_type: RecordType):
        # In reality, this sends a UDP packet to port 53
        # Returns a DNSResponse with answers, referrals, and flags
        pass
    
    def _resolve_ns_addresses(self, ns_records: list[DNSRecord]) -> list[str]:
        # Recursively resolve NS hostnames to IP addresses
        addresses = []
        for ns in ns_records:
            a_records = self.resolve(ns.value, RecordType.A)
            addresses.extend(r.value for r in a_records)
        return addresses

The recursive resolver is doing significant work here. A single client query might trigger 4-8 upstream queries as the resolver walks the tree. This is why caching is critical.

Caching Layer Design

Without caching, DNS would collapse. Every query would hit root servers, and there are only 13 addresses serving the entire internet.

Each DNS record includes a TTL (Time To Live) set by the authoritative server. A TTL of 3600 means “cache this for one hour.” When the TTL expires, the resolver must re-query.

Caching happens at multiple levels:

  • Browser cache (Chrome caches DNS for ~60 seconds)
  • OS resolver cache
  • Recursive resolver cache (the big one)
  • Sometimes even at authoritative servers for computed responses

Negative caching is equally important. When a domain doesn’t exist (NXDOMAIN), that response is also cached. Without this, typos would hammer authoritative servers repeatedly.

from collections import OrderedDict
from threading import Lock
from typing import Optional
import time

@dataclass
class CacheEntry:
    records: list[DNSRecord]
    cached_at: float
    ttl: int
    
    def is_expired(self) -> bool:
        return time.time() - self.cached_at > self.ttl

class DNSCache:
    def __init__(self, max_size: int = 10000, negative_ttl: int = 300):
        self.max_size = max_size
        self.negative_ttl = negative_ttl  # TTL for NXDOMAIN responses
        self.cache: OrderedDict[str, CacheEntry] = OrderedDict()
        self.lock = Lock()
    
    def _make_key(self, name: str, record_type: RecordType) -> str:
        return f"{name.lower()}:{record_type.value}"
    
    def get(self, name: str, record_type: RecordType) -> Optional[list[DNSRecord]]:
        key = self._make_key(name, record_type)
        
        with self.lock:
            if key not in self.cache:
                return None
            
            entry = self.cache[key]
            
            if entry.is_expired():
                del self.cache[key]
                return None
            
            # Move to end (most recently used)
            self.cache.move_to_end(key)
            return entry.records
    
    def put(
        self, 
        name: str, 
        record_type: RecordType, 
        records: list[DNSRecord]
    ):
        key = self._make_key(name, record_type)
        
        # Use minimum TTL from records, or negative_ttl for empty results
        if records:
            ttl = min(r.ttl for r in records)
        else:
            ttl = self.negative_ttl
        
        with self.lock:
            # Evict oldest if at capacity
            while len(self.cache) >= self.max_size:
                self.cache.popitem(last=False)
            
            self.cache[key] = CacheEntry(
                records=records,
                cached_at=time.time(),
                ttl=ttl
            )

Cache invalidation is DNS’s hard problem. If you change your DNS records, users might see stale data until TTLs expire everywhere. This is why you lower TTLs before migrations and why some operators use very short TTLs (30-60 seconds) despite the performance cost.

Handling Scale and Reliability

DNS must be always available. A DNS outage means the entire internet appears down, even if every website is running fine.

Anycast routing is the primary scaling mechanism for root and TLD servers. The same IP address is announced from multiple physical locations. When you query a root server, BGP routing sends you to the nearest instance. The 13 root server addresses actually represent over 1,500 physical servers globally.

Replication is mandatory for authoritative servers. Every zone should have at least two nameservers on different networks. Zone transfers (AXFR/IXFR) keep replicas synchronized.

Health checking and failover ensure queries go to working servers:

import random
import time
from threading import Thread
from dataclasses import dataclass

@dataclass
class ServerHealth:
    address: str
    healthy: bool = True
    last_check: float = 0
    consecutive_failures: int = 0

class HealthCheckedServerPool:
    def __init__(
        self, 
        servers: list[str], 
        check_interval: int = 10,
        failure_threshold: int = 3
    ):
        self.servers = {s: ServerHealth(address=s) for s in servers}
        self.check_interval = check_interval
        self.failure_threshold = failure_threshold
        self._start_health_checker()
    
    def get_server(self) -> Optional[str]:
        """Return a healthy server using weighted random selection."""
        healthy_servers = [
            s for s in self.servers.values() 
            if s.healthy
        ]
        
        if not healthy_servers:
            # Fallback: try any server
            return random.choice(list(self.servers.keys()))
        
        return random.choice(healthy_servers).address
    
    def report_failure(self, address: str):
        """Called when a query to this server fails."""
        if address in self.servers:
            server = self.servers[address]
            server.consecutive_failures += 1
            if server.consecutive_failures >= self.failure_threshold:
                server.healthy = False
    
    def report_success(self, address: str):
        """Called when a query succeeds."""
        if address in self.servers:
            server = self.servers[address]
            server.consecutive_failures = 0
            server.healthy = True
    
    def _health_check(self, server: ServerHealth):
        """Send a simple DNS query to check server health."""
        try:
            # Query for root (.) with type NS - every DNS server should answer
            response = self._send_probe(server.address)
            if response:
                server.healthy = True
                server.consecutive_failures = 0
            else:
                raise Exception("No response")
        except Exception:
            server.consecutive_failures += 1
            if server.consecutive_failures >= self.failure_threshold:
                server.healthy = False
        
        server.last_check = time.time()
    
    def _start_health_checker(self):
        def check_loop():
            while True:
                for server in self.servers.values():
                    self._health_check(server)
                time.sleep(self.check_interval)
        
        thread = Thread(target=check_loop, daemon=True)
        thread.start()

Security Considerations

DNS was designed in a more trusting era. The original protocol has no authentication—any response claiming to be authoritative is trusted.

Cache poisoning exploits this. An attacker races to send a forged response before the legitimate one arrives. If successful, the resolver caches the attacker’s IP address for the domain.

DNSSEC adds cryptographic signatures to DNS responses. Each zone signs its records, and resolvers can verify the signature chain back to a trust anchor. It’s effective but complex to deploy and doesn’t encrypt queries.

DNS over HTTPS (DoH) and DNS over TLS (DoT) encrypt queries between clients and resolvers, preventing eavesdropping. They don’t help with the resolver-to-authoritative path, but they protect user privacy from network observers.

System Design Summary

Building a DNS system requires balancing several architectural decisions:

Hierarchy depth: The three-tier model (root → TLD → authoritative) works because it matches administrative boundaries. Deeper hierarchies add latency; shallower ones concentrate load.

Caching strategy: Aggressive caching with TTL-based expiration. Negative caching for NXDOMAIN. LRU eviction when memory-constrained.

Replication factor: Minimum two authoritative servers per zone, geographically distributed. More for high-traffic zones.

class DNSResolver:
    """Complete recursive resolver implementation."""
    
    def __init__(self, config: dict):
        self.cache = DNSCache(
            max_size=config.get("cache_size", 10000),
            negative_ttl=config.get("negative_ttl", 300)
        )
        self.root_pool = HealthCheckedServerPool(
            servers=config.get("root_servers", ROOT_SERVERS),
            check_interval=30
        )
        self.query_timeout = config.get("timeout", 2.0)
        self.max_recursion = config.get("max_recursion", 10)
    
    def query(self, name: str, record_type: RecordType) -> list[DNSRecord]:
        # Normalize input
        name = name.lower().rstrip(".")
        
        # Check cache
        cached = self.cache.get(name, record_type)
        if cached is not None:
            return cached
        
        # Resolve recursively
        try:
            records = self._recursive_resolve(name, record_type, depth=0)
            self.cache.put(name, record_type, records)
            return records
        except Exception as e:
            # Return SERVFAIL equivalent
            return []
    
    def _recursive_resolve(
        self, 
        name: str, 
        record_type: RecordType, 
        depth: int
    ) -> list[DNSRecord]:
        if depth > self.max_recursion:
            raise RecursionLimitExceeded()
        
        # Start from root
        nameservers = [self.root_pool.get_server()]
        
        # Walk the hierarchy
        while nameservers:
            server = nameservers[0]
            response = self._send_query(server, name, record_type)
            
            if response.authoritative:
                return response.answers
            
            if response.referrals:
                # Resolve referral NS addresses
                ns_ips = []
                for ns in response.referrals:
                    ips = self._recursive_resolve(
                        ns.value, RecordType.A, depth + 1
                    )
                    ns_ips.extend(r.value for r in ips)
                nameservers = ns_ips
            else:
                break
        
        return []

The beauty of DNS is that this hierarchical, cached, replicated design has scaled from a few thousand hosts to billions of devices over 40 years. The core architecture remains sound—we’ve just added more servers, better caching, and security layers on top.

Liked this? There's more.

Every week: one practical technique, explained simply, with code you can use immediately.