Rust State Machine Pattern: Typestate Programming

Key Insights

Typestate programming uses Rust’s type system to encode state machines at compile time, making invalid state transitions impossible rather than just catching them at runtime
The pattern relies on zero-sized type markers and consuming self in transition methods to enforce linear progression through states without runtime overhead
While powerful for API builders and protocol implementations, typestate adds complexity that’s only justified when preventing invalid states is critical to correctness

Introduction to Typestate Programming

Most developers model state machines using enums and runtime checks. You’ve probably written code like this:

enum TrafficLight {
    Red,
    Yellow,
    Green,
}

impl TrafficLight {
    fn next(&mut self) {
        *self = match self {
            TrafficLight::Red => TrafficLight::Green,
            TrafficLight::Green => TrafficLight::Yellow,
            TrafficLight::Yellow => TrafficLight::Red,
        }
    }
}

This works, but it pushes validation to runtime. Nothing stops a match arm from mapping Red to Yellow by mistake: the compiler accepts it, and you won’t know until the code executes or a test catches it.

Typestate programming takes a different approach: encode states as distinct types. Invalid transitions become compile errors:

use std::marker::PhantomData;

// Zero-sized marker types, one per state
struct Red;
struct Yellow;
struct Green;

struct TrafficLight<S> {
    _state: PhantomData<S>,
}

impl TrafficLight<Red> {
    fn next(self) -> TrafficLight<Green> {
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Green> {
    fn next(self) -> TrafficLight<Yellow> {
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Yellow> {
    fn next(self) -> TrafficLight<Red> {
        TrafficLight { _state: PhantomData }
    }
}

Now TrafficLight<Red> can only transition to TrafficLight<Green>. Try to go from red to yellow, and your code won’t compile. This is typestate: using types to represent states and the type system to enforce valid transitions.
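The claim that the markers cost nothing at runtime can be checked directly. The following is a minimal sketch rather than part of the original example: the new() constructor and the light_sizes() helper are hypothetical additions, and std::mem::size_of is used to report how much memory a light occupies in two of its states.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

struct Red;
struct Green;

struct TrafficLight<S> {
    _state: PhantomData<S>,
}

impl TrafficLight<Red> {
    // Hypothetical constructor; the original snippet does not define one.
    fn new() -> Self {
        TrafficLight { _state: PhantomData }
    }

    // Consumes the red light and returns a green one.
    fn next(self) -> TrafficLight<Green> {
        TrafficLight { _state: PhantomData }
    }
}

/// Returns the in-memory size of a light in the Red and Green states.
fn light_sizes() -> (usize, usize) {
    (
        size_of::<TrafficLight<Red>>(),
        size_of::<TrafficLight<Green>>(),
    )
}
```

Both sizes are zero because PhantomData<S> is itself zero-sized and the struct has no other fields. The state exists only in the type, so a transition such as TrafficLight::new().next() has nothing to copy at runtime.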

Basic Typestate Implementation

The core mechanics are straightforward:

Define zero-sized types for each state
Make your main type generic over state
Consume self in transition methods to prevent reuse
Return a new instance in the target state

Here’s a TCP connection model:

use std::marker::PhantomData;

// State markers
struct Closed;
struct Listening;
struct Established;

struct TcpConnection<S> {
    address: String,
    _state: PhantomData<S>,
}

impl TcpConnection<Closed> {
    fn new(address: String) -> Self {
        TcpConnection {
            address,
            _state: PhantomData,
        }
    }

    fn listen(self) -> TcpConnection<Listening> {
        println!("Binding to {}", self.address);
        TcpConnection {
            address: self.address,
            _state: PhantomData,
        }
    }
}

impl TcpConnection<Listening> {
    fn accept(self) -> TcpConnection<Established> {
        println!("Accepting connection on {}", self.address);
        TcpConnection {
            address: self.address,
            _state: PhantomData,
        }
    }
}

impl TcpConnection<Established> {
    fn send(&self, data: &[u8]) {
        println!("Sending {} bytes", data.len());
    }

    fn close(self) -> TcpConnection<Closed> {
        println!("Closing connection");
        TcpConnection {
            address: self.address,
            _state: PhantomData,
        }
    }
}

Usage enforces the state machine:

let conn = TcpConnection::new("127.0.0.1:8080".to_string());
let conn = conn.listen();
let conn = conn.accept();
conn.send(b"Hello");

// This won't compile - can't send on a listening connection:
// let conn = TcpConnection::new("127.0.0.1:8080".to_string());
// let conn = conn.listen();
// conn.send(b"Hello"); // ERROR: method not found

The compiler prevents you from calling send() on anything except TcpConnection<Established>. No runtime checks needed.
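Every transition above rebuilds the struct field by field, and none of the methods are available in all states. One possible way to reduce that repetition, sketched here under the assumption that the type lives in its own module, is a generic impl block. The address() accessor, the private transition() helper, and the listening_address() function are hypothetical names that do not appear in the original example.

```rust
use std::marker::PhantomData;

struct Closed;
struct Listening;

struct TcpConnection<S> {
    address: String,
    _state: PhantomData<S>,
}

// Methods in this block are available in every state.
impl<S> TcpConnection<S> {
    fn address(&self) -> &str {
        &self.address
    }

    // Hypothetical helper: moves the fields into a connection in state T.
    // It must stay private, otherwise callers could jump to any state.
    fn transition<T>(self) -> TcpConnection<T> {
        TcpConnection {
            address: self.address,
            _state: PhantomData,
        }
    }
}

impl TcpConnection<Closed> {
    fn new(address: String) -> Self {
        TcpConnection {
            address,
            _state: PhantomData,
        }
    }

    // The return type selects the target state for transition().
    fn listen(self) -> TcpConnection<Listening> {
        self.transition()
    }
}

/// Builds a connection, starts listening, and reads the shared accessor.
fn listening_address() -> String {
    let conn = TcpConnection::new("127.0.0.1:8080".to_string()).listen();
    conn.address().to_string()
}
```

The public methods still spell out each legal transition in their signatures; the helper only removes the repeated struct literal.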

Advanced Typestate Techniques

Real applications need state-specific data and behavior. Here’s a document workflow in which the state parameter is itself a struct holding the fields that exist only in that state:

// Each state is a struct that carries its own data, so no separate
// zero-sized markers are needed here.

struct Document<S> {
    content: String,
    state_data: S,
}

struct DraftData {
    author: String,
    last_saved: std::time::SystemTime,
}

struct ReviewData {
    author: String,
    reviewer: String,
    comments: Vec<String>,
}

struct PublishedData {
    author: String,
    published_date: std::time::SystemTime,
    url: String,
}

impl Document<DraftData> {
    fn new(author: String, content: String) -> Self {
        Document {
            content,
            state_data: DraftData {
                author,
                last_saved: std::time::SystemTime::now(),
            },
        }
    }

    fn edit(&mut self, new_content: String) {
        self.content = new_content;
        self.state_data.last_saved = std::time::SystemTime::now();
    }

    fn submit_for_review(self, reviewer: String) -> Document<ReviewData> {
        Document {
            content: self.content,
            state_data: ReviewData {
                author: self.state_data.author,
                reviewer,
                comments: Vec::new(),
            },
        }
    }
}

impl Document<ReviewData> {
    fn add_comment(&mut self, comment: String) {
        self.state_data.comments.push(comment);
    }

    fn approve(self) -> Document<PublishedData> {
        Document {
            content: self.content,
            state_data: PublishedData {
                author: self.state_data.author,
                published_date: std::time::SystemTime::now(),
                // Assumes the external `uuid` crate with its `v4` feature enabled.
                url: format!("/docs/{}", uuid::Uuid::new_v4()),
            },
        }
    }

    fn reject(self) -> Document<DraftData> {
        Document {
            content: self.content,
            state_data: DraftData {
                author: self.state_data.author,
                last_saved: std::time::SystemTime::now(),
            },
        }
    }
}

impl Document<PublishedData> {
    fn get_url(&self) -> &str {
        &self.state_data.url
    }
}

Each state carries different data. You can’t call get_url() on a draft, and you can’t edit a published document. The type system enforces your business rules.
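The same guarantee extends to free functions, which can name the state they require in their signature. The sketch below uses trimmed stand-ins for the document types; the publish() shortcut and the word_count(), share_link(), and publish_demo() functions are hypothetical and exist only to keep the example short.

```rust
// Trimmed stand-ins for the state types above (most fields omitted).
struct DraftData;
struct PublishedData {
    url: String,
}

struct Document<S> {
    content: String,
    state_data: S,
}

impl Document<DraftData> {
    fn new(content: String) -> Self {
        Document {
            content,
            state_data: DraftData,
        }
    }

    // Hypothetical shortcut that skips the review state for brevity.
    fn publish(self, url: String) -> Document<PublishedData> {
        Document {
            content: self.content,
            state_data: PublishedData { url },
        }
    }
}

// Generic over S: works for a document in any state.
fn word_count<S>(doc: &Document<S>) -> usize {
    doc.content.split_whitespace().count()
}

// Accepts published documents only; passing a draft is a type error.
fn share_link(doc: &Document<PublishedData>) -> String {
    format!("Read it at {}", doc.state_data.url)
}

/// Counts words on a draft, publishes it, then builds a share link.
fn publish_demo() -> (usize, String) {
    let draft = Document::new("hello typestate world".to_string());
    let words = word_count(&draft);
    let published = draft.publish("/docs/intro".to_string());
    (words, share_link(&published))
}
```

Calling share_link(&draft) before publish() would be rejected by the compiler, because Document<DraftData> and Document<PublishedData> are unrelated types.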

Handling State Data and Transitions

Fallible transitions are common in real systems. Here’s a file upload handler with validation:

// The state types below carry data themselves, so separate zero-sized
// markers are not needed.

struct FileUpload<S> {
    filename: String,
    data: Vec<u8>,
    state: S,
}

struct PendingState;

struct ValidatedState {
    mime_type: String,
    size: usize,
}

struct ProcessingState {
    mime_type: String,
    progress: f32,
}

struct CompleteState {
    storage_path: String,
    checksum: String,
}

impl FileUpload<PendingState> {
    fn new(filename: String, data: Vec<u8>) -> Self {
        FileUpload {
            filename,
            data,
            state: PendingState,
        }
    }

    fn validate(self) -> Result<FileUpload<ValidatedState>, String> {
        if self.data.is_empty() {
            return Err("File is empty".to_string());
        }

        let mime_type = if self.filename.ends_with(".txt") {
            "text/plain"
        } else if self.filename.ends_with(".jpg") {
            "image/jpeg"
        } else {
            return Err("Unsupported file type".to_string());
        };

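        // Read the length before `self.data` is moved into the new value;
        // calling `self.data.len()` after the move would not compile.
        let size = self.data.len();

        Ok(FileUpload {
            filename: self.filename,
            data: self.data,
            state: ValidatedState {
                mime_type: mime_type.to_string(),
                size,
            },
        })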
    }
}

impl FileUpload<ValidatedState> {
    fn start_processing(self) -> FileUpload<ProcessingState> {
        FileUpload {
            filename: self.filename,
            data: self.data,
            state: ProcessingState {
                mime_type: self.state.mime_type,
                progress: 0.0,
            },
        }
    }
}

impl FileUpload<ProcessingState> {
    fn complete(self, storage_path: String) -> FileUpload<CompleteState> {
        // Assumes the external `md5` crate as a dependency.
        let checksum = format!("{:x}", md5::compute(&self.data));
        FileUpload {
            filename: self.filename,
            data: self.data,
            state: CompleteState {
                storage_path,
                checksum,
            },
        }
    }
}

impl FileUpload<CompleteState> {
    fn get_checksum(&self) -> &str {
        &self.state.checksum
    }
}

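Notice how validate() returns a Result. Fallible transitions work naturally with typestate: you either get the new state or an error. Because validate() takes self by value, the pending upload is consumed either way, so a failed validation here also drops the original data.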
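When the caller should be able to retry after a failed transition, one option is to hand the original value back inside the error. This is a sketch under stated assumptions, not the only way to do it: ValidationError and validate_with_fallback() are hypothetical names, and the state types are trimmed to the fields the example needs.

```rust
// Trimmed stand-ins for the upload state types above.
struct PendingState;
struct ValidatedState {
    size: usize,
}

struct FileUpload<S> {
    filename: String,
    data: Vec<u8>,
    state: S,
}

// Hypothetical error type: carries the untouched upload back to the caller.
struct ValidationError {
    reason: String,
    upload: FileUpload<PendingState>,
}

impl FileUpload<PendingState> {
    fn new(filename: String, data: Vec<u8>) -> Self {
        FileUpload {
            filename,
            data,
            state: PendingState,
        }
    }

    fn validate(self) -> Result<FileUpload<ValidatedState>, ValidationError> {
        if self.data.is_empty() {
            // Return ownership of `self` instead of dropping it.
            return Err(ValidationError {
                reason: "File is empty".to_string(),
                upload: self,
            });
        }
        let size = self.data.len();
        Ok(FileUpload {
            filename: self.filename,
            data: self.data,
            state: ValidatedState { size },
        })
    }
}

/// Validates an upload; on failure, retries once with placeholder bytes.
fn validate_with_fallback(upload: FileUpload<PendingState>) -> Option<usize> {
    match upload.validate() {
        Ok(valid) => Some(valid.state.size),
        Err(err) => {
            eprintln!("validation failed: {}", err.reason);
            // The pending upload is recovered from the error value.
            let mut retry = err.upload;
            retry.data = b"placeholder".to_vec();
            retry.validate().ok().map(|valid| valid.state.size)
        }
    }
}
```

The trade-off is a larger error type. If the caller never needs the original value back, returning a plain error and dropping self is simpler.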

Real-World Application: API Client Builder

The builder pattern is where typestate truly shines. Here’s an HTTP client that requires authentication and a base URL before making requests:

use std::marker::PhantomData;

struct NoAuth;
struct WithAuth;
struct NoBaseUrl;
struct WithBaseUrl;

struct ApiClient<A, B> {
    auth_token: Option<String>,
    base_url: Option<String>,
    timeout: u64,
    _auth: PhantomData<A>,
    _base: PhantomData<B>,
}

impl ApiClient<NoAuth, NoBaseUrl> {
    fn new() -> Self {
        ApiClient {
            auth_token: None,
            base_url: None,
            timeout: 30,
            _auth: PhantomData,
            _base: PhantomData,
        }
    }
}

impl<B> ApiClient<NoAuth, B> {
    fn with_auth(self, token: String) -> ApiClient<WithAuth, B> {
        ApiClient {
            auth_token: Some(token),
            base_url: self.base_url,
            timeout: self.timeout,
            _auth: PhantomData,
            _base: PhantomData,
        }
    }
}

impl<A> ApiClient<A, NoBaseUrl> {
    fn with_base_url(self, url: String) -> ApiClient<A, WithBaseUrl> {
        ApiClient {
            auth_token: self.auth_token,
            base_url: Some(url),
            timeout: self.timeout,
            _auth: PhantomData,
            _base: PhantomData,
        }
    }
}

impl<A, B> ApiClient<A, B> {
    fn with_timeout(mut self, timeout: u64) -> Self {
        self.timeout = timeout;
        self
    }
}

// Only fully configured clients can make requests
impl ApiClient<WithAuth, WithBaseUrl> {
    fn get(&self, path: &str) -> String {
        format!(
            "GET {}{} with auth token {}",
            self.base_url.as_ref().unwrap(),
            path,
            self.auth_token.as_ref().unwrap()
        )
    }
}

// Usage:
fn main() {
    let client = ApiClient::new()
        .with_auth("secret-token".to_string())
        .with_base_url("https://api.example.com".to_string())
        .with_timeout(60);

    let response = client.get("/users");
    println!("{}", response);

    // This won't compile - missing auth:
    // let client = ApiClient::new().with_base_url("https://api.example.com".to_string());
    // client.get("/users"); // ERROR
}

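A client that lacks either the auth token or the base URL has no get() method, so a request on a misconfigured client cannot compile. The unwrap() calls inside get() cannot panic as long as with_auth() and with_base_url() remain the only ways to reach ApiClient<WithAuth, WithBaseUrl>. This eliminates an entire class of runtime errors.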
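The Option fields and unwrap() calls can be avoided by letting the state types hold the configured values themselves. The following is a hypothetical variation on the builder, not the version shown above: WithAuth and WithBaseUrl become tuple structs, the PhantomData fields disappear, and example_request() is an illustrative helper.

```rust
// Hypothetical variation: the state types store the configured values,
// so no Option, PhantomData, or unwrap() is needed.
struct NoAuth;
struct WithAuth(String);
struct NoBaseUrl;
struct WithBaseUrl(String);

struct ApiClient<A, B> {
    auth: A,
    base_url: B,
    timeout: u64,
}

impl ApiClient<NoAuth, NoBaseUrl> {
    fn new() -> Self {
        ApiClient {
            auth: NoAuth,
            base_url: NoBaseUrl,
            timeout: 30,
        }
    }
}

impl<B> ApiClient<NoAuth, B> {
    fn with_auth(self, token: String) -> ApiClient<WithAuth, B> {
        ApiClient {
            auth: WithAuth(token),
            base_url: self.base_url,
            timeout: self.timeout,
        }
    }
}

impl<A> ApiClient<A, NoBaseUrl> {
    fn with_base_url(self, url: String) -> ApiClient<A, WithBaseUrl> {
        ApiClient {
            auth: self.auth,
            base_url: WithBaseUrl(url),
            timeout: self.timeout,
        }
    }
}

// Only fully configured clients have get(); both values are plain Strings.
impl ApiClient<WithAuth, WithBaseUrl> {
    fn get(&self, path: &str) -> String {
        format!(
            "GET {}{} with auth token {}",
            self.base_url.0, path, self.auth.0
        )
    }
}

/// Configures the client in the opposite order to show order independence.
fn example_request() -> String {
    ApiClient::new()
        .with_base_url("https://api.example.com".to_string())
        .with_auth("secret-token".to_string())
        .get("/users")
}
```

With this layout the types describe the stored data as well as the permitted methods, at the cost of state types that are no longer zero-sized.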

Trade-offs and Limitations

Typestate isn’t always the answer. The pattern has real costs:

Compilation time: Generic methods are monomorphized once for each state combination that is actually used. A type with three independent two-valued state parameters has eight possible combinations, and each added parameter doubles that number.

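Ergonomics: Error messages can be cryptic. A skipped configuration step surfaces as a “method not found” error on a type such as ApiClient<NoAuth, WithBaseUrl>, and users need documentation to work out which builder call they missed.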

Complexity: Simple state machines don’t justify typestate overhead. Here’s when a basic enum is better:

// Don't use typestate for this:
enum ConnectionState {
    Idle,
    Connecting,
    Connected,
    Disconnected,
}

struct SimpleConnection {
    state: ConnectionState,
}

impl SimpleConnection {
    fn status(&self) -> &str {
        match self.state {
            ConnectionState::Idle => "idle",
            ConnectionState::Connecting => "connecting",
            ConnectionState::Connected => "connected",
            ConnectionState::Disconnected => "disconnected",
        }
    }
}

If you need to query state or support multiple valid transitions from any state, enums are simpler and more flexible.
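The two approaches can also be combined when a value must be stored or inspected at runtime but its transitions should still be type-checked. The sketch below is one possible arrangement under stated assumptions: the two-state Connection type, the AnyConnection wrapper, and status_after_toggle() are hypothetical and reduced from the four states above.

```rust
use std::marker::PhantomData;

struct Idle;
struct Connected;

struct Connection<S> {
    _state: PhantomData<S>,
}

impl Connection<Idle> {
    fn new() -> Self {
        Connection { _state: PhantomData }
    }

    fn connect(self) -> Connection<Connected> {
        Connection { _state: PhantomData }
    }
}

impl Connection<Connected> {
    fn disconnect(self) -> Connection<Idle> {
        Connection { _state: PhantomData }
    }
}

// Hypothetical wrapper: erases the type parameter so the current state
// can live in a struct field or collection and be queried at runtime.
enum AnyConnection {
    Idle(Connection<Idle>),
    Connected(Connection<Connected>),
}

impl AnyConnection {
    fn status(&self) -> &'static str {
        match self {
            AnyConnection::Idle(_) => "idle",
            AnyConnection::Connected(_) => "connected",
        }
    }

    // Each arm still goes through the typed transition methods.
    fn toggle(self) -> AnyConnection {
        match self {
            AnyConnection::Idle(c) => AnyConnection::Connected(c.connect()),
            AnyConnection::Connected(c) => AnyConnection::Idle(c.disconnect()),
        }
    }
}

/// Reports the status before and after a single toggle.
fn status_after_toggle() -> (&'static str, &'static str) {
    let conn = AnyConnection::Idle(Connection::new());
    let before = conn.status();
    let conn = conn.toggle();
    (before, conn.status())
}
```

The wrapper brings back a runtime match, so it is only worth adding at the boundary where the state is not known statically, such as a connection pool.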

Use typestate when:

Invalid states represent serious bugs or security issues
The API is complex enough that users will make mistakes
You’re building a library where compile-time guarantees add value
State transitions are linear or tree-like (not a complex graph)

Avoid typestate when:

States change frequently during development
You need runtime state inspection
The state machine has many states with complex transitions
Ergonomics matter more than correctness guarantees

Conclusion and Best Practices

Typestate programming transforms runtime invariants into compile-time guarantees. It’s not about being clever; it’s about making entire categories of bugs impossible.

Key patterns to remember:

Start simple: Begin with zero-sized state markers and PhantomData. Add complexity only when needed.

Consume self: Always take self by value in transitions to prevent use-after-transition bugs.

Use Result for fallible transitions: Don’t panic in state transitions. Return Result and let callers handle errors.

Document state requirements: Users need to understand what each state means and what transitions are valid.

Combine with other patterns: Typestate works well with builders, the newtype pattern, and sealed traits for restricting implementations.
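As an illustration of the sealed-trait combination, the sketch below bounds the state parameter by a trait that downstream crates cannot implement. All names here (the private module, State, Door, Open, Closed, door_names()) are hypothetical and chosen only for the example.

```rust
use std::marker::PhantomData;

// `Sealed` is public but sits in a private module, so code outside this
// crate cannot name it and therefore cannot implement `State`.
mod private {
    pub trait Sealed {}
}

pub trait State: private::Sealed {
    const NAME: &'static str;
}

pub struct Open;
pub struct Closed;

impl private::Sealed for Open {}
impl private::Sealed for Closed {}

impl State for Open {
    const NAME: &'static str = "open";
}

impl State for Closed {
    const NAME: &'static str = "closed";
}

// The bound restricts S to the states defined in this crate.
pub struct Door<S: State> {
    _state: PhantomData<S>,
}

impl<S: State> Door<S> {
    pub fn state_name(&self) -> &'static str {
        S::NAME
    }
}

impl Door<Closed> {
    pub fn new() -> Self {
        Door { _state: PhantomData }
    }

    pub fn open(self) -> Door<Open> {
        Door { _state: PhantomData }
    }
}

impl Door<Open> {
    pub fn close(self) -> Door<Closed> {
        Door { _state: PhantomData }
    }
}

/// Reads the state name before and after opening the door.
fn door_names() -> (&'static str, &'static str) {
    let door = Door::new();
    let closed = door.state_name();
    let door = door.open();
    (closed, door.state_name())
}
```

Without the bound, a user of the library could write Door<String> in a signature. With it, the set of states is closed, and the associated constant offers a small amount of runtime introspection.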

The pattern shines in library APIs, protocol implementations, and anywhere that preventing misuse is more important than flexibility. When you find yourself writing runtime assertions about state validity, consider whether typestate could move those checks to compile time.

Your users will thank you for catching their mistakes before the code runs.