Rust State Machine Pattern: Typestate Programming
Key Insights
- Typestate programming uses Rust’s type system to encode state machines at compile time, making invalid state transitions impossible rather than just catching them at runtime
- The pattern relies on zero-sized type markers and consuming self in transition methods to enforce linear progression through states without runtime overhead
- While powerful for API builders and protocol implementations, typestate adds complexity that’s only justified when preventing invalid states is critical to correctness
Introduction to Typestate Programming
Most developers model state machines using enums and runtime checks. You’ve probably written code like this:
enum TrafficLight {
    Red,
    Yellow,
    Green,
}

impl TrafficLight {
    fn next(&mut self) {
        *self = match self {
            TrafficLight::Red => TrafficLight::Green,
            TrafficLight::Green => TrafficLight::Yellow,
            TrafficLight::Yellow => TrafficLight::Red,
        }
    }
}
This works, but it pushes validation to runtime. If you accidentally allow Red → Yellow, you won’t know until the code executes.
Typestate programming takes a different approach: encode states as distinct types. Invalid transitions become compile errors:
use std::marker::PhantomData;

// Zero-sized marker types, one per state
struct Red;
struct Yellow;
struct Green;

struct TrafficLight<S> {
    _state: PhantomData<S>,
}

impl TrafficLight<Red> {
    fn next(self) -> TrafficLight<Green> {
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Green> {
    fn next(self) -> TrafficLight<Yellow> {
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Yellow> {
    fn next(self) -> TrafficLight<Red> {
        TrafficLight { _state: PhantomData }
    }
}
Now TrafficLight<Red> can only transition to TrafficLight<Green>. Try to go from red to yellow, and your code won’t compile. This is typestate: using types to represent states and the type system to enforce valid transitions.
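As a rough usage sketch, the cycle can be walked with explicit type annotations so each step's state is visible. The new() constructor and the full_cycle() helper are hypothetical additions for illustration; the snippet above does not define them.

```rust
use std::marker::PhantomData;

struct Red;
struct Yellow;
struct Green;

struct TrafficLight<S> {
    _state: PhantomData<S>,
}

impl TrafficLight<Red> {
    // Hypothetical constructor: assume every light starts out red.
    fn new() -> Self {
        TrafficLight { _state: PhantomData }
    }

    fn next(self) -> TrafficLight<Green> {
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Green> {
    fn next(self) -> TrafficLight<Yellow> {
        TrafficLight { _state: PhantomData }
    }
}

impl TrafficLight<Yellow> {
    fn next(self) -> TrafficLight<Red> {
        TrafficLight { _state: PhantomData }
    }
}

// Hypothetical helper: one full cycle, with the state spelled out at each step.
fn full_cycle(light: TrafficLight<Red>) -> TrafficLight<Red> {
    let green: TrafficLight<Green> = light.next();
    let yellow: TrafficLight<Yellow> = green.next();
    // Rejected by the compiler (mismatched types), because red only leads to green:
    // let skipped: TrafficLight<Yellow> = TrafficLight::<Red>::new().next();
    yellow.next()
}
```

Because next() takes self by value, the green binding cannot be used again after green.next(); the old state is gone once the transition has happened.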
Basic Typestate Implementation
The core mechanics are straightforward:
- Define zero-sized types for each state
- Make your main type generic over state
- Consume self in transition methods to prevent reuse
- Return a new instance in the target state
Here’s a TCP connection model:
use std::marker::PhantomData;

// State markers
struct Closed;
struct Listening;
struct Established;

struct TcpConnection<S> {
    address: String,
    _state: PhantomData<S>,
}

impl TcpConnection<Closed> {
    fn new(address: String) -> Self {
        TcpConnection {
            address,
            _state: PhantomData,
        }
    }

    fn listen(self) -> TcpConnection<Listening> {
        println!("Binding to {}", self.address);
        TcpConnection {
            address: self.address,
            _state: PhantomData,
        }
    }
}

impl TcpConnection<Listening> {
    fn accept(self) -> TcpConnection<Established> {
        println!("Accepting connection on {}", self.address);
        TcpConnection {
            address: self.address,
            _state: PhantomData,
        }
    }
}

impl TcpConnection<Established> {
    // Borrows rather than consumes: sending does not change the state
    fn send(&self, data: &[u8]) {
        println!("Sending {} bytes", data.len());
    }

    fn close(self) -> TcpConnection<Closed> {
        println!("Closing connection");
        TcpConnection {
            address: self.address,
            _state: PhantomData,
        }
    }
}
Usage enforces the state machine:
let conn = TcpConnection::new("127.0.0.1:8080".to_string());
let conn = conn.listen();
let conn = conn.accept();
conn.send(b"Hello");
// This won't compile - can't send on a listening connection:
// let conn = TcpConnection::new("127.0.0.1:8080".to_string());
// let conn = conn.listen();
// conn.send(b"Hello"); // ERROR: method not found
The compiler prevents you from calling send() on anything except TcpConnection<Established>. No runtime checks needed.
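The "no runtime overhead" part can be checked directly with std::mem::size_of. The sketch below reuses a trimmed copy of the connection type; the sizes() helper is a hypothetical name introduced only for this check, and the struct layout claim reflects what current compilers produce rather than a formal guarantee for default-layout structs.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

struct Closed;
struct Established;

struct TcpConnection<S> {
    address: String,
    _state: PhantomData<S>,
}

// Hypothetical helper: (marker size, PhantomData size, bytes the state adds to the struct).
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<Closed>(),
        size_of::<PhantomData<Established>>(),
        // The only field that occupies memory is the String.
        size_of::<TcpConnection<Established>>() - size_of::<String>(),
    )
}
```

All three numbers are expected to be zero: the state exists only in the type, so a transition amounts to moving the address field into a value of a different type.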
Advanced Typestate Techniques
Real applications need state-specific data and behavior. Here’s a document workflow in which each state is its own struct carrying state-dependent fields:
// Each state is its own struct carrying the data that state needs,
// so no separate zero-sized markers are required here.
struct Document<S> {
    content: String,
    state_data: S,
}

struct DraftData {
    author: String,
    last_saved: std::time::SystemTime,
}

struct ReviewData {
    author: String,
    reviewer: String,
    comments: Vec<String>,
}

struct PublishedData {
    author: String,
    published_date: std::time::SystemTime,
    url: String,
}

impl Document<DraftData> {
    fn new(author: String, content: String) -> Self {
        Document {
            content,
            state_data: DraftData {
                author,
                last_saved: std::time::SystemTime::now(),
            },
        }
    }

    // Editing is only available while the document is a draft
    fn edit(&mut self, new_content: String) {
        self.content = new_content;
        self.state_data.last_saved = std::time::SystemTime::now();
    }

    fn submit_for_review(self, reviewer: String) -> Document<ReviewData> {
        Document {
            content: self.content,
            state_data: ReviewData {
                author: self.state_data.author,
                reviewer,
                comments: Vec::new(),
            },
        }
    }
}

impl Document<ReviewData> {
    fn add_comment(&mut self, comment: String) {
        self.state_data.comments.push(comment);
    }

    fn approve(self) -> Document<PublishedData> {
        Document {
            content: self.content,
            state_data: PublishedData {
                author: self.state_data.author,
                published_date: std::time::SystemTime::now(),
                // Requires the external uuid crate with its v4 feature enabled
                url: format!("/docs/{}", uuid::Uuid::new_v4()),
            },
        }
    }

    // Rejection sends the document back to the draft state
    fn reject(self) -> Document<DraftData> {
        Document {
            content: self.content,
            state_data: DraftData {
                author: self.state_data.author,
                last_saved: std::time::SystemTime::now(),
            },
        }
    }
}

impl Document<PublishedData> {
    fn get_url(&self) -> &str {
        &self.state_data.url
    }
}
Each state carries different data. You can’t call get_url() on a draft, and you can’t edit a published document. The type system enforces your business rules.
Handling State Data and Transitions
Fallible transitions are common in real systems. Here’s a file upload handler with validation:
struct FileUpload<S> {
    filename: String,
    data: Vec<u8>,
    state: S,
}

// One struct per state, each holding that state's data
struct PendingState;

struct ValidatedState {
    mime_type: String,
    size: usize,
}

struct ProcessingState {
    mime_type: String,
    progress: f32,
}

struct CompleteState {
    storage_path: String,
    checksum: String,
}

impl FileUpload<PendingState> {
    fn new(filename: String, data: Vec<u8>) -> Self {
        FileUpload {
            filename,
            data,
            state: PendingState,
        }
    }

    fn validate(self) -> Result<FileUpload<ValidatedState>, String> {
        if self.data.is_empty() {
            return Err("File is empty".to_string());
        }

        let mime_type = if self.filename.ends_with(".txt") {
            "text/plain"
        } else if self.filename.ends_with(".jpg") {
            "image/jpeg"
        } else {
            return Err("Unsupported file type".to_string());
        };

        // Read the length before self.data is moved into the new value
        let size = self.data.len();

        Ok(FileUpload {
            filename: self.filename,
            data: self.data,
            state: ValidatedState {
                mime_type: mime_type.to_string(),
                size,
            },
        })
    }
}

impl FileUpload<ValidatedState> {
    fn start_processing(self) -> FileUpload<ProcessingState> {
        FileUpload {
            filename: self.filename,
            data: self.data,
            state: ProcessingState {
                mime_type: self.state.mime_type,
                progress: 0.0,
            },
        }
    }
}

impl FileUpload<ProcessingState> {
    fn complete(self, storage_path: String) -> FileUpload<CompleteState> {
        // Requires the external md5 crate
        let checksum = format!("{:x}", md5::compute(&self.data));
        FileUpload {
            filename: self.filename,
            data: self.data,
            state: CompleteState {
                storage_path,
                checksum,
            },
        }
    }
}

impl FileUpload<CompleteState> {
    fn get_checksum(&self) -> &str {
        &self.state.checksum
    }
}
Notice how validate() returns a Result. Fallible transitions work naturally with typestate—you either get the new state or an error.
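One consequence worth noting: because validate() consumes self, a failed validation drops the pending upload along with its data. If callers should be able to repair and retry, one option is to hand the original value back inside the error. The sketch below is a trimmed-down variant under that assumption; the (Self, String) error shape and the validate_or_fill() helper are hypothetical and not part of the handler above.

```rust
#[derive(Debug)]
struct PendingState;

#[derive(Debug)]
struct ValidatedState {
    size: usize,
}

#[derive(Debug)]
struct FileUpload<S> {
    filename: String,
    data: Vec<u8>,
    state: S,
}

impl FileUpload<PendingState> {
    fn new(filename: String, data: Vec<u8>) -> Self {
        FileUpload { filename, data, state: PendingState }
    }

    // On failure, return the untouched pending upload together with the reason.
    fn validate(self) -> Result<FileUpload<ValidatedState>, (Self, String)> {
        if self.data.is_empty() {
            return Err((self, "File is empty".to_string()));
        }
        let size = self.data.len();
        Ok(FileUpload {
            filename: self.filename,
            data: self.data,
            state: ValidatedState { size },
        })
    }
}

// Hypothetical caller: if validation fails, append fallback bytes and try once more.
fn validate_or_fill(
    upload: FileUpload<PendingState>,
    fallback: &[u8],
) -> Result<FileUpload<ValidatedState>, String> {
    match upload.validate() {
        Ok(valid) => Ok(valid),
        Err((mut pending, _reason)) => {
            pending.data.extend_from_slice(fallback);
            pending.validate().map_err(|(_, reason)| reason)
        }
    }
}
```

The trade-off is a noisier error type; when the caller never retries, the plain Result<_, String> shown earlier is simpler.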
Real-World Application: API Client Builder
The builder pattern is where typestate truly shines. Here’s an HTTP client that requires authentication and a base URL before making requests:
use std::marker::PhantomData;

// Markers for the two independent configuration axes
struct NoAuth;
struct WithAuth;
struct NoBaseUrl;
struct WithBaseUrl;

struct ApiClient<A, B> {
    auth_token: Option<String>,
    base_url: Option<String>,
    timeout: u64,
    _auth: PhantomData<A>,
    _base: PhantomData<B>,
}

impl ApiClient<NoAuth, NoBaseUrl> {
    fn new() -> Self {
        ApiClient {
            auth_token: None,
            base_url: None,
            timeout: 30,
            _auth: PhantomData,
            _base: PhantomData,
        }
    }
}

// Available whatever the base-URL state is
impl<B> ApiClient<NoAuth, B> {
    fn with_auth(self, token: String) -> ApiClient<WithAuth, B> {
        ApiClient {
            auth_token: Some(token),
            base_url: self.base_url,
            timeout: self.timeout,
            _auth: PhantomData,
            _base: PhantomData,
        }
    }
}

// Available whatever the auth state is
impl<A> ApiClient<A, NoBaseUrl> {
    fn with_base_url(self, url: String) -> ApiClient<A, WithBaseUrl> {
        ApiClient {
            auth_token: self.auth_token,
            base_url: Some(url),
            timeout: self.timeout,
            _auth: PhantomData,
            _base: PhantomData,
        }
    }
}

impl<A, B> ApiClient<A, B> {
    fn with_timeout(mut self, timeout: u64) -> Self {
        self.timeout = timeout;
        self
    }
}

// Only fully configured clients can make requests
impl ApiClient<WithAuth, WithBaseUrl> {
    fn get(&self, path: &str) -> String {
        // The unwraps cannot fail: the type parameters prove both values were set
        format!(
            "GET {}{} with auth token {}",
            self.base_url.as_ref().unwrap(),
            path,
            self.auth_token.as_ref().unwrap()
        )
    }
}

// Usage:
fn main() {
    let client = ApiClient::new()
        .with_auth("secret-token".to_string())
        .with_base_url("https://api.example.com".to_string())
        .with_timeout(60);
    let response = client.get("/users");
    println!("{}", response);

    // This won't compile - missing auth:
    // let client = ApiClient::new().with_base_url("https://api.example.com".to_string());
    // client.get("/users"); // ERROR
}
You literally cannot create a client that makes requests without proper configuration. This eliminates an entire class of runtime errors.
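The builder above still stores its values in Option fields and calls unwrap() in get(), relying on the type parameters to make that safe. A possible variant, sketched here as an assumption rather than as the article's design, lets the state types hold the data themselves so the Option and unwrap() disappear. The tuple-struct states WithAuth(String) and WithBaseUrl(String) are hypothetical replacements for the unit markers, and the timeout setter is omitted for brevity.

```rust
struct NoAuth;
struct WithAuth(String);
struct NoBaseUrl;
struct WithBaseUrl(String);

struct ApiClient<A, B> {
    auth: A,
    base_url: B,
    timeout: u64,
}

impl ApiClient<NoAuth, NoBaseUrl> {
    fn new() -> Self {
        ApiClient { auth: NoAuth, base_url: NoBaseUrl, timeout: 30 }
    }
}

impl<B> ApiClient<NoAuth, B> {
    fn with_auth(self, token: String) -> ApiClient<WithAuth, B> {
        ApiClient { auth: WithAuth(token), base_url: self.base_url, timeout: self.timeout }
    }
}

impl<A> ApiClient<A, NoBaseUrl> {
    fn with_base_url(self, url: String) -> ApiClient<A, WithBaseUrl> {
        ApiClient { auth: self.auth, base_url: WithBaseUrl(url), timeout: self.timeout }
    }
}

impl ApiClient<WithAuth, WithBaseUrl> {
    // Both values are present by construction, so no Option and no unwrap.
    fn get(&self, path: &str) -> String {
        format!("GET {}{} with auth token {}", self.base_url.0, path, self.auth.0)
    }
}
```

The builder calls still work in either order, and get() reads the token and URL straight from the state fields.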
Trade-offs and Limitations
Typestate isn’t always the answer. The pattern has real costs:
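Compilation time: Generic methods are monomorphized once for every state combination the program actually uses. A type with three independent two-valued state parameters has up to eight combinations, and each one in use gets its own copy of the generic code.
Ergonomics: Error messages can be cryptic. A "method not found" error that mentions ApiClient<NoAuth, WithBaseUrl> doesn’t tell users which builder call they skipped, so the states and their transitions need documentation.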
Complexity: Simple state machines don’t justify typestate overhead. Here’s when a basic enum is better:
// Don't use typestate for this:
enum ConnectionState {
    Idle,
    Connecting,
    Connected,
    Disconnected,
}

struct SimpleConnection {
    state: ConnectionState,
}

impl SimpleConnection {
    fn status(&self) -> &str {
        match self.state {
            ConnectionState::Idle => "idle",
            ConnectionState::Connecting => "connecting",
            ConnectionState::Connected => "connected",
            ConnectionState::Disconnected => "disconnected",
        }
    }
}
If you need to query state or support multiple valid transitions from any state, enums are simpler and more flexible.
Use typestate when:
- Invalid states represent serious bugs or security issues
- The API is complex enough that users will make mistakes
- You’re building a library where compile-time guarantees add value
- State transitions are linear or tree-like (not a complex graph)
Avoid typestate when:
- States change frequently during development
- You need runtime state inspection
- The state machine has many states with complex transitions
- Ergonomics matter more than correctness guarantees
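When most of the code benefits from typestate but a few places need runtime inspection (for example, keeping connections in different states in one collection), a middle ground is to wrap the typed values in an enum at that boundary. The following is a minimal sketch under that assumption; AnyConnection, connect(), and describe() are hypothetical names, and connect() collapses the listen/accept steps for brevity.

```rust
use std::marker::PhantomData;

struct Closed;
struct Established;

struct TcpConnection<S> {
    address: String,
    _state: PhantomData<S>,
}

impl TcpConnection<Closed> {
    fn new(address: String) -> Self {
        TcpConnection { address, _state: PhantomData }
    }

    // Hypothetical single-step transition used only in this sketch.
    fn connect(self) -> TcpConnection<Established> {
        TcpConnection { address: self.address, _state: PhantomData }
    }
}

// Hypothetical wrapper: one variant per typestate, so the state can be matched at runtime.
enum AnyConnection {
    Closed(TcpConnection<Closed>),
    Established(TcpConnection<Established>),
}

impl AnyConnection {
    fn status(&self) -> &'static str {
        match self {
            AnyConnection::Closed(_) => "closed",
            AnyConnection::Established(_) => "established",
        }
    }

    fn address(&self) -> &str {
        match self {
            AnyConnection::Closed(c) => &c.address,
            AnyConnection::Established(c) => &c.address,
        }
    }
}

// Connections in different states can now share a slice or Vec.
fn describe(conns: &[AnyConnection]) -> Vec<String> {
    conns
        .iter()
        .map(|c| format!("{} is {}", c.address(), c.status()))
        .collect()
}
```

Matching on the enum yields the typed value back, so state-specific methods remain compile-time checked inside each match arm.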
Conclusion and Best Practices
Typestate programming transforms runtime invariants into compile-time guarantees. It’s not about being clever—it’s about making entire categories of bugs impossible.
Key patterns to remember:
Start simple: Begin with zero-sized state markers and PhantomData. Add complexity only when needed.
Consume self: Always take self by value in transitions to prevent use-after-transition bugs.
Use Result for fallible transitions: Don’t panic in state transitions. Return Result and let callers handle errors.
Document state requirements: Users need to understand what each state means and what transitions are valid.
Combine with other patterns: Typestate works well with builders, the newtype pattern, and sealed traits for restricting implementations.
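To illustrate the last point, here is a small sketch of typestate combined with a sealed trait. Without a bound, users of a library could instantiate Document<S> with any type of their own; sealing the State trait limits S to the states the library defines. The names sealed, State, NAME, and state_name() are assumptions chosen for this example.

```rust
use std::marker::PhantomData;

// Private module: code outside this crate cannot name Sealed,
// so it cannot implement State for its own types.
mod sealed {
    pub trait Sealed {}
}

pub trait State: sealed::Sealed {
    const NAME: &'static str;
}

pub struct Draft;
pub struct Published;

impl sealed::Sealed for Draft {}
impl sealed::Sealed for Published {}

impl State for Draft {
    const NAME: &'static str = "draft";
}

impl State for Published {
    const NAME: &'static str = "published";
}

pub struct Document<S: State> {
    content: String,
    _state: PhantomData<S>,
}

// Behavior shared by every valid state.
impl<S: State> Document<S> {
    pub fn state_name(&self) -> &'static str {
        S::NAME
    }

    pub fn content(&self) -> &str {
        &self.content
    }
}

impl Document<Draft> {
    pub fn new(content: String) -> Self {
        Document { content, _state: PhantomData }
    }

    pub fn publish(self) -> Document<Published> {
        Document { content: self.content, _state: PhantomData }
    }
}
```

The trait bound also gives shared methods such as state_name() a place to live, which helps with logging and error messages without reintroducing a runtime state field.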
The pattern shines in library APIs, protocol implementations, and anywhere that preventing misuse is more important than flexibility. When you find yourself writing runtime assertions about state validity, consider whether typestate could move those checks to compile time.
Your users will thank you for catching their mistakes before the code runs.