Rust PhantomData: Zero-Sized Type Markers
Key Insights
- PhantomData is a zero-sized type marker that enables compile-time type safety without any runtime overhead—it literally occupies zero bytes in memory
- Use PhantomData to hold unused type parameters, enforce state machine transitions at compile time, and control variance behavior in generic types
- The specific PhantomData pattern you choose (PhantomData<T>, PhantomData<&'a T>, PhantomData<*const T>) determines variance and drop checker behavior for your type
The Problem PhantomData Solves
Rust’s type system is strict about unused type parameters. If you declare a generic type parameter but don’t actually use it in any fields, the compiler will reject your code. This creates a problem when you want to attach type information purely for compile-time checking without storing actual data of that type.
Consider this example that fails to compile:
struct Container<T> {
data: Vec<u8>,
}
// error[E0392]: parameter `T` is never used
// consider removing `T`, referring to it in a field, or using a marker such as `PhantomData`
The compiler is telling us that T serves no purpose—we’re not storing it, so why is it there? But what if we want T to serve as a type-level marker? Perhaps we’re building a type-safe API where Container<Json> behaves differently from Container<Xml> at compile time, even though both store the same underlying bytes.
This is exactly where PhantomData comes in.
What is PhantomData?
PhantomData<T> is a zero-sized type from the standard library that acts as a compile-time marker. It tells the compiler “this type logically owns or is associated with type T” without actually storing any data. The key property: it has absolutely no runtime cost.
Here’s how we fix our previous example:
use std::marker::PhantomData;
struct Container<T> {
data: Vec<u8>,
_marker: PhantomData<T>,
}
impl<T> Container<T> {
fn new(data: Vec<u8>) -> Self {
Container {
data,
_marker: PhantomData,
}
}
}
// Prove it's zero-sized
fn main() {
println!("Size of Container<u32>: {}", std::mem::size_of::<Container<u32>>());
println!("Size of Vec<u8>: {}", std::mem::size_of::<Vec<u8>>());
// Both print the same size—PhantomData adds nothing!
}
The _marker field (prefixed with underscore to indicate it’s intentionally unused) satisfies the compiler’s requirement that T appears in the struct, but adds zero bytes to the struct’s size. We get type-level information for free.
Common Use Cases
PhantomData shines in several practical scenarios.
Type-State Builders
One of the most powerful patterns is using PhantomData to enforce state transitions at compile time. Here’s a configuration builder that ensures you can’t build without setting required fields:
use std::marker::PhantomData;
struct Unset;
struct Set;
struct ConfigBuilder<HostState, PortState> {
host: Option<String>,
port: Option<u16>,
_host_state: PhantomData<HostState>,
_port_state: PhantomData<PortState>,
}
impl ConfigBuilder<Unset, Unset> {
fn new() -> Self {
ConfigBuilder {
host: None,
port: None,
_host_state: PhantomData,
_port_state: PhantomData,
}
}
}
impl<PortState> ConfigBuilder<Unset, PortState> {
fn host(self, host: String) -> ConfigBuilder<Set, PortState> {
ConfigBuilder {
host: Some(host),
port: self.port,
_host_state: PhantomData,
_port_state: PhantomData,
}
}
}
impl<HostState> ConfigBuilder<HostState, Unset> {
fn port(self, port: u16) -> ConfigBuilder<HostState, Set> {
ConfigBuilder {
host: self.host,
port: Some(port),
_host_state: PhantomData,
_port_state: PhantomData,
}
}
}
impl ConfigBuilder<Set, Set> {
fn build(self) -> Config {
Config {
host: self.host.unwrap(),
port: self.port.unwrap(),
}
}
}
struct Config {
host: String,
port: u16,
}
fn main() {
let config = ConfigBuilder::new()
.host("localhost".to_string())
.port(8080)
.build();
// This won't compile—build() only exists when both are Set:
// let bad_config = ConfigBuilder::new().build();
}
The type system enforces that build() is only available when both host and port have been set. No runtime checks needed.
Ownership Markers for Raw Pointers
When working with raw pointers in unsafe code, PhantomData helps express ownership semantics:
use std::marker::PhantomData;
use std::ptr::NonNull;
struct UniquePtr<T> {
ptr: NonNull<T>,
_marker: PhantomData<T>, // Acts like we own a T
}
impl<T> UniquePtr<T> {
unsafe fn new(ptr: *mut T) -> Option<Self> {
NonNull::new(ptr).map(|ptr| UniquePtr {
ptr,
_marker: PhantomData,
})
}
}
impl<T> Drop for UniquePtr<T> {
fn drop(&mut self) {
unsafe {
// Run T's destructor in place; the caller of new() promised a valid, uniquely owned pointer
std::ptr::drop_in_place(self.ptr.as_ptr());
}
}
}
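Here PhantomData<T> records that UniquePtr<T> logically owns a T. On current compilers the Drop impl alone already makes the drop checker assume that dropping UniquePtr<T> may drop a T, and NonNull<T> already uses the type parameter, so the marker mainly documents intent. It becomes load-bearing if you later opt out of that assumption with the unstable #[may_dangle] attribute, where omitting it could lead to unsound behavior with lifetimes.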
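As a rough illustration of the ownership idea, here is a minimal sketch of an owning pointer that also frees its allocation. The names OwnedPtr and Noisy are hypothetical and not part of the example above; the sketch assumes the value is always allocated through Box, which is what makes rebuilding the Box in drop valid.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::ptr::NonNull;

// Hypothetical owning pointer (illustrative name, not from the article).
struct OwnedPtr<T> {
    ptr: NonNull<T>,
    _marker: PhantomData<T>, // documents that we logically own a T
}

impl<T> OwnedPtr<T> {
    fn new(value: T) -> Self {
        // Box::leak hands back a reference to the heap allocation,
        // which NonNull::from converts without any unsafe code.
        OwnedPtr {
            ptr: NonNull::from(Box::leak(Box::new(value))),
            _marker: PhantomData,
        }
    }

    fn get(&self) -> &T {
        // Sound under this sketch's assumption: the allocation lives until drop.
        unsafe { self.ptr.as_ref() }
    }
}

impl<T> Drop for OwnedPtr<T> {
    fn drop(&mut self) {
        // Rebuild the Box so T's destructor runs and the memory is freed.
        unsafe { drop(Box::from_raw(self.ptr.as_ptr())) };
    }
}

// Hypothetical helper that counts how many times it has been dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

fn main() {
    let drops = Cell::new(0);
    {
        let p = OwnedPtr::new(Noisy(&drops));
        assert_eq!(p.get().0.get(), 0); // nothing dropped yet
    } // p goes out of scope here and drops the Noisy it owns
    assert_eq!(drops.get(), 1);
}
```

The marker adds no bytes here either: OwnedPtr<T> is the size of a single pointer.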
Variance and Drop Check
PhantomData’s type parameter pattern determines how your type behaves with respect to variance—whether it’s covariant, contravariant, or invariant. This matters when dealing with subtyping and lifetimes.
Different patterns have different meanings:
- PhantomData<T>: Covariant over T (acts like owning a T)
- PhantomData<&'a T>: Covariant over 'a and T (acts like owning a &'a T)
- PhantomData<&'a mut T>: Covariant over 'a, invariant over T
- PhantomData<*const T>: Covariant over T (acts like owning a *const T)
- PhantomData<fn(T)>: Contravariant over T (rarely needed)
Here’s why this matters:
use std::marker::PhantomData;
struct Covariant<'a, T> {
_marker: PhantomData<&'a T>,
}
struct Invariant<'a, T> {
_marker: PhantomData<&'a mut T>,
}
// Compiles: Covariant is covariant over both 'a and T, so every lifetime may shrink
fn covariant_example<'short>(cov: Covariant<'static, &'static str>) -> Covariant<'short, &'short str> {
    cov
}
// Compiles: Invariant is still covariant over 'a, so 'a alone may shrink
fn invariant_lifetime_example<'short>(inv: Invariant<'static, &'static str>) -> Invariant<'short, &'static str> {
    inv
}
// This would fail: Invariant is invariant over T, so &'static str cannot become &'short str
// fn invariant_example<'short>(inv: Invariant<'static, &'static str>) -> Invariant<'short, &'short str> {
//     inv
// }
For most use cases, PhantomData<T> is the right choice. Use PhantomData<&'a T> when you need to tie a lifetime to your type without actually storing a reference.
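To make the lifetime case concrete, here is a minimal sketch of a view that stores only a raw pointer and a length but still borrows like a &'a [T]. RawView and its methods are hypothetical names chosen for illustration. Without the PhantomData<&'a T> field, the compiler would reject the struct because 'a is never used.

```rust
use std::marker::PhantomData;

// Hypothetical read-only view (illustrative name): raw pointer inside,
// but the borrow checker treats it like a &'a [T].
struct RawView<'a, T> {
    ptr: *const T,
    len: usize,
    _borrow: PhantomData<&'a T>,
}

impl<'a, T> RawView<'a, T> {
    fn new(slice: &'a [T]) -> Self {
        RawView {
            ptr: slice.as_ptr(),
            len: slice.len(),
            _borrow: PhantomData,
        }
    }

    fn get(&self, index: usize) -> Option<&'a T> {
        if index < self.len {
            // In bounds, and 'a guarantees the original slice is still alive.
            Some(unsafe { &*self.ptr.add(index) })
        } else {
            None
        }
    }
}

fn main() {
    let numbers = vec![10, 20, 30];
    let view = RawView::new(&numbers);
    assert_eq!(view.get(1), Some(&20));
    assert_eq!(view.get(3), None);

    // This would not compile: the view still borrows `numbers`.
    // drop(numbers);
    // view.get(0);
}
```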
Real-World Example: Type-Safe State Machine
Let’s build a complete connection state machine where invalid state transitions are caught at compile time:
use std::marker::PhantomData;
// State types
struct Disconnected;
struct Connecting;
struct Connected;
struct Connection<State> {
address: String,
_state: PhantomData<State>,
}
impl Connection<Disconnected> {
fn new(address: String) -> Self {
println!("Created disconnected connection to {}", address);
Connection {
address,
_state: PhantomData,
}
}
fn connect(self) -> Connection<Connecting> {
println!("Initiating connection to {}", self.address);
Connection {
address: self.address,
_state: PhantomData,
}
}
}
impl Connection<Connecting> {
fn complete(self) -> Connection<Connected> {
println!("Connection to {} established", self.address);
Connection {
address: self.address,
_state: PhantomData,
}
}
fn fail(self) -> Connection<Disconnected> {
println!("Connection to {} failed", self.address);
Connection {
address: self.address,
_state: PhantomData,
}
}
}
impl Connection<Connected> {
fn send(&self, data: &str) {
println!("Sending '{}' to {}", data, self.address);
}
fn disconnect(self) -> Connection<Disconnected> {
println!("Disconnecting from {}", self.address);
Connection {
address: self.address,
_state: PhantomData,
}
}
}
fn main() {
let conn = Connection::new("127.0.0.1:8080".to_string());
let conn = conn.connect();
let conn = conn.complete();
conn.send("Hello");
// This won't compile—can't send on a disconnected connection:
// let bad_conn = Connection::new("localhost".to_string());
// bad_conn.send("This won't work");
// This won't compile—can't skip the connecting state:
// let bad_conn = Connection::new("localhost".to_string());
// let bad_conn = bad_conn.complete();
}
This pattern eliminates entire classes of runtime errors. You literally cannot write code that sends data on a disconnected connection—it won’t compile.
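One possible refinement, sketched here under assumptions: behavior that is valid in every state can live in a single generic impl block, and the repeated struct literals can be folded into one helper. The address() and transition() methods are hypothetical additions, and the sketch trims the state machine to two states for brevity. In a real crate, transition() would need to stay private to the module, since a public version would let callers jump to any state.

```rust
use std::marker::PhantomData;

struct Disconnected;
struct Connected;

struct Connection<State> {
    address: String,
    _state: PhantomData<State>,
}

// Methods in this block exist in every state.
impl<State> Connection<State> {
    fn address(&self) -> &str {
        &self.address
    }

    // Hypothetical helper: moves the data across and swaps only the marker.
    // Keep this private so only the state-specific methods can call it.
    fn transition<Next>(self) -> Connection<Next> {
        Connection {
            address: self.address,
            _state: PhantomData,
        }
    }
}

impl Connection<Disconnected> {
    fn new(address: String) -> Self {
        Connection {
            address,
            _state: PhantomData,
        }
    }

    fn connect(self) -> Connection<Connected> {
        self.transition()
    }
}

impl Connection<Connected> {
    fn disconnect(self) -> Connection<Disconnected> {
        self.transition()
    }
}

fn main() {
    let conn = Connection::new("127.0.0.1:8080".to_string());
    assert_eq!(conn.address(), "127.0.0.1:8080");
    let conn = conn.connect();
    assert_eq!(conn.address(), "127.0.0.1:8080");
    let _conn = conn.disconnect();
}
```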
Common Pitfalls and Best Practices
Choose the right PhantomData pattern. For most cases, PhantomData<T> works fine. Only use more specific patterns like PhantomData<&'a T> when you need precise variance control.
Don’t overuse type-state patterns. While compile-time state machines are powerful, they can make APIs harder to use. Reserve them for cases where invalid states would cause serious bugs or safety issues.
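One concrete cost, shown as a rough sketch: once the state is only known at run time (a connection pool, for example), values in different states are different types and cannot share a Vec directly. A common workaround is an enum wrapper, which brings back a runtime match. AnyConnection and describe() are hypothetical names, and new() is made available in every state here purely to keep the sketch short.

```rust
use std::marker::PhantomData;

struct Disconnected;
struct Connected;

struct Connection<State> {
    address: String,
    _state: PhantomData<State>,
}

impl<State> Connection<State> {
    // Simplification for this sketch: constructible in any state.
    fn new(address: &str) -> Self {
        Connection {
            address: address.to_string(),
            _state: PhantomData,
        }
    }
}

// Hypothetical wrapper for when the state is only known at run time.
enum AnyConnection {
    Disconnected(Connection<Disconnected>),
    Connected(Connection<Connected>),
}

impl AnyConnection {
    fn describe(&self) -> String {
        // The runtime check that the type-state pattern removed is back.
        match self {
            AnyConnection::Disconnected(c) => format!("{} (disconnected)", c.address),
            AnyConnection::Connected(c) => format!("{} (connected)", c.address),
        }
    }
}

fn main() {
    // A Vec<Connection<_>> cannot mix states, so the pool stores the enum.
    let pool = vec![
        AnyConnection::Disconnected(Connection::new("10.0.0.1:80")),
        AnyConnection::Connected(Connection::new("10.0.0.2:80")),
    ];
    let report: Vec<String> = pool.iter().map(AnyConnection::describe).collect();
    assert_eq!(report, ["10.0.0.1:80 (disconnected)", "10.0.0.2:80 (connected)"]);
}
```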
Remember it’s truly zero-cost. Don’t hesitate to use PhantomData when it improves type safety. Here’s a before-and-after showing the transition from runtime to compile-time checking:
// Runtime checking—error prone
struct RuntimeContainer {
data: Vec<u8>,
format: String, // "json" or "xml"
}
impl RuntimeContainer {
fn parse_json(&self) -> Result<(), String> {
if self.format != "json" {
return Err("Not a JSON container".to_string());
}
// parse logic
Ok(())
}
}
// Compile-time checking: calling the wrong parser is a compile error
use std::marker::PhantomData;
struct Json;
struct Xml;
struct CompiletimeContainer<T> {
data: Vec<u8>,
_format: PhantomData<T>,
}
impl CompiletimeContainer<Json> {
fn parse_json(&self) {
// Always safe—this method only exists for JSON containers
}
}
impl CompiletimeContainer<Xml> {
fn parse_xml(&self) {
// Always safe—this method only exists for XML containers
}
}
In the compile-time version, calling parse_json() on an XML container is a compile error rather than a runtime Err, and the struct no longer carries a format string or pays for a string comparison on each call.
Conclusion
PhantomData is one of Rust’s most elegant features—a zero-sized type that enables sophisticated compile-time guarantees without any runtime cost. Whether you’re building type-safe APIs, implementing state machines, or working with unsafe code, PhantomData gives you the tools to push more invariants into the type system.
The next time you find yourself writing runtime checks for state or format validity, ask whether PhantomData could move those checks to compile time. Your future self—and your users—will thank you for catching bugs before the code even runs.