Rust Procedural Macros: Custom Derive

Key Insights

Procedural macros operate on Rust’s token stream at compile time, enabling code generation that eliminates boilerplate while maintaining type safety
The syn and quote crates form the foundation of procedural macro development—syn parses input tokens into structured AST nodes, while quote generates new Rust code through template-like syntax
Custom derive macros require a separate crate with proc-macro = true, but the investment pays off when you need to implement the same trait pattern across dozens of types

Understanding Procedural Macros

Rust offers two macro systems: declarative macros (defined with macro_rules!) and procedural macros. Declarative macros work through pattern matching, while procedural macros are functions that consume and produce token streams. This fundamental difference gives procedural macros significantly more power.
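
To make the contrast concrete, here is a minimal sketch of the declarative side. The macro name describe_fields and the Point struct are hypothetical, invented for this illustration. Because macro_rules! only matches the tokens it is handed, the caller has to repeat the field names at the call site; a procedural macro could instead read them from the struct definition.

```rust
// Hypothetical declarative macro: it pattern-matches on an expression
// followed by a comma-separated list of field identifiers.
macro_rules! describe_fields {
    ($value:expr, $($field:ident),+ $(,)?) => {
        vec![
            // One format! call is emitted per matched field.
            $(format!("{}: {:?}", stringify!($field), $value.$field)),+
        ]
        .join(", ")
    };
}

struct Point {
    x: i32,
    y: i32,
}

// The field list must be spelled out by hand; the macro cannot inspect
// the definition of Point on its own.
fn describe_point(p: &Point) -> String {
    describe_fields!(p, x, y)
}
```

Calling describe_point on a Point with x = 1 and y = 2 returns the string x: 1, y: 2.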

Procedural macros come in three flavors: derive macros, attribute macros, and function-like macros. Derive macros are the most common because they solve a specific pain point: implementing traits for types with similar patterns. Instead of writing repetitive trait implementations, you annotate your type with #[derive(YourTrait)] and let the macro generate the code.

Consider implementing a simple Describe trait manually:

trait Describe {
    fn describe(&self) -> String;
}

struct User {
    name: String,
    age: u32,
}

impl Describe for User {
    fn describe(&self) -> String {
        format!("User {{ name: {}, age: {} }}", self.name, self.age)
    }
}

With a custom derive macro, this becomes:

#[derive(Describe)]
struct User {
    name: String,
    age: u32,
}

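The macro generates the impl block at compile time, so the boilerplate never appears in your source. (The derive built later in this article formats each field with {:?} rather than {}, so its output differs slightly from the hand-written version above.)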

Project Structure and Dependencies

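Procedural macros must live in a separate crate with proc-macro = true in Cargo.toml. This requirement exists because the compiler builds the macro crate first and then loads it while compiling the crates that use it; in effect, it is a compiler plugin. A proc-macro crate can export only procedural macros, so traits such as Describe have to be defined in an ordinary crate.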

Here’s the typical setup using a Cargo workspace:

# Workspace Cargo.toml
[workspace]
members = ["my_macro", "my_macro_test"]

# my_macro/Cargo.toml
[package]
name = "my_macro"
version = "0.1.0"
edition = "2021"

[lib]
proc-macro = true

[dependencies]
syn = { version = "2.0", features = ["full"] }
quote = "1.0"
proc-macro2 = "1.0"

# my_macro_test/Cargo.toml
[package]
name = "my_macro_test"
version = "0.1.0"
edition = "2021"

[dependencies]
my_macro = { path = "../my_macro" }

The syn crate parses Rust syntax into data structures you can manipulate. The quote crate does the reverse—it converts Rust code (written in a template format) back into tokens. The proc-macro2 crate provides a wrapper around the compiler’s procedural macro API that works in both procedural macros and regular code, making testing easier.

Parsing Input with `syn`

When your derive macro is invoked, it receives a TokenStream representing the annotated item. The syn crate transforms this stream into a DeriveInput structure:

use proc_macro::TokenStream;
use syn::{parse_macro_input, DeriveInput, Data, Fields};

#[proc_macro_derive(Builder)]
pub fn derive_builder(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);
    
    // Access the struct name
    let name = &input.ident;
    
    // Access the fields
    let fields = match &input.data {
        Data::Struct(data) => {
            match &data.fields {
                Fields::Named(fields) => &fields.named,
                _ => panic!("Builder only supports named fields"),
            }
        }
        _ => panic!("Builder only supports structs"),
    };
    
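    // Inspect each field. This runs at compile time, so the output appears
    // in the build log of the crate that uses the derive.
    for field in fields {
        let field_name = &field.ident;
        let field_type = &field.ty;
        // syn types only implement Debug with the "extra-traits" feature,
        // so print the type's tokens instead.
        eprintln!("Field: {:?}, Type: {}", field_name, quote::quote!(#field_type));
    }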
    
    TokenStream::new()
}

The DeriveInput struct contains everything about the annotated item: its name, generics, attributes, and most importantly, its data (struct fields, enum variants, or union fields). For structs with named fields, you iterate through fields.named to access each field’s identifier and type.

Code Generation with `quote`

The quote! macro provides a clean syntax for generating Rust code. Variables are interpolated with #, and you can repeat patterns with #(...)*:

use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, Data, DeriveInput, Fields};

#[proc_macro_derive(Describe)]
pub fn derive_describe(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);
    let name = &input.ident;
    
    let fields = match &input.data {
        Data::Struct(data) => match &data.fields {
            Fields::Named(fields) => &fields.named,
            _ => panic!("Only named fields supported"),
        },
        _ => panic!("Only structs supported"),
    };
    
    let field_descriptions = fields.iter().map(|f| {
        let field_name = &f.ident;
        quote! {
            format!("{}: {:?}", stringify!(#field_name), self.#field_name)
        }
    });
    
    let expanded = quote! {
        impl Describe for #name {
            fn describe(&self) -> String {
                vec![#(#field_descriptions),*].join(", ")
            }
        }
    };
    
    TokenStream::from(expanded)
}

The #(#field_descriptions),* syntax repeats the expression once per field, inserting commas between them. This generates code like vec![format!(...), format!(...), format!(...)].join(", "). Because each field is formatted with {:?}, every field type must implement Debug.
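
To see what the repetition produces, the following sketch writes out by hand roughly what #[derive(Describe)] would expand to for the User struct from earlier. It illustrates the expected shape of the output rather than literal cargo expand output, and it assumes the Describe trait is defined in the same scope.

```rust
trait Describe {
    fn describe(&self) -> String;
}

struct User {
    name: String,
    age: u32,
}

// Hand-written equivalent of the generated impl: one format! call per
// field, collected into a Vec and joined with ", ".
impl Describe for User {
    fn describe(&self) -> String {
        vec![
            format!("{}: {:?}", stringify!(name), self.name),
            format!("{}: {:?}", stringify!(age), self.age),
        ]
        .join(", ")
    }
}
```

For a User with name "Ada" and age 36, describe() returns name: "Ada", age: 36. The quotes around the name are added by the {:?} formatter.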

Complete Example: Builder Pattern

Let’s implement a practical #[derive(Builder)] macro that generates the builder pattern:

use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, Data, DeriveInput, Fields};

#[proc_macro_derive(Builder)]
pub fn derive_builder(input: TokenStream) -> TokenStream {
    let input = parse_macro_input!(input as DeriveInput);
    let name = &input.ident;
    let builder_name = quote::format_ident!("{}Builder", name);
    
    let fields = match &input.data {
        Data::Struct(data) => match &data.fields {
            Fields::Named(fields) => &fields.named,
            _ => panic!("Builder only works with named fields"),
        },
        _ => panic!("Builder only works with structs"),
    };
    
    // Generate builder struct fields (all Option<T>)
    let builder_fields = fields.iter().map(|f| {
        let name = &f.ident;
        let ty = &f.ty;
        quote! {
            #name: Option<#ty>
        }
    });
    
    // Generate setter methods
    let setters = fields.iter().map(|f| {
        let name = &f.ident;
        let ty = &f.ty;
        quote! {
            pub fn #name(mut self, #name: #ty) -> Self {
                self.#name = Some(#name);
                self
            }
        }
    });
    
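    // Generate build method field initialization
    let build_fields = fields.iter().map(|f| {
        let name = &f.ident;
        quote! {
            #name: self.#name.ok_or(concat!("Field '", stringify!(#name), "' not set"))?
        }
    });

    // Collect the field identifiers so builder() can set each one to None.
    // (Repeating #name here would not compile: it is a single identifier,
    // not an iterator, and it names the struct rather than its fields.)
    let field_idents = fields.iter().map(|f| &f.ident);

    let expanded = quote! {
        pub struct #builder_name {
            #(#builder_fields),*
        }

        impl #builder_name {
            #(#setters)*

            pub fn build(self) -> Result<#name, &'static str> {
                Ok(#name {
                    #(#build_fields),*
                })
            }
        }

        impl #name {
            pub fn builder() -> #builder_name {
                #builder_name {
                    #(#field_idents: None),*
                }
            }
        }
    };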
    
    TokenStream::from(expanded)
}

Usage:

#[derive(Builder)]
struct Config {
    host: String,
    port: u16,
    timeout: u64,
}

fn main() {
    let config = Config::builder()
        .host("localhost".to_string())
        .port(8080)
        .timeout(30)
        .build()
        .unwrap();
}
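
As a rough guide to what the derive is meant to produce, the sketch below writes out the expansion for Config by hand. It is an approximation of the generated code, not captured cargo expand output. One deliberate difference: Config is declared pub here, an assumption made so that the public builder API does not expose a private type.

```rust
// Declared pub (unlike the usage above) so the pub builder methods do not
// return a private type.
pub struct Config {
    host: String,
    port: u16,
    timeout: u64,
}

// Builder struct: every field is wrapped in Option so it can start unset.
pub struct ConfigBuilder {
    host: Option<String>,
    port: Option<u16>,
    timeout: Option<u64>,
}

impl ConfigBuilder {
    // One consuming setter per field.
    pub fn host(mut self, host: String) -> Self {
        self.host = Some(host);
        self
    }

    pub fn port(mut self, port: u16) -> Self {
        self.port = Some(port);
        self
    }

    pub fn timeout(mut self, timeout: u64) -> Self {
        self.timeout = Some(timeout);
        self
    }

    // build() returns an error naming the first field that was never set.
    pub fn build(self) -> Result<Config, &'static str> {
        Ok(Config {
            host: self.host.ok_or(concat!("Field '", stringify!(host), "' not set"))?,
            port: self.port.ok_or(concat!("Field '", stringify!(port), "' not set"))?,
            timeout: self.timeout.ok_or(concat!("Field '", stringify!(timeout), "' not set"))?,
        })
    }
}

impl Config {
    // Entry point: a builder with every field unset.
    pub fn builder() -> ConfigBuilder {
        ConfigBuilder {
            host: None,
            port: None,
            timeout: None,
        }
    }
}
```

Because the fields are checked in declaration order, a builder with only host set fails with the message for port.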

Testing and Debugging

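Use cargo expand to see the generated code. Install it with cargo install cargo-expand, then run it in the crate that uses the derive (my_macro_test in the workspace above), since expanding the macro crate itself only shows the macro's own source. Omit --lib if that crate has only a binary target: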

cargo expand --lib

For testing, create integration tests in your test crate:

// Bring the derive macro into scope from the macro crate
use my_macro::Builder;

#[test]
fn test_builder() {
    #[derive(Builder)]
    struct Person {
        name: String,
        age: u32,
    }
    
    let person = Person::builder()
        .name("Alice".to_string())
        .age(30)
        .build()
        .unwrap();
    
    assert_eq!(person.name, "Alice");
    assert_eq!(person.age, 30);
}

#[test]
fn test_builder_missing_field() {
    #[derive(Builder)]
    struct Person {
        name: String,
        age: u32,
    }
    
    let result = Person::builder()
        .name("Bob".to_string())
        .build();
    
    assert!(result.is_err());
}

Best Practices

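Report errors with syn::Error instead of panic!. A panic aborts the macro with a message attached to the derive attribute as a whole, while syn::Error carries a span, so the compiler can point at the offending item: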

use syn::{Error, spanned::Spanned};

let fields = match &input.data {
    Data::Struct(data) => match &data.fields {
        Fields::Named(fields) => &fields.named,
        Fields::Unnamed(_) => {
            return Error::new(
                input.span(),
                "Builder requires named fields (use struct { field: Type } not struct(Type))"
            ).to_compile_error().into();
        }
        Fields::Unit => {
            return Error::new(
                input.span(),
                "Builder cannot be derived for unit structs"
            ).to_compile_error().into();
        }
    },
    Data::Enum(data) => {
        return Error::new(
            data.enum_token.span,
            "Builder cannot be derived for enums"
        ).to_compile_error().into();
    }
    Data::Union(data) => {
        return Error::new(
            data.union_token.span,
            "Builder cannot be derived for unions"
        ).to_compile_error().into();
    }
};

Support generics by including them in generated code:

let generics = &input.generics;
let (impl_generics, ty_generics, where_clause) = generics.split_for_impl();

let expanded = quote! {
    impl #impl_generics Describe for #name #ty_generics #where_clause {
        // implementation
    }
};
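
To show where each of the three pieces ends up, here is a hand-written sketch of the impl this would generate for a hypothetical generic struct Pair. The struct and the describe body (borrowed from the earlier Describe derive) are assumptions made for illustration.

```rust
use std::fmt::Debug;

trait Describe {
    fn describe(&self) -> String;
}

// Hypothetical generic struct with an inline bound and a where clause.
struct Pair<T: Debug, U>
where
    U: Debug,
{
    left: T,
    right: U,
}

// Hand-written equivalent of the generated impl:
//   #impl_generics -> <T: Debug, U>   (parameters with their bounds)
//   #ty_generics   -> <T, U>          (parameter names only)
//   #where_clause  -> where U: Debug  (copied from the struct)
impl<T: Debug, U> Describe for Pair<T, U>
where
    U: Debug,
{
    fn describe(&self) -> String {
        vec![
            format!("{}: {:?}", stringify!(left), self.left),
            format!("{}: {:?}", stringify!(right), self.right),
        ]
        .join(", ")
    }
}
```

Note that split_for_impl only copies the bounds already present on the struct. If Pair were declared without the Debug bounds, the generated impl would fail to compile, because {:?} requires Debug on both field types.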

Use derive macros when you’re implementing the same trait pattern across many types. For one-off implementations or when you need runtime flexibility, regular trait implementations or attribute macros may be more appropriate. The upfront complexity of procedural macros pays dividends at scale.