Go atomic Package: Lock-Free Operations

Key Insights

  • Atomic operations provide lock-free synchronization with significantly lower overhead than mutexes for simple read-modify-write operations, but only work for primitive types and require consistent usage across all access points
  • The sync/atomic package guarantees atomicity and establishes happens-before relationships between goroutines, making it suitable for counters, flags, and configuration updates without explicit locking
  • CompareAndSwap (CAS) enables building sophisticated lock-free data structures, but increased complexity and subtle memory ordering concerns mean mutexes remain the better choice for most concurrent programming scenarios

Introduction to Lock-Free Programming

Concurrent programming in Go typically involves protecting shared data with mutexes. While effective, mutexes introduce overhead: goroutines block waiting for locks, the scheduler gets involved, and context switches occur. For simple operations like incrementing a counter or setting a flag, this overhead is often unnecessary.

Lock-free programming using atomic operations provides an alternative. Instead of acquiring locks, atomic operations guarantee that read-modify-write sequences complete without interruption, even when multiple goroutines access the same memory location simultaneously.

Consider a simple counter accessed by multiple goroutines:

// Mutex-based approach
type MutexCounter struct {
    mu    sync.Mutex
    value int64
}

func (c *MutexCounter) Increment() {
    c.mu.Lock()
    c.value++
    c.mu.Unlock()
}

func (c *MutexCounter) Value() int64 {
    c.mu.Lock()
    defer c.mu.Unlock()
    return c.value
}

// Atomic approach
type AtomicCounter struct {
    value int64
}

func (c *AtomicCounter) Increment() {
    atomic.AddInt64(&c.value, 1)
}

func (c *AtomicCounter) Value() int64 {
    return atomic.LoadInt64(&c.value)
}

The atomic version is simpler, faster, and doesn’t block. However, it only works for specific types and operations. Understanding when and how to use atomics is crucial for writing efficient concurrent Go code.

Understanding the atomic Package

The sync/atomic package provides low-level atomic memory primitives. These operations are implemented using CPU-specific instructions that guarantee atomicity at the hardware level. The package works with integer types (int32, int64, uint32, uint64, uintptr) and unsafe.Pointer.

The fundamental guarantee is simple: atomic operations appear to execute instantaneously from the perspective of other goroutines. No goroutine can observe a partially completed atomic operation.

Here’s a practical example with multiple goroutines incrementing a shared counter:

package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

func main() {
    var counter int64
    var wg sync.WaitGroup
    
    // Launch 100 goroutines, each incrementing 1000 times
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for j := 0; j < 1000; j++ {
                atomic.AddInt64(&counter, 1)
            }
        }()
    }
    
    wg.Wait()
    fmt.Printf("Final count: %d\n", atomic.LoadInt64(&counter))
    // Output: Final count: 100000
}

Without atomic operations, this code would have race conditions and produce incorrect results. The atomic operations ensure all increments are properly accounted for.

Core Atomic Operations

The atomic package provides five fundamental operation types:

Add performs atomic addition (or subtraction with negative values):

atomic.AddInt64(&counter, 1)   // increment
atomic.AddInt64(&counter, -1)  // decrement
atomic.AddInt64(&counter, 10)  // add 10

Load reads a value atomically:

value := atomic.LoadInt64(&counter)

Store writes a value atomically:

atomic.StoreInt64(&counter, 42)

Swap atomically stores a new value and returns the old value:

oldValue := atomic.SwapInt64(&counter, 100)

CompareAndSwap (CAS) is the most powerful operation. It atomically compares a value with an expected value and, if they match, stores a new value:

swapped := atomic.CompareAndSwapInt64(&counter, 10, 20)
// If counter was 10, it's now 20 and swapped is true
// Otherwise, counter is unchanged and swapped is false

CAS enables building lock-free data structures. Here’s a simple lock-free stack:

type LockFreeStack struct {
    head unsafe.Pointer // *node
}

type node struct {
    value interface{}
    next  unsafe.Pointer // *node
}

func (s *LockFreeStack) Push(value interface{}) {
    newNode := &node{value: value}
    for {
        oldHead := atomic.LoadPointer(&s.head)
        newNode.next = oldHead
        if atomic.CompareAndSwapPointer(&s.head, oldHead, unsafe.Pointer(newNode)) {
            return
        }
        // CAS failed, retry
    }
}

func (s *LockFreeStack) Pop() (interface{}, bool) {
    for {
        oldHead := atomic.LoadPointer(&s.head)
        if oldHead == nil {
            return nil, false
        }
        oldNode := (*node)(oldHead)
        newHead := atomic.LoadPointer(&oldNode.next)
        if atomic.CompareAndSwapPointer(&s.head, oldHead, newHead) {
            return oldNode.value, true
        }
        // CAS failed, retry
    }
}

Atomic swap is useful for hot-reloading configuration:

type Config struct {
    maxConnections int
    timeout        time.Duration
}

var configPtr unsafe.Pointer // *Config

func UpdateConfig(newConfig *Config) *Config {
    return (*Config)(atomic.SwapPointer(&configPtr, unsafe.Pointer(newConfig)))
}

func GetConfig() *Config {
    return (*Config)(atomic.LoadPointer(&configPtr))
}

The atomic.Value Type

Working with unsafe.Pointer is error-prone. The atomic.Value type provides a safer way to store arbitrary types atomically:

type Config struct {
    MaxConnections int
    Timeout        time.Duration
    EnableLogging  bool
}

type ConfigCache struct {
    value atomic.Value // stores *Config
}

func (c *ConfigCache) Store(config *Config) {
    c.value.Store(config)
}

func (c *ConfigCache) Load() *Config {
    v := c.value.Load()
    if v == nil {
        return nil
    }
    return v.(*Config)
}

// Usage
func main() {
    cache := &ConfigCache{}
    
    // Store initial config
    cache.Store(&Config{
        MaxConnections: 100,
        Timeout:        30 * time.Second,
        EnableLogging:  true,
    })
    
    // Multiple goroutines can safely read
    go func() {
        config := cache.Load()
        fmt.Printf("Max connections: %d\n", config.MaxConnections)
    }()
    
    // Hot reload
    cache.Store(&Config{
        MaxConnections: 200,
        Timeout:        60 * time.Second,
        EnableLogging:  false,
    })
}

Important: atomic.Value requires type consistency. Once you store a value of a particular type, all subsequent stores must use the same concrete type. Storing a different type panics.

Memory Ordering and Happens-Before Guarantees

Atomic operations in Go provide happens-before guarantees. A happens-before relationship ensures that memory writes by one goroutine are visible to reads by another goroutine.

Specifically, an atomic write happens-before any atomic read that observes that write. This makes atomics suitable for synchronization:

type Message struct {
    data  string
    ready int32 // atomic flag
}

// Producer
func produce(msg *Message) {
    msg.data = "important data"
    atomic.StoreInt32(&msg.ready, 1) // Everything before this is visible after
}

// Consumer
func consume(msg *Message) {
    for atomic.LoadInt32(&msg.ready) == 0 {
        runtime.Gosched() // yield to other goroutines
    }
    // After seeing ready == 1, msg.data is guaranteed visible
    fmt.Println(msg.data)
}

The atomic store of ready synchronizes with the atomic load, establishing a happens-before relationship. All writes before the store (including msg.data) are visible after the load returns 1.

Performance Considerations and Benchmarks

Atomic operations are fast, but the margin over mutexes depends on workload. Under high contention, both approaches degrade, but differently:

func BenchmarkAtomicIncrement(b *testing.B) {
    var counter int64
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            atomic.AddInt64(&counter, 1)
        }
    })
}

func BenchmarkMutexIncrement(b *testing.B) {
    var mu sync.Mutex
    var counter int64
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            mu.Lock()
            counter++
            mu.Unlock()
        }
    })
}

On my machine with 8 cores:

  • Low contention (1 goroutine): atomic ~0.3ns, mutex ~15ns
  • High contention (8 goroutines): atomic ~20ns, mutex ~80ns

Atomics are consistently faster for simple operations. However, mutexes are better when:

  • You need to protect multiple related values
  • Critical sections contain complex logic
  • Code clarity matters more than nanoseconds

Common Patterns and Pitfalls

Pattern: Graceful shutdown flag

type Server struct {
    shutdown int32
}

func (s *Server) Shutdown() {
    atomic.StoreInt32(&s.shutdown, 1)
}

func (s *Server) IsShutdown() bool {
    return atomic.LoadInt32(&s.shutdown) == 1
}

func (s *Server) Run() {
    for !s.IsShutdown() {
        // process requests
    }
}

Pitfall: Inconsistent access

type Counter struct {
    value int64
}

func (c *Counter) Increment() {
    atomic.AddInt64(&c.value, 1) // atomic
}

func (c *Counter) Value() int64 {
    return c.value // BUG: non-atomic read!
}

This is a race condition. Once you use atomic operations on a variable, all accesses must be atomic. Use atomic.LoadInt64(&c.value) instead.

Pitfall: Type mismatch with atomic.Value

var v atomic.Value
v.Store("string")
v.Store(42) // PANIC: different type

Atomic operations are powerful tools for specific scenarios. Use them for simple synchronization primitives like counters, flags, and single-value caches. For anything more complex, reach for mutexes or channels. The best concurrent code is correct first, fast second.
