Go bufio: Buffered I/O Operations

Key Insights

  • Buffered I/O reduces system calls by batching read/write operations in memory, often improving performance by 10-100x for small operations
  • The bufio package provides three primary types: Reader for efficient reading, Writer for batched writes with explicit flushing, and Scanner for line/token-based text processing
  • Always flush buffered writers before closing and consider custom buffer sizes for specific workloads—the default 4KB buffer isn’t optimal for all scenarios

Introduction to Buffered I/O

Every system call has overhead. When you read or write data byte-by-byte or in small chunks, your program spends more time context-switching to the kernel than actually processing data. Buffered I/O solves this by maintaining an in-memory buffer that batches operations, drastically reducing the number of expensive system calls.

The bufio package wraps io.Reader and io.Writer interfaces with buffering capabilities. Instead of issuing a thousand 10-byte reads against the disk, you issue a handful of 4KB reads and serve those small requests from memory. The performance difference is substantial.

Here’s a concrete example comparing unbuffered and buffered file reading:

package main

import (
    "bufio"
    "fmt"
    "io"
    "os"
    "time"
)

func unbufferedRead(filename string) error {
    f, err := os.Open(filename)
    if err != nil {
        return err
    }
    defer f.Close()

    buf := make([]byte, 1)
    for {
        _, err := f.Read(buf)
        if err == io.EOF {
            break
        }
        if err != nil {
            return err
        }
    }
    return nil
}

func bufferedRead(filename string) error {
    f, err := os.Open(filename)
    if err != nil {
        return err
    }
    defer f.Close()

    reader := bufio.NewReader(f)
    buf := make([]byte, 1)
    for {
        _, err := reader.Read(buf)
        if err == io.EOF {
            break
        }
        if err != nil {
            return err
        }
    }
    return nil
}

func main() {
    filename := "largefile.txt" // 10MB test file

    start := time.Now()
    unbufferedRead(filename)
    fmt.Printf("Unbuffered: %v\n", time.Since(start))

    start = time.Now()
    bufferedRead(filename)
    fmt.Printf("Buffered: %v\n", time.Since(start))
}

On a typical system, the buffered version runs 50-100x faster. The unbuffered approach makes 10 million system calls; the buffered version makes around 2,500.

The bufio.Reader

bufio.Reader is your go-to for efficient reading operations. It provides multiple methods for different reading patterns, all backed by the same internal buffer.

Reading line-by-line is one of the most common patterns:

func readLines(filename string) error {
    f, err := os.Open(filename)
    if err != nil {
        return err
    }
    defer f.Close()

    reader := bufio.NewReader(f)
    for {
        line, err := reader.ReadString('\n')
        if err != nil && err != io.EOF {
            return err
        }
        
        // Process line (includes the \n delimiter)
        fmt.Print(line)
        
        if err == io.EOF {
            break
        }
    }
    return nil
}

For binary delimiters or when you need a byte slice instead of a string, use ReadBytes():

func readUntilDelimiter(r *bufio.Reader) ([]byte, error) {
    // Read until null byte
    data, err := r.ReadBytes(0x00)
    if err != nil {
        return nil, err
    }
    // data includes the delimiter; strip it before returning
    return data[:len(data)-1], nil
}

The Peek() method lets you look ahead without consuming bytes—useful for protocol parsing:

func parseProtocol(r *bufio.Reader) error {
    // Check magic bytes without consuming them
    magic, err := r.Peek(4)
    if err != nil {
        return err
    }
    
    if string(magic) != "HTTP" {
        return fmt.Errorf("invalid protocol")
    }
    
    // Now consume the header; io.ReadFull guarantees all 4 bytes
    // are read (a bare Read may return fewer than requested)
    header := make([]byte, 4)
    _, err = io.ReadFull(r, header)
    return err
}

For fixed-size reads, the standard Read() method works with the buffer:

func readChunks(r *bufio.Reader) error {
    chunk := make([]byte, 512)
    for {
        n, err := r.Read(chunk)
        if err == io.EOF {
            break
        }
        if err != nil {
            return err
        }
        processChunk(chunk[:n])
    }
    return nil
}

The bufio.Writer

bufio.Writer batches write operations in memory and flushes them in larger chunks. This is critical for performance when writing many small pieces of data.

The most important thing to understand about bufio.Writer: you must call Flush(). Buffered data won’t be written until the buffer fills or you explicitly flush.

func writeBuffered(filename string, lines []string) error {
    f, err := os.Create(filename)
    if err != nil {
        return err
    }
    defer f.Close()

    writer := bufio.NewWriter(f)
    defer writer.Flush() // Critical: flush before file closes

    for _, line := range lines {
        _, err := writer.WriteString(line + "\n")
        if err != nil {
            return err
        }
    }
    
    return nil
}

Here’s a performance comparison showing buffer size impact:

func benchmarkBufferSizes() {
    data := []byte("x")
    iterations := 1000000

    // No buffering
    start := time.Now()
    f, _ := os.Create("test1.txt")
    for i := 0; i < iterations; i++ {
        f.Write(data)
    }
    f.Close()
    fmt.Printf("Unbuffered: %v\n", time.Since(start))

    // Default buffer (4KB)
    start = time.Now()
    f, _ = os.Create("test2.txt")
    w := bufio.NewWriter(f)
    for i := 0; i < iterations; i++ {
        w.Write(data)
    }
    w.Flush()
    f.Close()
    fmt.Printf("Buffered (4KB): %v\n", time.Since(start))

    // Large buffer (64KB)
    start = time.Now()
    f, _ = os.Create("test3.txt")
    w = bufio.NewWriterSize(f, 65536)
    for i := 0; i < iterations; i++ {
        w.Write(data)
    }
    w.Flush()
    f.Close()
    fmt.Printf("Buffered (64KB): %v\n", time.Since(start))
}

Typical results show unbuffered taking 3-5 seconds, 4KB buffer taking 50-100ms, and 64KB buffer taking 30-50ms.

bufio.Scanner for Text Processing

Scanner is purpose-built for tokenized text input. It handles the common pattern of splitting input by lines, words, or custom delimiters while managing the buffer automatically.

Reading log files line-by-line is cleaner with Scanner than Reader:

func processLogFile(filename string) error {
    f, err := os.Open(filename)
    if err != nil {
        return err
    }
    defer f.Close()

    scanner := bufio.NewScanner(f)
    lineNum := 0
    
    for scanner.Scan() {
        lineNum++
        line := scanner.Text() // No \n included
        
        if strings.Contains(line, "ERROR") {
            fmt.Printf("Line %d: %s\n", lineNum, line)
        }
    }
    
    return scanner.Err() // Check for errors
}
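
Lines are only the default: Scanner ships with other built-in split functions, and bufio.ScanWords tokenizes on whitespace. A small sketch that counts words in a string (countWords is a hypothetical helper):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countWords tokenizes input on whitespace using the built-in
// bufio.ScanWords split function and counts the tokens.
func countWords(input string) (int, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))
	scanner.Split(bufio.ScanWords)

	count := 0
	for scanner.Scan() {
		count++
	}
	return count, scanner.Err()
}

func main() {
	n, _ := countWords("buffered I/O reduces system calls")
	fmt.Println(n) // 5
}
```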

Custom split functions enable parsing structured data:

func scanCSV(data string) {
    scanner := bufio.NewScanner(strings.NewReader(data))
    
    // Custom split function for CSV fields
    scanner.Split(func(data []byte, atEOF bool) (advance int, token []byte, err error) {
        if atEOF && len(data) == 0 {
            return 0, nil, nil
        }
        
        if i := bytes.IndexByte(data, ','); i >= 0 {
            return i + 1, data[0:i], nil
        }
        
        if atEOF {
            return len(data), data, nil
        }
        
        return 0, nil, nil
    })
    
    for scanner.Scan() {
        fmt.Println(scanner.Text())
    }
}

Always check scanner.Err() after the loop—Scan() returns false for both EOF and errors:

scanner := bufio.NewScanner(file)
for scanner.Scan() {
    // Process scanner.Text()
}

if err := scanner.Err(); err != nil {
    log.Fatalf("Scanner error: %v", err)
}

Advanced Patterns and Performance Tuning

Default buffer sizes (4KB for Reader/Writer) work well for most cases, but you can optimize for specific workloads:

// Large buffer for high-throughput scenarios
largeReader := bufio.NewReaderSize(file, 128*1024) // 128KB

// Small buffer for memory-constrained environments
smallWriter := bufio.NewWriterSize(output, 1024) // 1KB

bufio.ReadWriter combines Reader and Writer for bidirectional communication:

func handleConnection(conn net.Conn) {
    defer conn.Close()
    
    rw := bufio.NewReadWriter(
        bufio.NewReader(conn),
        bufio.NewWriter(conn),
    )
    defer rw.Flush()
    
    // Read request
    request, err := rw.ReadString('\n')
    if err != nil {
        return
    }
    
    // Write response
    response := processRequest(request)
    rw.WriteString(response + "\n")
}

For network protocols, consider larger buffers to reduce system calls:

func createNetworkBuffer(conn net.Conn) *bufio.ReadWriter {
    return bufio.NewReadWriter(
        bufio.NewReaderSize(conn, 32*1024),
        bufio.NewWriterSize(conn, 32*1024),
    )
}

Common Pitfalls and Best Practices

The most common mistake is forgetting to flush. Use defer immediately after creating the writer:

func writeData(filename string) error {
    f, err := os.Create(filename)
    if err != nil {
        return err
    }
    defer f.Close()

    w := bufio.NewWriter(f)
    defer w.Flush() // Place this immediately after creation

    // Write operations...
    return nil
}

For large files, be mindful of memory usage. Scanner has a default token size limit of 64KB:

scanner := bufio.NewScanner(file)
buf := make([]byte, 0, 1024*1024) // 1MB initial capacity
scanner.Buffer(buf, 10*1024*1024) // 10MB max token size

Don’t use buffered I/O when you need immediate delivery (like real-time logs or heartbeat messages), or when the writer you’re wrapping already buffers (such as another bufio.Writer): data sits in the user-space buffer, unsent, until it fills or you flush.

// Risky: nothing reaches the network until the buffer fills or Flush runs
conn, _ := net.Dial("tcp", "example.com:80")
buffered := bufio.NewWriter(conn)
buffered.WriteString("GET / HTTP/1.1\r\n") // still in memory

// Good: use conn directly for immediate sends
conn.Write([]byte("GET / HTTP/1.1\r\n"))

Always handle errors properly and check return values:

n, err := writer.WriteString(data)
if err != nil {
    return fmt.Errorf("write failed: %w", err)
}
if n != len(data) {
    return fmt.Errorf("short write: %d of %d bytes", n, len(data))
}

The bufio package is fundamental to efficient I/O in Go. Master these patterns and you’ll write faster, more efficient programs while reducing system resource consumption.
