Fluentd: Log Collection and Forwarding

Key Insights

  • Fluentd creates a unified logging layer that decouples log sources from destinations, eliminating point-to-point integration complexity in distributed systems
  • The tag-based routing system and plugin architecture enable powerful log transformation pipelines without writing custom code
  • Production deployments require careful buffer tuning and aggregator/forwarder patterns to handle high throughput while maintaining reliability

Introduction to Fluentd and the Unified Logging Layer

In distributed systems, logs scatter across dozens or hundreds of services, containers, and hosts. Without centralized collection, debugging production issues becomes archaeological work—SSH-ing into servers, grepping files, correlating timestamps manually. Fluentd solves this by implementing a unified logging layer that sits between your applications and log storage systems.

The unified logging layer concept means applications don’t care where logs go. They write to stdout, files, or send to a local Fluentd agent. Fluentd handles collection, parsing, enrichment, and routing to multiple destinations simultaneously. Change your log storage from Elasticsearch to S3? Update Fluentd config, not application code.

Fluentd’s real power comes from its plugin ecosystem. Over 500 plugins cover virtually every input source and output destination. This flexibility makes it the de facto standard for Kubernetes logging and a core component of the EFK (Elasticsearch, Fluentd, Kibana) stack.

Core Architecture and Data Pipeline

Fluentd processes logs through a five-stage pipeline: Input → Parser → Filter → Buffer → Output. Each stage transforms data incrementally, keeping the architecture clean and debuggable.

Events flow through the system tagged with labels like app.nginx or system.syslog. Tags determine routing—different tags can go to different outputs, or the same tag can fan out to multiple destinations. This tag-based routing eliminates complex conditional logic.
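
To make the routing concrete, here is an illustrative sketch (the tags, hosts, and paths are hypothetical): two match blocks split traffic by tag, and a copy output fans a catch-all tag out to two destinations at once:

# Route by tag: nginx logs and syslog go to different outputs
<match app.nginx>
  @type elasticsearch
  host es.local
  port 9200
</match>

<match system.syslog>
  @type file
  path /var/log/archive/syslog
</match>

# Fan out: anything unmatched goes to both stdout and a file
<match **>
  @type copy
  <store>
    @type stdout
  </store>
  <store>
    @type file
    path /var/log/archive/catchall
  </store>
</match>

Match blocks are evaluated in order, so the catch-all <match **> must come last or it will swallow every event.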

Here’s a basic configuration showing the pipeline structure:

# Input: Tail application logs
<source>
  @type tail
  path /var/log/app/*.log
  pos_file /var/log/fluentd/app.pos
  tag app.backend
  <parse>
    @type json
  </parse>
</source>

# Filter: Add metadata
<filter app.**>
  @type record_transformer
  <record>
    hostname ${hostname}
    environment production
  </record>
</filter>

# Output: Send to Elasticsearch
<match app.**>
  @type elasticsearch
  host elasticsearch.local
  port 9200
  index_name fluentd
  type_name log
</match>

This configuration tails JSON logs, adds hostname and environment fields to each record, then sends everything to Elasticsearch. The ** wildcard in filter and match directives covers every tag under the app. prefix, such as app.backend.

Input Plugins and Log Collection

Fluentd supports diverse input sources. The tail plugin monitors log files like tail -f, maintaining position files to survive restarts. The forward plugin receives logs from other Fluentd instances, creating aggregation hierarchies. HTTP inputs let applications POST logs directly.

Tailing application log files:

<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/fluentd/nginx.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>

Receiving logs from Docker containers:

<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

Configure Docker to use the Fluentd logging driver:

docker run --log-driver=fluentd \
  --log-opt fluentd-address=localhost:24224 \
  --log-opt tag=docker.{{.Name}} \
  myapp

HTTP endpoint for application logging:

<source>
  @type http
  port 8888
  bind 0.0.0.0
  body_size_limit 32m
  keepalive_timeout 10s
  <parse>
    @type json
  </parse>
</source>

Applications can now POST JSON logs:

curl -X POST -H 'Content-Type: application/json' \
  -d '{"event":"user_login","user_id":12345}' \
  http://localhost:8888/app.events

Filters and Data Transformation

Filters transform events in-flight. The parser filter extracts structured data from unstructured logs. The record_transformer adds, removes, or modifies fields. The grep filter includes or excludes events based on patterns.

Parsing complex log formats:

<filter app.backend>
  @type parser
  key_name message
  <parse>
    @type regexp
    expression /^\[(?<timestamp>[^\]]+)\] (?<level>\w+): (?<message>.*)$/
    time_key timestamp
    time_format %Y-%m-%d %H:%M:%S
  </parse>
</filter>

Adding metadata and computed fields:

<filter app.**>
  @type record_transformer
  enable_ruby true
  <record>
    hostname "#{Socket.gethostname}"
    environment "#{ENV['APP_ENV']}"
    timestamp ${time.strftime('%Y-%m-%dT%H:%M:%S%z')}
    log_size ${record.to_json.bytesize}
  </record>
</filter>

Filtering sensitive information:

<filter app.**>
  @type grep
  <exclude>
    key message
    pattern /(password|credit_card|ssn)=/i
  </exclude>
</filter>

<filter app.**>
  @type record_modifier
  remove_keys password, api_key, secret
</filter>

Output Plugins and Destinations

Output plugins send processed logs to storage systems. Buffer configuration at the output level controls reliability versus performance tradeoffs.

Elasticsearch with daily indices:

<match app.**>
  @type elasticsearch
  host elasticsearch.local
  port 9200
  index_name app-logs-%Y.%m.%d
  type_name _doc
  
  <buffer tag, time>
    @type file
    path /var/log/fluentd/buffer/elasticsearch
    timekey 3600
    timekey_wait 10m
    flush_mode interval
    flush_interval 30s
    retry_max_interval 30
    retry_forever true
  </buffer>
</match>

S3 with time-sliced partitioning:

<match app.**>
  @type s3
  aws_key_id YOUR_AWS_KEY
  aws_sec_key YOUR_AWS_SECRET
  s3_bucket my-app-logs
  s3_region us-east-1
  path logs/%Y/%m/%d/
  
  <buffer tag, time>
    @type file
    path /var/log/fluentd/buffer/s3
    timekey 3600
    timekey_wait 10m
    chunk_limit_size 256m
  </buffer>
  
  <format>
    @type json
  </format>
</match>

Multiple outputs for redundancy:

<match app.**>
  @type copy
  
  <store>
    @type elasticsearch
    host es-primary.local
    port 9200
  </store>
  
  <store>
    @type s3
    s3_bucket backup-logs
    s3_region us-east-1
  </store>
  
  <store>
    @type kafka2
    brokers kafka1:9092,kafka2:9092
    topic_key topic
    default_topic app-logs
  </store>
</match>

Production Deployment Patterns

Production Fluentd deployments typically use an aggregator/forwarder architecture. Lightweight forwarders run on each host, collecting logs locally with minimal resource usage. They forward to aggregator instances that handle heavy processing, filtering, and output to storage.

This pattern provides several benefits: forwarders restart quickly without losing data, aggregators can scale independently, and you centralize expensive operations like parsing and enrichment.

Docker Compose aggregator setup:

version: '3'
services:
  fluentd-aggregator:
    image: fluent/fluentd:v1.16-1
    ports:
      - "24224:24224"
      - "24224:24224/udp"
    volumes:
      - ./fluentd/aggregator.conf:/fluentd/etc/fluent.conf
      - ./fluentd/buffer:/var/log/fluentd/buffer
    environment:
      - FLUENTD_CONF=fluent.conf
    deploy:
      replicas: 2
      resources:
        limits:
          memory: 2G
        reservations:
          memory: 1G

  fluentd-forwarder:
    image: fluent/fluentd:v1.16-1
    volumes:
      - ./fluentd/forwarder.conf:/fluentd/etc/fluent.conf
      - /var/log:/var/log:ro
    environment:
      - FLUENTD_CONF=fluent.conf
      - AGGREGATOR_HOST=fluentd-aggregator
    depends_on:
      - fluentd-aggregator

Forwarder configuration:

<source>
  @type tail
  path /var/log/app/*.log
  pos_file /var/log/fluentd/app.pos
  tag app.logs
  <parse>
    @type json
  </parse>
</source>

<match **>
  @type forward
  <server>
    host fluentd-aggregator
    port 24224
  </server>
  <buffer>
    @type file
    path /var/log/fluentd/buffer/forward
    flush_interval 5s
    retry_max_interval 30
  </buffer>
</match>
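
The aggregator side might look like the following (a sketch of the aggregator.conf referenced in the Compose file; the Elasticsearch host is assumed). It receives forwarded events, performs the heavier enrichment, and writes to storage:

# Receive events from forwarders
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

# Centralize enrichment on the aggregator, not the forwarders
<filter app.**>
  @type record_transformer
  <record>
    hostname ${hostname}
  </record>
</filter>

# Durable output with a file buffer
<match app.**>
  @type elasticsearch
  host elasticsearch.local
  port 9200
  <buffer>
    @type file
    path /var/log/fluentd/buffer/es
    flush_interval 30s
  </buffer>
</match>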

Performance Tuning and Best Practices

Fluentd performance depends primarily on buffer configuration and worker threads. Buffers absorb traffic spikes and enable batch processing. File-based buffers survive restarts but add I/O overhead. Memory buffers are faster but risk data loss.

High-throughput configuration:

<system>
  workers 4
  root_dir /var/log/fluentd
</system>

<source>
  @type forward
  port 24224
</source>

<match app.**>
  @type elasticsearch
  host elasticsearch.local
  port 9200
  
  <buffer tag>
    @type file
    path /var/log/fluentd/buffer/es
    
    # Buffer sizing
    chunk_limit_size 8MB
    queue_limit_length 512
    
    # Flush behavior
    flush_mode interval
    flush_interval 10s
    flush_thread_count 4
    
    # Retry behavior
    retry_type exponential_backoff
    retry_timeout 72h
    retry_max_interval 30
    retry_forever false
    
    # Overflow behavior
    overflow_action drop_oldest_chunk
  </buffer>
</match>
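
By contrast, a pipeline that favors latency over durability can use a memory buffer on the same output. This is a sketch under that assumption; any chunks not yet flushed are lost if the process dies:

<match app.**>
  @type elasticsearch
  host elasticsearch.local
  port 9200

  <buffer tag>
    # Memory buffer: faster flushes, no restart survival
    @type memory
    chunk_limit_size 8MB
    total_limit_size 512MB
    flush_interval 5s
    flush_thread_count 4
    # Apply backpressure instead of dropping data when full
    overflow_action block
  </buffer>
</match>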

Key tuning parameters:

  • workers: Multiply throughput by running parallel pipelines. Set to number of CPU cores.
  • chunk_limit_size: Larger chunks reduce overhead but increase memory usage. 8-16MB works well.
  • queue_limit_length: How many chunks to buffer. Calculate from expected traffic and your acceptable data-loss window (e.g., 8 MB chunks × 512 chunks ≈ 4 GB of worst-case on-disk buffer). In the v1 buffer API, total_limit_size is the preferred way to cap total buffer size.
  • flush_thread_count: Parallel output threads. Match to downstream system’s concurrency limits.

Common pitfalls:

  • Forgetting pos_file in tail inputs causes logs to be re-processed after a restart
  • Insufficient buffer space causes data loss under load
  • Not monitoring Fluentd’s own metrics (buffer queue length, retry count)
  • Complex regex parsing in high-volume pipelines kills performance—parse at aggregator, not forwarder

Monitor Fluentd using its built-in metrics:

<source>
  @type monitor_agent
  bind 0.0.0.0
  port 24220
</source>

Access metrics at http://localhost:24220/api/plugins.json to track buffer usage, retry counts, and throughput.

Fluentd transforms log chaos into structured, routable data streams. Master its pipeline architecture and buffer tuning, and you’ll build logging infrastructure that scales from dozens to thousands of hosts.
