Linux jq: Command-Line JSON Processing

Key Insights

  • jq is a command-line JSON processor that eliminates the need for writing custom parsing scripts, offering a domain-specific language optimized for JSON manipulation with pipe-based composition
  • The tool excels at filtering, transforming, and reshaping JSON data through a concise syntax that combines dot notation, array operations, and functional programming patterns
  • Real-world workflows benefit from jq’s ability to integrate seamlessly with curl, APIs, and shell scripts, making it indispensable for DevOps, API testing, and log analysis

Introduction to jq

If you’re working with JSON data on the command line—and as a modern developer, you almost certainly are—jq is non-negotiable. This lightweight processor transforms JSON manipulation from a tedious scripting exercise into elegant one-liners. Whether you’re parsing API responses, analyzing log files, or wrangling configuration data, jq provides a specialized language designed specifically for JSON operations.

Unlike general-purpose tools like awk or sed, jq understands JSON’s structure natively. It handles nested objects, arrays, and type conversions without the fragile string manipulation that plagues regex-based approaches. Install it via your package manager (apt install jq, brew install jq, or yum install jq) and you’re ready to process JSON with precision.

Here’s the simplest use case—pretty-printing JSON:

# Pretty-print JSON from a file
jq '.' data.json

# Pretty-print API response
curl -s https://api.github.com/users/torvalds | jq '.'

The . operator is jq’s identity filter: it passes the input through unchanged, and since jq pretty-prints all output by default, this alone makes jq valuable for reading minified JSON. But we’re just getting started.

Basic JSON Querying and Filtering

jq’s syntax centers on filters that select and transform data. The dot operator accesses object properties, while brackets handle array indexing.

# Given this JSON in user.json:
# {"name": "Alice", "age": 30, "email": "alice@example.com"}

# Extract a single field
jq '.name' user.json
# Output: "Alice"

# Access nested properties
# Given: {"user": {"profile": {"city": "Seattle"}}}
jq '.user.profile.city' nested.json
# Output: "Seattle"

# Array indexing (zero-based)
# Given: {"items": ["apple", "banana", "cherry"]}
jq '.items[1]' array.json
# Output: "banana"

# Get array length
jq '.items | length' array.json
# Output: 3

The pipe operator (|) chains filters, passing output from one operation as input to the next. This functional composition is fundamental to jq’s power.
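
Pipes can chain any number of filters. A sketch with inline data (the values here are made up for illustration):

```shell
# Chain three filters: extract the array, uppercase each element, join into one string
echo '{"items":["apple","banana"]}' | jq -r '.items | map(ascii_upcase) | join(", ")'
# Output: APPLE, BANANA
```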

For arrays of objects, you can iterate and extract fields:

# Given: [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]
jq '.[].name' users.json
# Output:
# "Alice"
# "Bob"

# Or collect into an array
jq '[.[].name]' users.json
# Output: ["Alice", "Bob"]

Filtering and Selecting Data

The select() function filters data based on conditions. This is where jq separates itself from simple field extraction.

# Filter users older than 28
jq '.[] | select(.age > 28)' users.json

# Multiple conditions with and/or
jq '.[] | select(.age > 25 and .active == true)' users.json

# Filter by string matching
jq '.[] | select(.email | contains("@company.com"))' users.json

# Exclude null values
jq '.[] | select(.phone != null)' users.json

You can combine selection with field extraction:

# Get names of active users only
jq '[.[] | select(.active == true) | .name]' users.json

# Filter and reshape
jq '.[] | select(.score > 80) | {name, score}' results.json

The {name, score} syntax creates a new object with only those fields—a shorthand for {name: .name, score: .score}.
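
Both forms produce identical output; a quick check with inline, made-up data:

```shell
# Shorthand object construction drops the fields not listed
echo '{"name":"Ann","score":91,"id":7}' | jq -c '{name, score}'
# Output: {"name":"Ann","score":91}

# Equivalent explicit form
echo '{"name":"Ann","score":91,"id":7}' | jq -c '{name: .name, score: .score}'
# Output: {"name":"Ann","score":91}
```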

Transforming and Mapping Data

The map() function applies a filter to each array element, returning a new array.

# Transform array elements
echo '[1,2,3,4,5]' | jq 'map(. * 2)'
# Output: [2,4,6,8,10]

# Extract specific fields from object arrays
jq 'map({name, email})' users.json

# Compute derived values
jq 'map({name, age, decade: (.age / 10 | floor)})' users.json

Reshaping JSON is common when interfacing between different APIs or systems:

# Original format: {"firstName": "Alice", "lastName": "Smith"}
# Target format: {"fullName": "Alice Smith", "display": "Smith, Alice"}

jq '{
  fullName: (.firstName + " " + .lastName),
  display: (.lastName + ", " + .firstName)
}' input.json

Array construction with conditions:

# Build array of specific values
jq '[.[] | select(.status == "active") | .id]' items.json

# Flatten nested arrays
jq '[.[] | .tags[]] | unique' posts.json

Advanced jq Techniques

For complex transformations, jq offers reduce, group_by, and custom functions.

# Group objects by a field
jq 'group_by(.department)' employees.json

# Group and count
jq 'group_by(.status) | map({status: .[0].status, count: length})' tickets.json

# Sum values with reduce
jq '[.[] | .amount] | add' transactions.json

# Or explicitly with reduce
jq 'reduce .[] as $item (0; . + $item.amount)' transactions.json

Variables make complex queries readable:

# Calculate percentage
jq '.[] | . as $item | {
  name: $item.name,
  percentage: (($item.score / $item.total) * 100)
}' results.json
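
For values coming from the shell, the --arg and --argjson flags bind variables without fragile string splicing; --arg always binds a string, while --argjson parses the value as JSON. A sketch with illustrative data and variable names:

```shell
# Pass a shell value into the filter as a typed jq variable
min_age=28
echo '[{"name":"Alice","age":30},{"name":"Bob","age":25}]' | \
  jq --argjson min "$min_age" '.[] | select(.age > $min) | .name'
# Output: "Alice"
```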

Define custom functions for reusable logic:

# Define and use a function
jq 'def double: . * 2; map(double)' numbers.json

# Function with parameters
jq 'def multiply(n): . * n; map(multiply(3))' numbers.json

Real-World Use Cases

Processing GitHub API Responses:

# Get repository names and star counts
curl -s "https://api.github.com/users/microsoft/repos?per_page=5" | \
  jq '.[] | {name, stars: .stargazers_count, language}' | \
  jq -s 'sort_by(-.stars)'

Extracting Error Messages from Application Logs:

# Parse JSON logs and extract errors
cat app.log | \
  jq -r 'select(.level == "ERROR") | "\(.timestamp) - \(.message)"'

# Count errors by type
cat app.log | \
  jq -r 'select(.level == "ERROR") | .error_type' | \
  sort | uniq -c | sort -rn
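
The \(...) syntax in the filter above is jq’s string interpolation, which embeds any filter’s result inside a string. A minimal standalone example with made-up data:

```shell
# Interpolate fields into a raw string (-r drops the surrounding quotes)
echo '{"name":"Alice","age":30}' | jq -r '"\(.name) is \(.age) years old"'
# Output: Alice is 30 years old
```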

Transforming Kubernetes JSON Output:

# Get pod names and statuses
kubectl get pods -o json | \
  jq '.items[] | {
    name: .metadata.name,
    status: .status.phase,
    restarts: .status.containerStatuses[0].restartCount
  }'

# Find pods with high restart counts
kubectl get pods -o json | \
  jq '.items[] | select(.status.containerStatuses[0].restartCount > 5) | .metadata.name'

CI/CD Pipeline Integration:

# Extract deployment version from package.json
VERSION=$(jq -r '.version' package.json)
echo "Deploying version $VERSION"

# Validate configuration
jq -e '.environment == "production"' config.json || {
  echo "Not production config"
  exit 1
}

Best Practices and Tips

Use raw output (-r) for shell integration: The -r flag outputs raw strings without JSON quotes, essential for piping to other commands.

# Wrong: outputs "value" with quotes
jq '.field' data.json

# Right: outputs value without quotes
jq -r '.field' data.json

Compact vs. pretty output: Use -c for compact (one line per object), useful for logging or further processing.

# Compact output for grep
jq -c '.[]' data.json | grep "searchterm"

Debugging complex queries: Build queries incrementally. Use --tab for readable output during development.

# Start simple
jq '.' data.json

# Add filtering
jq '.[]' data.json

# Add selection
jq '.[] | select(.active)' data.json

# Add transformation
jq '.[] | select(.active) | {name, email}' data.json
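
Another option while iterating is jq’s built-in debug filter, which passes each value through unchanged while printing it to stderr, so you can see intermediate values without disturbing the output:

```shell
# debug prints each value to stderr as ["DEBUG:",<value>] and passes it along
echo '[1,2,3]' | jq -c 'map(. * 2 | debug)'
# stdout: [2,4,6]
# stderr: ["DEBUG:",2] ["DEBUG:",4] ["DEBUG:",6] (one per line)
```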

Combining with shell tools:

# Count unique values
jq -r '.[].status' data.json | sort | uniq -c

# Pre-filter with grep before jq (only safe when each line
# of the file is a complete JSON document, i.e. JSONL)
grep "important" large-file.json | jq '.'

# Process multiple files
for file in *.json; do
  echo "Processing $file"
  jq '.[] | select(.type == "critical")' "$file"
done

Performance considerations: For massive files, use streaming mode (--stream) or line-delimited JSON (JSONL) so jq never holds the whole document in memory; reach for --slurp (-s) only when you truly need the entire input as a single array, since it loads everything at once.

# Efficient: processes line by line
cat huge.jsonl | jq -c 'select(.important)'

# Inefficient: loads entire file
jq '.[] | select(.important)' huge.json
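
Streaming mode parses the input as a sequence of [path, value] events rather than building the whole document in memory. The standard idiom for recovering each top-level array element from those events is fromstream with truncate_stream (huge.json here stands in for the large file above):

```shell
# Reassemble each top-level array element from the event stream,
# one compact JSON document per line
jq -cn --stream 'fromstream(1 | truncate_stream(inputs))' huge.json
```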

Error handling in scripts:

# Fail the script on invalid JSON (with -e, a null or false result also exits nonzero)
jq -e '.' data.json || {
  echo "Invalid JSON"
  exit 1
}

# Provide default values
jq -r '.field // "default"' data.json
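
The ? operator complements //: instead of substituting a default, it suppresses the error a filter would raise, which helps when a path may not exist or a value may have the wrong type. A sketch with made-up data:

```shell
# Without ?, indexing the string "oops" raises an error; with ?, it is skipped
echo '[{"phone":"555-0100"},{"phone":null},"oops"]' | jq -c '.[] | .phone?'
# Output:
# "555-0100"
# null
```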

jq transforms JSON processing from a programming task into a command-line operation. Master its syntax, and you’ll handle API responses, configuration files, and log analysis with the same ease you use grep for text. The investment in learning jq’s functional approach pays dividends in every shell session.
