Linux yq: Command-Line YAML Processing
Key Insights
- yq brings jq-style processing to YAML files, making it essential for managing Kubernetes manifests, CI/CD configs, and infrastructure-as-code
- The tool supports in-place editing, multi-document processing, and complex transformations without writing custom scripts
- Modern yq (v4+) uses a completely different syntax than v3, so always check your version and use yq eval for v4 commands
Introduction to yq
If you’ve worked with JSON on the command line, you’ve likely used jq. For YAML files, yq fills the same role—a lightweight, powerful processor for querying and manipulating structured data without opening an editor or writing Python scripts.
YAML dominates modern infrastructure tooling. Kubernetes manifests, Docker Compose files, Ansible playbooks, GitHub Actions workflows, and Helm charts all use YAML. Being able to programmatically read and modify these files is crucial for automation, CI/CD pipelines, and configuration management at scale.
Installation varies by distribution, but here’s what you need:
# Any Linux distribution (prebuilt amd64 binary from the GitHub releases page)
sudo wget -qO /usr/local/bin/yq https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64
sudo chmod +x /usr/local/bin/yq
# Using snap
sudo snap install yq
# macOS
brew install yq
# Verify installation and check version
yq --version
Critical note: There are multiple tools named yq. This article covers mikefarah/yq (v4+), the most actively maintained and feature-rich option. The Python-based yq and older v3 syntax are incompatible with examples here.
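Because the tools share a binary name, a script can check which one it is about to call before running any expression. The following is a minimal sketch: the function name yq_flavor is hypothetical, and the matched patterns are an assumption based on the version banner that recent mikefarah/yq releases print. Older v4 builds may print a banner that does not mention mikefarah, so adjust the patterns to whatever your build reports.

```shell
#!/bin/sh
# Classify the text printed by `yq --version`.
# Assumption: mikefarah/yq v4 prints a banner containing "mikefarah" and
# "version v4." (or "version 4."); anything else is reported as "other".
yq_flavor() {
  case "$1" in
    *mikefarah*"version v4."* | *mikefarah*"version 4."*) echo "mikefarah-v4" ;;
    *) echo "other" ;;
  esac
}

# Usage: yq_flavor "$(yq --version 2>/dev/null)"
```

If the result is "other", the examples in this article are unlikely to work as written.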
Reading and Querying YAML Files
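Let’s start with a simplified Kubernetes deployment file (a real Deployment nests its containers under .spec.template.spec.containers; the shorter layout keeps the first examples readable):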
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: production
  labels:
    app: web
spec:
  replicas: 3
  containers:
    - name: nginx
      image: nginx:1.21
      ports:
        - containerPort: 80
    - name: sidecar
      image: busybox:latest
Extract specific values using dot notation:
# Get deployment name
yq eval '.metadata.name' deployment.yaml
# Output: web-app
# Get namespace
yq eval '.metadata.namespace' deployment.yaml
# Output: production
# Access nested values
yq eval '.spec.replicas' deployment.yaml
# Output: 3
Array indexing works with square brackets:
# Get first container name
yq eval '.spec.containers[0].name' deployment.yaml
# Output: nginx
# Get second container image
yq eval '.spec.containers[1].image' deployment.yaml
# Output: busybox:latest
# Get all container names (one result per line, not a YAML array)
yq eval '.spec.containers[].name' deployment.yaml
# Output:
# nginx
# sidecar
Filter arrays with select conditions:
# Find containers using nginx image
yq eval '.spec.containers[] | select(.image == "nginx:1.21")' deployment.yaml
# Get names of containers exposing port 80
yq eval '.spec.containers[] | select(.ports[].containerPort == 80) | .name' deployment.yaml
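The pipe operator | chains operations, similar to Unix pipes: each result of the expression on the left becomes the input of the expression on the right. That is how the examples above first iterate over containers, then filter them, then extract a single field.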
Modifying YAML Content
Reading data is useful, but yq shines when modifying configurations programmatically.
Update existing values:
# Change replica count
yq eval '.spec.replicas = 5' deployment.yaml
# Update container image
yq eval '.spec.containers[0].image = "nginx:1.22"' deployment.yaml
# Update with in-place editing (-i flag)
yq eval -i '.spec.replicas = 5' deployment.yaml
The -i flag modifies files directly. Without it, yq prints to stdout, letting you preview changes or redirect to new files.
Add new fields:
# Add a new label
yq eval '.metadata.labels.environment = "prod"' deployment.yaml
# Add annotations object if it doesn't exist
yq eval '.metadata.annotations.deployed-by = "automation"' deployment.yaml
# Add resource limits to first container
yq eval '.spec.containers[0].resources.limits.memory = "512Mi"' deployment.yaml
Delete keys with the del() function:
# Remove a specific label
yq eval 'del(.metadata.labels.app)' deployment.yaml
# Remove entire annotations section
yq eval 'del(.metadata.annotations)' deployment.yaml
# Remove second container from array
yq eval 'del(.spec.containers[1])' deployment.yaml
Working with Multiple YAML Documents
Many YAML files contain multiple documents separated by ---. Kubernetes manifests often bundle related resources this way.
---
apiVersion: v1
kind: Service
metadata:
  name: web-service
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
Process all documents:
# Get all resource kinds
yq eval '.kind' multi-resource.yaml
# Output:
# Service
# Deployment
# Select specific document (0-indexed)
yq eval 'select(document_index == 0)' multi-resource.yaml
Filter documents by content:
# Get only Deployment resources
yq eval 'select(.kind == "Deployment")' multi-resource.yaml
# Extract names of all Services
yq eval 'select(.kind == "Service") | .metadata.name' multi-resource.yaml
Merge multiple YAML files:
# Combine files (maintains document separators)
yq eval-all '.' file1.yaml file2.yaml > combined.yaml
# Merge into single document
yq eval-all 'select(fileIndex == 0) * select(fileIndex == 1)' base.yaml override.yaml
Advanced Operations
Merge objects using the * operator:
# Merge labels from two sources
yq eval '.metadata.labels = .metadata.labels * {"new-label": "value"}' deployment.yaml
Conditional updates with select:
# Update only containers named "nginx"
yq eval '(.spec.containers[] | select(.name == "nginx") | .image) = "nginx:1.23"' deployment.yaml
# Add resource limits only to containers without them
yq eval '.spec.containers[] |= (select(.resources == null) | .resources.limits.cpu = "100m")' deployment.yaml
Map operations over arrays:
# Add a prefix to all container names
yq eval '.spec.containers[].name |= "prod-" + .' deployment.yaml
# Append a digest to every image reference (abc123 is a placeholder, not a real digest)
yq eval '.spec.containers[].image |= . + "@sha256:abc123"' deployment.yaml
String interpolation and variables:
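# Use environment variables (strenv always yields a string; env parses the value as YAML,
# so a tag such as 1.0 would be read as a number)
export NEW_TAG="v2.0.1"
yq eval '.spec.containers[0].image = "myapp:" + strenv(NEW_TAG)' deployment.yaml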
# Use internal variables
yq eval '.version as $v | .metadata.labels.version = $v' config.yaml
Practical Real-World Scenarios
Update Kubernetes deployment image tags (common in CI/CD):
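#!/bin/bash
DEPLOYMENT="deployment.yaml"
# Export the value so yq can read it with strenv(); this avoids escaping quotes inside the expression
export NEW_IMAGE="myregistry.io/app:${CI_COMMIT_SHA}"
yq eval -i '(.spec.template.spec.containers[] | select(.name == "app") | .image) = strenv(NEW_IMAGE)' "$DEPLOYMENT"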
Extract all container images from a namespace export:
# Useful for security audits
yq eval '.items[].spec.template.spec.containers[].image' namespace-export.yaml | sort -u
Modify Helm values files:
# Update multiple values in values.yaml
yq eval -i '
  .image.tag = "v2.1.0" |
  .replicaCount = 5 |
  .ingress.enabled = true
' values.yaml
Validate and format YAML files:
# Pretty-print with consistent indentation
yq eval '.' messy.yaml > formatted.yaml
# Validate YAML syntax (exits with error if invalid)
yq eval '.' config.yaml > /dev/null && echo "Valid YAML"
# Convert YAML to JSON
yq eval -o=json '.' config.yaml > config.json
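The single-file syntax check extends naturally to a batch. The sketch below is one way to do it under stated assumptions: validate_yaml_files is a hypothetical helper name, and the check only confirms that yq can parse each file, not that the content is a valid Kubernetes or Helm schema.

```shell
#!/bin/sh
# Check every file passed as an argument and report the ones yq cannot parse.
# Returns 0 only when all files parsed successfully.
validate_yaml_files() {
  vyf_failed=0
  for vyf_file in "$@"; do
    if yq eval '.' "$vyf_file" >/dev/null 2>&1; then
      echo "ok: $vyf_file"
    else
      echo "invalid: $vyf_file" >&2
      vyf_failed=1
    fi
  done
  return "$vyf_failed"
}

# Usage: validate_yaml_files k8s/*.yaml
```

Because the function returns non-zero on the first bad file it finds while still checking the rest, it can serve as a CI gate that lists every broken file in one run.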
Bulk update across multiple files:
# Update image tag in all deployments
find k8s/ -name "*.yaml" -exec yq eval -i '
(select(.kind == "Deployment") | .spec.template.spec.containers[0].image) |=
sub(":[^:]+$", ":v2.0.0")
' {} \;
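An in-place edit writes every file it is pointed at, including files where the select matches nothing, which may show up as formatting-only changes in unrelated manifests. A more conservative variant, sketched below, first asks yq whether the file contains a Deployment and only then rewrites it. The helper name update_deployment_tags is hypothetical, the loop assumes file names without newlines, and the check relies on the -e (--exit-status) flag making yq exit non-zero when nothing matches.

```shell
#!/bin/sh
# Apply the image-tag rewrite only to files that contain a Deployment.
# Assumption: file names contain no newlines (the loop reads one path per line).
update_deployment_tags() {
  find "$1" -type f -name '*.yaml' | while IFS= read -r udt_file; do
    # -e (--exit-status) makes yq exit non-zero when the select matches nothing.
    if yq eval -e 'select(.kind == "Deployment")' "$udt_file" >/dev/null 2>&1; then
      yq eval -i '
        (select(.kind == "Deployment") | .spec.template.spec.containers[0].image) |=
          sub(":[^:]+$", ":v2.0.0")
      ' "$udt_file"
      echo "updated $udt_file"
    fi
  done
}

# Usage: update_deployment_tags k8s/
```

The printed list of updated files doubles as a record of what the run touched.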
Tips and Best Practices
Chain operations efficiently:
# Multiple updates in one command
yq eval -i '
  .metadata.labels.version = "2.0" |
  .spec.replicas = 3 |
  del(.metadata.annotations.old-annotation)
' deployment.yaml
Use // for default values:
# Set replicas to 3 if the key is missing or null (// falls back when the left side is null or false)
yq eval '.spec.replicas = (.spec.replicas // 3)' deployment.yaml
Output formatting options:
# JSON output
yq eval -o=json '.' file.yaml
# Pretty-printed YAML (-P resets custom styles such as flow collections and quoting)
yq eval -o=yaml -P '.' file.yaml
# Custom indentation
yq eval --indent 4 '.' file.yaml
Test expressions before in-place edits:
# Always preview first
yq eval '.spec.replicas = 10' deployment.yaml
# Then apply if correct
yq eval -i '.spec.replicas = 10' deployment.yaml
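Reading the full output to spot one changed line gets tedious on large manifests. One option is to diff the would-be result against the current file, as in the sketch below. The helper name yq_preview is hypothetical, and the sketch assumes mktemp and diff are available. Note that yq may also normalize some formatting, so the diff can include lines beyond the intended change.

```shell
#!/bin/sh
# Show a unified diff of what an expression would change, without editing the file.
# Exit status follows diff: 0 = no change, 1 = changes; 2 = yq or mktemp failed.
yq_preview() {
  yp_tmp=$(mktemp) || return 2
  if ! yq eval "$1" "$2" > "$yp_tmp"; then
    rm -f "$yp_tmp"
    return 2
  fi
  yp_status=0
  diff -u "$2" "$yp_tmp" || yp_status=$?
  rm -f "$yp_tmp"
  return "$yp_status"
}

# Usage: yq_preview '.spec.replicas = 10' deployment.yaml
```

Once the diff shows only the intended change, the same expression can be rerun with -i.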
Handle errors gracefully in scripts:
#!/bin/bash
set -e # Exit on error
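# -e (--exit-status) makes yq fail when the result is null or missing;
# without it, a missing key prints "null" and still exits 0
if ! yq eval -e '.metadata.name' deployment.yaml > /dev/null 2>&1; then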
echo "Error: Invalid YAML or missing .metadata.name"
exit 1
fi
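When several required keys need checking, the test can be wrapped in a small function. This is a sketch rather than a canonical pattern: require_key is a hypothetical name, and it relies on yq's -e (--exit-status) flag so that a missing or null value counts as a failure instead of printing "null" with a zero exit code.

```shell
#!/bin/sh
# Print the value at a path, or fail if the file does not parse or the
# path resolves to null/false. Relies on yq's -e (--exit-status) flag.
require_key() {
  if ! rk_value=$(yq eval -e "$1" "$2" 2>/dev/null); then
    echo "Error: $2 is invalid YAML or has no value at $1" >&2
    return 1
  fi
  printf '%s\n' "$rk_value"
}

# Usage: name=$(require_key '.metadata.name' deployment.yaml) || exit 1
```

Capturing the value and checking the exit status in one step keeps scripts from silently continuing with the literal string "null".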
Use comments for complex expressions:
# Inline # comments inside an expression may not parse in every yq release,
# so keep the notes in the surrounding script instead:
# 1. Update the image tag
# 2. Increase replicas
yq eval '
  .spec.containers[0].image = "nginx:1.23" |
  .spec.replicas = 5
' deployment.yaml
Master yq and you’ll handle YAML manipulation faster than opening an editor. It’s indispensable for GitOps workflows, automated deployments, and infrastructure automation. The learning curve is worth it—you’ll use these patterns daily in modern DevOps environments.