Linux Process Substitution: <() and >()
Key Insights
- Process substitution treats command output as a file, using <(command) for reading and >(command) for writing, eliminating the need for temporary files in multi-input scenarios
- Unlike pipes, which connect stdout to stdin linearly, process substitution creates named file descriptors that let you pass multiple command outputs to programs expecting file arguments
- Under the hood, the shell creates /dev/fd/ file descriptors or named pipes, making this feature bash/zsh-specific and unavailable in POSIX sh scripts
What Process Substitution Actually Does
Process substitution is one of those shell features that seems esoteric until you need it—then it becomes indispensable. At its core, process substitution allows you to use command output where a filename is expected. The shell runs your command in the background and presents it as a file descriptor that other commands can read from or write to.
Let’s compare three different approaches to working with command output:
# Pipe: Linear connection, stdout → stdin
ls /tmp | grep "log"
# Command substitution: Captures output as a string
files=$(ls /tmp)
echo $files
# Process substitution: Treats output as a file
diff <(ls /tmp) <(ls /var/tmp)
Pipes work great for linear data flow, and command substitution excels at capturing output into variables. But what if you need to compare two directory listings with diff? The diff command expects two file arguments, not piped input. You could create temporary files, but that’s clunky:
# The old way: temporary files
ls /tmp > /tmp/list1
ls /var/tmp > /tmp/list2
diff /tmp/list1 /tmp/list2
rm /tmp/list1 /tmp/list2
# The elegant way: process substitution
diff <(ls /tmp) <(ls /var/tmp)
Process substitution eliminates the temporary file dance entirely.
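A quick way to convince yourself that diff really does see files is to feed it two inline streams (printf stands in here for the directory listings):

```shell
# diff receives two /dev/fd/ paths and compares them like ordinary files;
# it reports line 2 differing between the streams
diff <(printf 'a\nb\n') <(printf 'a\nc\n')
```

Because the streams differ, diff exits with status 1, just as it would for two real files.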
Reading from Processes with <()
The <(command) syntax creates a file descriptor that reads from the command’s output. When the shell encounters this, it starts the command asynchronously and substitutes a path like /dev/fd/63 in its place.
Here’s a practical example comparing sorted user lists from two systems:
# Compare users on two different servers
diff <(ssh server1 'cut -d: -f1 /etc/passwd | sort') \
<(ssh server2 'cut -d: -f1 /etc/passwd | sort')
This runs both SSH commands in parallel and feeds their outputs to diff as if they were files. No temporary files, no sequential execution—just clean, efficient comparison.
Process substitution shines with commands that need multiple file inputs. The join command, which merges sorted files on a common field, is a perfect candidate:
# Join data from two different sources
join <(sort users.txt) <(sort permissions.txt)
# Compare package installations across systems
comm <(ssh prod 'rpm -qa | sort') <(ssh staging 'rpm -qa | sort')
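The SSH examples above need live hosts, so here is a self-contained sketch of the same comm pattern with made-up package lists:

```shell
# comm requires sorted input; -12 suppresses the first two columns,
# leaving only lines present in both streams
comm -12 <(printf 'bash\ncurl\ngit\n') <(printf 'curl\ngit\nvim\n')
# prints:
# curl
# git
```

Swap the printf calls for the `ssh … 'rpm -qa | sort'` pipelines and the structure is identical.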
You can also use it with while read loops when you need to process command output line by line while maintaining the current shell’s scope:
# This preserves variables in the current shell
while IFS= read -r line; do
((count++))
echo "Line $count: $line"
done < <(find /var/log -name "*.log")
echo "Total files: $count" # This works!
# Compare with a pipe (creates a subshell)
find /var/log -name "*.log" | while IFS= read -r line; do
((count++))
done
echo "Total files: $count" # count is unchanged - the increments happened in a subshell
The difference is crucial: pipes create subshells, so variable assignments don’t persist. Process substitution with redirection keeps everything in the current shell.
Writing to Processes with >()
The >(command) syntax creates a file descriptor that writes to the command’s input. This is incredibly useful for splitting output streams or processing data in parallel.
Here’s how to send output to multiple destinations simultaneously:
# Write to both a compressed file and a processing pipeline
# (a single stdout can't be redirected twice, so tee does the fan-out)
some_command | tee >(gzip > output.gz) >(grep "ERROR" > errors.log) > /dev/null
# Monitor and log simultaneously
tail -f /var/log/app.log | tee >(grep "ERROR" > errors.log) \
>(grep "WARN" > warnings.log)
A common pattern is compressing output on the fly without temporary files:
# Backup a directory and compress in one step
tar cf - /important/data | tee >(sha256sum > backup.sha256) \
| gzip > backup.tar.gz
# Download and process simultaneously
curl https://api.example.com/data | tee >(jq '.errors' > errors.json) \
>(jq '.warnings' > warnings.json) \
| jq '.results' > results.json
This approach processes data once but routes it to multiple destinations, each potentially transforming it differently.
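A self-contained version of the fan-out pattern looks like this (the log lines are invented; note that the >() writers run asynchronously, so we pause briefly before reading their output):

```shell
tmpdir=$(mktemp -d)

# tee duplicates the stream into both >() writers; the passthrough
# copy we don't need goes to /dev/null
printf 'ERROR disk full\nINFO started\nWARN slow query\n' \
  | tee >(grep ERROR > "$tmpdir/errors.log") \
        >(grep WARN  > "$tmpdir/warnings.log") > /dev/null

sleep 1   # the substituted processes may still be flushing
cat "$tmpdir/errors.log"     # ERROR disk full
cat "$tmpdir/warnings.log"   # WARN slow query
rm -r "$tmpdir"
```

The sleep is crude but illustrates a real property: the shell does not wait for >() processes, a pitfall revisited below.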
Real-World Applications
Process substitution excels in scenarios where you’re comparing, joining, or processing multiple data streams. Here are patterns I use regularly:
Comparing configuration files across environments:
# Check for configuration drift
diff <(ssh prod 'grep -v "^#" /etc/app/config.yml | sort') \
     <(ssh staging 'grep -v "^#" /etc/app/config.yml | sort')
Combining outputs side-by-side with paste:
# Show CPU and memory usage side by side
paste <(ps aux | awk '{print $3}' | tail -n +2) \
<(ps aux | awk '{print $4}' | tail -n +2) \
| awk '{printf "CPU: %s%% MEM: %s%%\n", $1, $2}'
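As with the ps example, the mechanics are easier to see with fixed input (seq and printf stand in for the ps pipelines):

```shell
# paste reads one line from each /dev/fd/ stream and
# joins the pair with a tab
paste <(seq 1 3) <(printf 'a\nb\nc\n')
# prints:
# 1	a
# 2	b
# 3	c
```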
Processing logs from multiple sources:
# Analyze error rates across multiple log files
grep "ERROR" <(tail -n 1000 /var/log/app1.log) \
<(tail -n 1000 /var/log/app2.log) \
<(ssh remote 'tail -n 1000 /var/log/app3.log') \
| awk '{print $1}' | sort | uniq -c
Creating complex data pipelines:
# Compare active connections to expected services
comm -23 <(netstat -tuln | awk '{print $4}' | sort | uniq) \
<(sort expected_ports.txt)
Under the Hood: How It Actually Works
When you use process substitution, the shell creates either a named pipe (FIFO) or uses /dev/fd/ file descriptors, depending on your system. You can see what gets created:
# This shows the actual file descriptor path
echo <(ls)
# Output: /dev/fd/63
# You can see it's not a regular file
ls -l <(echo "test")
# Output: lr-x------ 1 user user 64 Nov 15 10:30 /dev/fd/63 -> pipe:[12345]
The shell spawns the command in a subshell, creates the file descriptor, and substitutes the path before executing the main command. The subshell’s output connects to the file descriptor, which the main command reads from (or writes to).
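On Linux you can confirm the substituted path is pipe-backed with bash's -p test (this relies on the /dev/fd/ entry resolving to the underlying pipe, so treat it as a Linux-specific check):

```shell
# [[ -p path ]] is true when the path resolves to a FIFO/pipe
if [[ -p <(true) ]]; then
  echo "process substitution is backed by a pipe"
fi
```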
This is bash and zsh specific. POSIX sh doesn’t support process substitution, which matters for portable scripts:
#!/bin/bash
# This works in bash/zsh
diff <(ls dir1) <(ls dir2)
#!/bin/sh
# This will fail in POSIX sh
diff <(ls dir1) <(ls dir2) # Syntax error
Pitfalls and Limitations
Process substitution has limitations you need to understand. The most important: the file descriptor isn't seekable. Commands that read their input once, front to back (grep, sort, wc), are fine, but commands that need random access won't work. unzip, for example, must seek to the central directory at the end of the archive:
# This fails - unzip can't seek within a pipe
unzip <(curl -s https://example.com/archive.zip)
# Download to a real file instead
curl -s -o /tmp/archive.zip https://example.com/archive.zip
unzip /tmp/archive.zip
Timing and scope issues can bite you. Assigning a process substitution to a variable stores only the path string, and the shell may close the underlying file descriptor as soon as the assignment completes:
# This stores the literal path (e.g. /dev/fd/63), not the output,
# and the descriptor may already be closed when you use it
result=<(long_running_command)
# To capture output in a variable, use command substitution instead
result=$(long_running_command)
Error handling is trickier because the substituted process runs in a subshell. If it fails, you might not notice:
# If the grep finds nothing, diff still runs
diff <(grep "pattern" file1.txt) <(grep "pattern" file2.txt)
# Better: check explicitly
if ! grep -q "pattern" file1.txt || ! grep -q "pattern" file2.txt; then
echo "Pattern not found in one or both files"
exit 1
fi
diff <(grep "pattern" file1.txt) <(grep "pattern" file2.txt)
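You can see the swallowed exit status directly: a failing command inside <() just produces an empty stream, and the outer command's status is all the shell reports (false stands in for a failing grep):

```shell
# Both substituted commands fail, but diff only sees two
# empty streams, finds them identical, and exits 0
diff <(false) <(false)
echo "diff exit status: $?"   # prints 0 - the failures are invisible
```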
Some commands don’t handle /dev/fd/ paths well, particularly older utilities or those that check file types. When in doubt, test or fall back to temporary files.
When to Use Process Substitution
Use process substitution when you need to:
- Pass multiple command outputs to a program expecting file arguments (diff, join, comm, paste)
- Avoid temporary files for intermediate data
- Process data streams in parallel
- Maintain variable scope with while read loops
Avoid it when you need:
- POSIX shell compatibility
- Seekable files or multiple passes over data
- Simple linear pipelines (use regular pipes instead)
- Maximum portability across systems
The syntax is simple: <(command) creates a readable file descriptor, >(command) creates a writable one. Master these, and you’ll write cleaner, more efficient shell scripts that avoid the temporary file antipattern.