Bash Arrays: Indexed and Associative Arrays
Key Insights
- Bash supports indexed arrays (integer keys) in all versions and associative arrays (string keys) starting from Bash 4.0, each serving distinct use cases in shell scripting
- Indexed arrays use zero-based indexing and can be sparse, while associative arrays require explicit declaration with declare -A and function like hash maps
- Proper quoting with "${array[@]}" preserves individual elements during iteration and prevents word splitting, a critical detail most beginners overlook
Introduction to Bash Arrays
Arrays in Bash transform how you handle collections of data in shell scripts. Without arrays, managing multiple related values means juggling individual variables or parsing delimited strings—both error-prone approaches that don’t scale. Arrays provide structured storage that makes your scripts more maintainable and less fragile.
Bash offers two array types: indexed arrays (available in all Bash versions) and associative arrays (Bash 4.0+). Indexed arrays use integer keys starting from zero, perfect for ordered lists and sequential data. Associative arrays use string keys, functioning as hash maps for configuration data, lookup tables, and any scenario requiring named access.
Here’s a quick comparison:
# Indexed array - integer keys
servers=(web1 web2 web3)
echo "${servers[0]}" # web1
# Associative array - string keys
declare -A config
config[host]="localhost"
config[port]="8080"
echo "${config[host]}" # localhost
Check your Bash version with bash --version. If you’re below 4.0, you’re limited to indexed arrays.
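A script that depends on associative arrays can check the version at startup instead of failing later with a confusing declare error. Here is a minimal sketch using the built-in BASH_VERSINFO array; the variable name assoc_supported is only illustrative:

```bash
#!/usr/bin/env bash
# BASH_VERSINFO is a read-only indexed array; element 0 is the major version.
if (( BASH_VERSINFO[0] >= 4 )); then
    assoc_supported=yes
else
    assoc_supported=no
fi
echo "Bash ${BASH_VERSION}: associative arrays supported? ${assoc_supported}"
```

In a real script you would probably print an error and exit in the else branch rather than set a flag.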
Indexed Arrays: Basics and Operations
Indexed arrays offer multiple declaration methods. Choose based on your initialization needs:
# Method 1: Explicit declaration (optional but clear)
declare -a fruits
# Method 2: Direct assignment
colors[0]="red"
colors[1]="blue"
# Method 3: Parentheses syntax (most common)
numbers=(10 20 30 40 50)
# Method 4: Command substitution
files=($(ls *.txt))
# Method 5: Reading from input
readarray -t lines < file.txt
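The parentheses syntax is cleanest for hardcoded values. Command substitution works, but its output is word-split, so filenames containing spaces break into several elements. For filenames, assign the glob directly (files=(*.txt)); to read a file line by line into an array, use readarray (a synonym for mapfile, Bash 4.0+).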
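To make the word-splitting problem concrete, the sketch below creates two files in a temporary directory, one with a space in its name, and compares the two approaches. The file names and the variables tmpdir, split_files, and glob_files are made up for illustration, and the sketch assumes mktemp returns a path without whitespace:

```bash
# Assumption: mktemp returns a path without whitespace (typical for /tmp).
tmpdir=$(mktemp -d)
touch "$tmpdir/annual report.txt" "$tmpdir/notes.txt"

# Command substitution: the output of ls is split on whitespace.
split_files=( $(ls "$tmpdir"/*.txt) )

# Glob assignment: each matching path becomes exactly one element.
glob_files=( "$tmpdir"/*.txt )

echo "command substitution: ${#split_files[@]} elements"   # 3
echo "glob assignment:      ${#glob_files[@]} elements"    # 2

rm -rf "$tmpdir"
```

Two files produce three elements in the first case because "annual report.txt" is split at the space.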
Accessing elements requires proper syntax:
servers=(web1 web2 web3 db1 db2)
# Single element (zero-indexed)
echo "${servers[0]}" # web1
echo "${servers[4]}" # db2
# All elements - two forms
echo "${servers[@]}" # Expands to separate words
echo "${servers[*]}" # Expands to single string
# Array length
echo "${#servers[@]}" # 5
# Length of specific element
echo "${#servers[0]}" # 4 (length of "web1")
# Last element (Bash 4.3+)
echo "${servers[-1]}" # db2
Always quote "${array[@]}" to preserve elements as separate words. Without quotes, elements with spaces break into multiple arguments.
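The effect of quoting is easiest to see by counting the arguments a function receives. In this small sketch, count_args and the sample titles array are hypothetical names introduced only for the demonstration:

```bash
# Helper that reports how many arguments it was called with.
count_args() { echo "$#"; }

titles=("release notes" "changelog")

quoted=$(count_args "${titles[@]}")     # one argument per element
unquoted=$(count_args ${titles[@]})     # elements are re-split on spaces
starred=$(count_args "${titles[*]}")    # everything joined into one string

echo "quoted [@]:   $quoted"     # 2
echo "unquoted [@]: $unquoted"   # 3
echo "quoted [*]:   $starred"    # 1
```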
Sparse arrays are valid in Bash—indices don’t need to be contiguous:
sparse[0]="first"
sparse[10]="eleventh"
sparse[100]="hundred-first"
echo "${#sparse[@]}" # 3 (number of elements, not highest index)
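Because the element count says nothing about which indices are in use, a counting loop from 0 to ${#sparse[@]} would miss most of this array. The sketch below lists the indices that actually exist with ${!sparse[@]} and shows where an appended element lands; the variable names indices_before and indices_after are illustrative:

```bash
sparse=()
sparse[0]="first"
sparse[10]="eleventh"
sparse[100]="hundred-first"

# ${!array[*]} expands to the indices in use, not to the values.
indices_before="${!sparse[*]}"
echo "$indices_before"          # 0 10 100

# += appends after the highest index in use, not at position 3.
sparse+=("appended")
indices_after="${!sparse[*]}"
echo "$indices_after"           # 0 10 100 101
```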
Appending elements uses the += operator:
logs=("error.log" "access.log")
logs+=("debug.log")
logs+=("audit.log" "system.log")
echo "${logs[@]}" # error.log access.log debug.log audit.log system.log
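A common slip is to drop the parentheses. Without them, += performs a string append on element 0 instead of adding a new element. A short sketch of the difference (count_after_append is just an illustrative name):

```bash
logs=("error.log" "access.log")

logs+=("debug.log")                  # array append: adds a third element
count_after_append=${#logs[@]}

logs+="-old"                         # string append: modifies logs[0] only

echo "$count_after_append"           # 3
echo "${#logs[@]}"                   # still 3
echo "${logs[0]}"                    # error.log-old
```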
Associative Arrays: Key-Value Storage
Associative arrays require explicit declaration with declare -A. Forgetting this declaration causes Bash to treat your array as indexed, leading to confusing bugs:
# WRONG - creates indexed array
config[host]="localhost" # Creates config[0]="localhost"
# CORRECT - creates associative array
declare -A config
config[host]="localhost"
config[port]="8080"
config[debug]="true"
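The bug is easy to reproduce. In an indexed array the subscript is evaluated as an arithmetic expression, so an unset name such as host evaluates to 0. The sketch below uses a hypothetical settings array and assumes no variables named host or port are set in the current shell:

```bash
# Start clean; assumes host and port are not needed elsewhere in the shell.
unset settings host port

settings[host]="localhost"   # host is unset, so this writes index 0
settings[port]="8080"        # port is unset too: index 0 is overwritten

echo "${#settings[@]}"       # 1
echo "${settings[0]}"        # 8080
```

Both assignments succeed silently, which is why the missing declare -A tends to surface only later as lost data.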
Multiple initialization approaches exist:
# Method 1: Individual assignments
declare -A database
database[host]="db.example.com"
database[port]="5432"
database[name]="production"
# Method 2: Inline initialization
declare -A cache=([ttl]="3600" [size]="1024" [enabled]="yes")
# Method 3: Dynamic population
declare -A env_vars
while IFS='=' read -r key value; do
env_vars[$key]="$value"
done < config.env
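Real configuration files usually contain comments and blank lines, which the loop above would store as odd keys or reject as empty subscripts. The following sketch adds a guard for both and reads from a here-document so it runs without an external file; the sample keys and values are invented for illustration:

```bash
declare -A env_vars=()

while IFS='=' read -r key value; do
    # Skip blank lines and lines starting with '#'.
    [[ -z $key || $key == \#* ]] && continue
    env_vars[$key]=$value
done <<'EOF'
# sample settings (hypothetical)
APP_ENV=production

DB_URL=postgres://db.example.com:5432/app?sslmode=require
EOF

echo "${#env_vars[@]}"        # 2
echo "${env_vars[DB_URL]}"    # value keeps its embedded '=' sign
```

Because value is the last variable given to read, it receives the rest of the line, including any further = characters.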
Accessing and manipulating associative arrays:
declare -A user_roles=(
[alice]="admin"
[bob]="developer"
[charlie]="viewer"
)
# Access by key
echo "${user_roles[alice]}" # admin
# Check if key exists
if [[ -v user_roles[david] ]]; then
echo "David exists"
else
echo "David not found" # This prints
fi
# Get all keys (order is not guaranteed for associative arrays)
echo "${!user_roles[@]}" # e.g. alice bob charlie
# Get all values (same unspecified order)
echo "${user_roles[@]}" # e.g. admin developer viewer
# Number of entries
echo "${#user_roles[@]}" # 3
The [[ -v array[key] ]] test checks key existence without triggering errors for missing keys. Testing array elements with -v requires Bash 4.3 or later.
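An alternative that does not rely on -v is the ${array[key]+word} expansion, which yields word only when the key is set, even if its value is empty. The sketch below contrasts it with a plain non-empty test; the flags array and the result variables are hypothetical:

```bash
declare -A flags=([verbose]="" [color]="auto")

# ${var+set} expands to "set" when the key exists, even with an empty value.
if [[ -n ${flags[verbose]+set} ]]; then has_verbose=yes; else has_verbose=no; fi
if [[ -n ${flags[quiet]+set} ]];   then has_quiet=yes;   else has_quiet=no;   fi

# A plain non-empty test cannot tell "empty" from "missing".
if [[ -n ${flags[verbose]} ]]; then nonempty_verbose=yes; else nonempty_verbose=no; fi

echo "verbose key exists: $has_verbose"        # yes
echo "quiet key exists:   $has_quiet"          # no
echo "verbose non-empty:  $nonempty_verbose"   # no
```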
Array Manipulation and Iteration
Iteration patterns differ slightly between array types:
# Indexed array iteration
servers=(web1 web2 web3)
for server in "${servers[@]}"; do
echo "Processing: $server"
done
# With indices
for i in "${!servers[@]}"; do
echo "Index $i: ${servers[$i]}"
done
# Associative array iteration
declare -A ports=([http]="80" [https]="443" [ssh]="22")
# Iterate over values
for port in "${ports[@]}"; do
echo "Port: $port"
done
# Iterate over keys
for protocol in "${!ports[@]}"; do
echo "$protocol uses port ${ports[$protocol]}"
done
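Associative arrays return their keys in no particular order, so output from the loops above can differ between runs or Bash versions. When stable output matters, one option is to sort the keys first. This is a sketch, with sorted_keys and report as illustrative names, and it assumes keys contain no newlines:

```bash
declare -A ports=([http]="80" [https]="443" [ssh]="22")

# Sort the keys bytewise; LC_ALL=C keeps the order locale-independent.
mapfile -t sorted_keys < <(printf '%s\n' "${!ports[@]}" | LC_ALL=C sort)

report=()
for protocol in "${sorted_keys[@]}"; do
    report+=("$protocol=${ports[$protocol]}")
done

printf '%s\n' "${report[@]}"
# http=80
# https=443
# ssh=22
```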
Array slicing extracts subsets:
numbers=(0 1 2 3 4 5 6 7 8 9)
# Syntax: ${array[@]:start:length}
echo "${numbers[@]:3:4}" # 3 4 5 6
echo "${numbers[@]:5}" # 5 6 7 8 9 (from index 5 to end)
echo "${numbers[@]: -3}" # 7 8 9 (last 3 elements, note the space)
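Slicing also gives a simple way to process an array in fixed-size batches. In the sketch below, items, batch_size, and batches are hypothetical names; each batch is joined into one string only so the result is easy to print:

```bash
items=(a b c d e f g)
batch_size=3
batches=()

# Offset and length are arithmetic expressions, so no $ is needed inside.
for ((start = 0; start < ${#items[@]}; start += batch_size)); do
    batch=("${items[@]:start:batch_size}")
    batches+=("${batch[*]}")
done

printf '[%s]\n' "${batches[@]}"
# [a b c]
# [d e f]
# [g]
```

The final slice is shorter than batch_size, which Bash handles without complaint.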
Modifying and removing elements:
fruits=(apple banana cherry date)
# Modify element
fruits[1]="blueberry"
echo "${fruits[@]}" # apple blueberry cherry date
# Delete specific element (quote the argument so the shell cannot glob-expand it)
unset 'fruits[2]'
echo "${fruits[@]}" # apple blueberry date
echo "${#fruits[@]}" # 3
# Delete entire array
unset fruits
Note that unset on an indexed array element creates a sparse array—the indices don’t shift.
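If later code expects contiguous indices, the array can be rebuilt from its own expansion. A minimal sketch (indices_before and indices_after are illustrative names):

```bash
fruits=(apple banana cherry date)
unset 'fruits[2]'

indices_before="${!fruits[*]}"
echo "$indices_before"          # 0 1 3  (a gap where index 2 was)

# Re-assigning the expansion packs the elements back into 0..n-1.
fruits=("${fruits[@]}")

indices_after="${!fruits[*]}"
echo "$indices_after"           # 0 1 2
echo "${fruits[2]}"             # date
```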
Advanced Operations and Common Patterns
Pattern-based operations enable powerful transformations:
files=(report.txt data.csv script.sh notes.txt)
# Pattern substitution (replaces the first match in each element)
echo "${files[@]/.txt/.bak}" # report.bak data.csv script.sh notes.bak
# Pattern removal
echo "${files[@]#*.}" # txt csv sh txt (remove shortest prefix)
echo "${files[@]%.*}" # report data script notes (remove shortest suffix)
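The same per-element expansions can add text as well as remove it. The sketch below uses the anchored forms /#/ and /%/ with an empty pattern to add a prefix or suffix, plus the ^^ case modifier (Bash 4.0+). The result array names are illustrative:

```bash
files=(report.txt data.csv)

# An empty pattern anchored at the start (#) or end (%) inserts text there.
prefixed=("${files[@]/#/backup_}")
suffixed=("${files[@]/%/.bak}")

# ^^ upper-cases every character of each element.
upper=("${files[@]^^}")

echo "${prefixed[@]}"   # backup_report.txt backup_data.csv
echo "${suffixed[@]}"   # report.txt.bak data.csv.bak
echo "${upper[@]}"      # REPORT.TXT DATA.CSV
```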
Sorting requires external tools since Bash lacks native array sorting:
unsorted=(zebra apple mango banana)
# Sort using mapfile and process substitution
mapfile -t sorted < <(printf '%s\n' "${unsorted[@]}" | sort)
echo "${sorted[@]}" # apple banana mango zebra
# Numeric sort
numbers=(100 20 3 45)
mapfile -t sorted_nums < <(printf '%s\n' "${numbers[@]}" | sort -n)
echo "${sorted_nums[@]}" # 3 20 45 100
Checking element existence differs by array type:
# Indexed array - check value exists
servers=(web1 web2 web3)
search="web2"
for server in "${servers[@]}"; do
if [[ "$server" == "$search" ]]; then
echo "Found $search"
break
fi
done
# Associative array - check key exists
declare -A config=([debug]="true" [port]="8080")
if [[ -v config[debug] ]]; then
echo "Debug setting exists: ${config[debug]}"
fi
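When the same indexed array is searched many times, the linear scan can be replaced by a lookup set: copy the values into an associative array once, then test for keys. This is a sketch under the assumption that no element is an empty string (empty keys are not allowed); server_set and is_known are hypothetical names:

```bash
servers=(web1 web2 web3)

# Build the set once: each value becomes a key.
declare -A server_set=()
for server in "${servers[@]}"; do
    server_set[$server]=1
done

# Succeeds when the argument is a key in server_set.
is_known() { [[ -n ${server_set[$1]+x} ]]; }

if is_known "web2"; then echo "web2 is known"; fi       # prints
if ! is_known "db9"; then echo "db9 is not known"; fi   # prints
```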
Converting between array types:
# Indexed to associative
indexed=(value1 value2 value3)
declare -A assoc
for i in "${!indexed[@]}"; do
assoc[key$i]="${indexed[$i]}"
done
# Associative to indexed (values only; keys are lost and the order is unspecified)
declare -A source=([a]="first" [b]="second")
target=("${source[@]}")
Practical Applications and Best Practices
Here’s a complete script demonstrating practical array usage for log analysis:
#!/bin/bash
# Parse Apache access logs and generate statistics
declare -A status_counts
declare -A ip_requests
declare -a response_times
# Read log file
while IFS= read -r line; do
# Extract status code (simplified regex)
if [[ $line =~ \"[A-Z]+[[:space:]][^\"]+\"[[:space:]]([0-9]{3}) ]]; then
status="${BASH_REMATCH[1]}"
((status_counts[$status]++))
fi
# Extract IP address (first field); skip blank lines, which would yield an empty key
ip="${line%% *}"
[[ -n $ip ]] && ip_requests[$ip]=$(( ${ip_requests[$ip]:-0} + 1 ))
# Extract response time (last field, assuming microseconds)
response_time="${line##* }"
[[ $response_time =~ ^[0-9]+$ ]] && response_times+=("$response_time")
done < access.log
# Report status code distribution
echo "=== Status Code Distribution ==="
for status in "${!status_counts[@]}"; do
echo "HTTP $status: ${status_counts[$status]} requests"
done | sort
# Top 5 requesting IPs
echo -e "\n=== Top 5 IP Addresses ==="
for ip in "${!ip_requests[@]}"; do
echo "${ip_requests[$ip]} $ip"
done | sort -rn | head -5
# Calculate average response time
if [[ ${#response_times[@]} -gt 0 ]]; then
total=0
for time in "${response_times[@]}"; do
((total += time))
done
avg=$((total / ${#response_times[@]}))
echo -e "\n=== Performance ==="
echo "Average response time: ${avg}μs"
echo "Total requests: ${#response_times[@]}"
fi
Best practices to follow:
- Always quote array expansions: Use "${array[@]}" not ${array[@]} to handle elements with spaces correctly
- Declare associative arrays explicitly: Use declare -A before assignment to avoid indexed array confusion
- Check Bash version requirements: Test for Bash 4.0+ when using associative arrays in production scripts
- Use readarray/mapfile for files: Safer than command substitution for reading file contents into arrays
- Validate array indices: Check bounds or use [[ -v array[index] ]] before accessing elements
- Prefer [@] over [*]: The @ form preserves element boundaries; * joins into a single string
- Initialize arrays before loops: Prevent inheriting values from environment or previous script runs
Arrays unlock sophisticated data handling in Bash. Master these patterns and your scripts will handle complex data transformations that previously required external tools like awk or Python.