Bash’s subshells and command substitution are surprisingly powerful tools for dynamic shell scripting, allowing you to weave the output of one command directly into another as if it were typed manually.
Let’s see this in action. Imagine you need to create a directory for each .txt file in your current directory.
# Create dummy files
touch file1.txt file2.txt another.txt
# The goal: create directories named after each .txt file
# Incorrect approach:
# for file in *.txt; do mkdir $file; done # This would try to create directories named "file1.txt", etc.
# Correct approach using command substitution
for file in *.txt; do
    # Extract the base name without the extension
    base_name=$(basename "$file" .txt)
    echo "Creating directory for: $base_name"
    mkdir "$base_name"
done
# Verify
ls -d */
In this example, $(basename "$file" .txt) is command substitution. The basename command runs, and its output (file1, file2, another) is captured and assigned to the base_name variable. This lets us create directories like file1, file2, and another instead of file1.txt, file2.txt, and another.txt.
A subshell is created when you run a command in parentheses (). Any changes to environment variables or the current directory within a subshell are lost when it exits, making it a safe space for temporary operations. Command substitution is a specific use case of subshells (or equivalent processes) where the standard output of the commands within the parentheses is captured.
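The isolation described above can be seen in a minimal sketch. The variable names greeting and start_dir are illustrative, not part of any earlier example:

```bash
#!/usr/bin/env bash
# Changes made inside ( ... ) stay inside the subshell.
greeting="hello"
start_dir=$PWD

(
    # Both of these changes affect only the subshell.
    greeting="changed inside subshell"
    cd /
    echo "inside:  greeting=$greeting, pwd=$PWD"
)

# Back in the parent shell, nothing has changed.
echo "outside: greeting=$greeting, pwd=$PWD"
```

The second echo reports the original value of greeting and the original working directory, because the parent shell never saw the assignment or the cd.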
Here’s how it works internally. When Bash encounters $(command), it:
- Spawns a subshell: A child copy of the current shell is created (typically by forking). It inherits the parent’s variables and working directory, but changes made inside it do not flow back.
- Executes command: The specified command runs within this subshell.
- Captures STDOUT: The standard output of command is intercepted, and any trailing newlines are stripped.
- Subshell exits: The temporary process terminates.
- Replaces the substitution: The captured output replaces the $(command) construct in the parent shell.
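The steps above can be observed with a small experiment. This is only a sketch; counter and captured are illustrative names:

```bash
#!/usr/bin/env bash
counter=0

# The assignment to counter happens in the subshell and is discarded.
# Only what the subshell writes to standard output comes back.
captured=$(counter=5; printf 'line one\nline two\n\n\n')

echo "counter in parent: $counter"

# Brackets make it visible that the trailing newlines were removed,
# while the newline between the two lines was kept.
printf 'captured: [%s]\n' "$captured"
```

The parent’s counter remains 0, and captured holds the two lines without the three trailing newlines that printf emitted.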
This allows you to dynamically generate arguments, filenames, or even entire commands based on runtime conditions.
Consider another common scenario: finding all processes owned by a specific user and killing them.
# Assume 'someuser' is a valid username
TARGET_USER="someuser"
# Get PIDs of processes owned by TARGET_USER
pids=$(pgrep -u "$TARGET_USER" -x "my_process") # Example: find specific process
# Or to get all processes for a user:
# pids=$(ps -u "$TARGET_USER" -o pid=)
# If we found any PIDs
if [ -n "$pids" ]; then
    echo "Killing processes for user: $TARGET_USER with PIDs: $pids"
    # Use command substitution again to pass PIDs to kill
    kill -9 $(echo "$pids")
else
    echo "No processes found for user: $TARGET_USER"
fi
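In kill -9 $(echo "$pids"), the echo "$pids" runs in a subshell. If pids contained 123 456, the echo command would output 123 456. Because the substitution is unquoted, the shell splits that output on whitespace (spaces, tabs, and newlines) into separate words, so the command effectively becomes kill -9 123 456. The echo is there purely to illustrate substitution; kill -9 $pids would behave the same way.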
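To see how many arguments kill would actually receive without signalling any real process, here is a hedged sketch. The helper count_args is hypothetical, and the PIDs are made-up values in the one-per-line form that pgrep prints:

```bash
#!/usr/bin/env bash
# Fake PID list, one per line (no real processes are touched).
pids=$(printf '123\n456\n')

# Hypothetical helper: prints how many arguments it was given.
count_args() { echo "$#"; }

# Unquoted: the output is split on whitespace into separate words.
unquoted_count=$(count_args $(echo "$pids"))

# Quoted: the whole output stays together as a single argument.
quoted_count=$(count_args "$(echo "$pids")")

echo "unquoted substitution passes $unquoted_count argument(s)"
echo "quoted substitution passes   $quoted_count argument(s)"
```

The unquoted form yields two arguments and the quoted form yields one, which is why the kill example relies on leaving the substitution unquoted.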
The syntax $(command) is generally preferred over the older backtick syntax `command` because it nests more cleanly and is easier to read. For example, `echo \`date\`` needs the inner backticks escaped with backslashes, which is much harder to parse than $(echo $(date)).
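A side-by-side sketch of nesting in both syntaxes follows. The path /usr/local/bin is just an example string; it does not need to exist, since basename and dirname only manipulate text:

```bash
#!/usr/bin/env bash
path="/usr/local/bin"

# Modern syntax: inner substitution nests with no escaping.
modern=$(basename "$(dirname "$path")")

# Backtick syntax: the inner backticks must be backslash-escaped.
legacy=`basename \`dirname "$path"\``

echo "modern: $modern"
echo "legacy: $legacy"
```

Both forms produce the name of the parent directory (local), but each extra level of nesting in the backtick form needs another layer of backslashes.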
You can also use command substitution to construct complex commands on the fly. Suppose you want to find all .log files modified in the last 24 hours and then compress them using gzip.
# Find log files modified in the last day
log_files=$(find . -name "*.log" -mtime -1)
# If any files were found, compress them
if [ -n "$log_files" ]; then
    echo "Compressing: $log_files"
    # This part is tricky: if filenames have spaces, this can break.
    # A more robust solution would involve find -exec or xargs -0.
    # But for demonstration of command substitution:
    gzip $log_files
else
    echo "No log files found modified in the last 24 hours."
fi
This demonstrates how output from find can be directly fed into gzip. Note the caution about filenames with spaces; gzip $log_files would treat file with spaces.log as three separate arguments (file, with, and spaces.log). For robust handling, especially with filenames, find ... -print0 | xargs -0 ... or find ... -exec ... {} + are superior. However, command substitution works well for simple, whitespace-delimited lists whose items contain no spaces or glob characters.
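As a rough sketch of the more robust find ... -exec ... {} + form, the following runs in a throwaway directory so it cannot touch real logs. The directory, the file names, and the listing variable are all illustrative, and gzip is assumed to be installed:

```bash
#!/usr/bin/env bash
# Work in a scratch directory created just for this demonstration.
workdir=$(mktemp -d)
touch "$workdir/plain.log" "$workdir/file with spaces.log" "$workdir/notes.txt"

# find hands each matching path to gzip as its own argument,
# so spaces in file names are never split by the shell.
find "$workdir" -name "*.log" -mtime -1 -exec gzip {} +

# Record what the directory looks like afterwards, then clean up.
listing=$(ls "$workdir")
echo "$listing"
rm -rf "$workdir"
```

Both .log files end up compressed, including the one with spaces in its name, while notes.txt is left alone.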
One subtle but important aspect is how command substitution interacts with word splitting and globbing. When a command substitution is unquoted, its output is split into words, and any glob characters it contains (like *, ?, or [) are expanded after the substitution occurs. For instance, in ls $(echo "*.txt"), the inner command outputs the literal string *.txt, but because the substitution is unquoted the shell then expands it. If the current directory had a.txt and b.txt, ls would receive a.txt and b.txt as its arguments, not the literal string *.txt.
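This behavior, while often useful, can also be a source of bugs if not understood. If you intend to use the literal output of a command, double-quote the substitution wherever it is expanded, as in echo "$(command)". A plain assignment such as output=$(command) is not split or globbed either way, but "$output" must still be quoted when it is used later. However, if you want the shell to split the output into separate words or expand it as file globs, unquoted command substitution is the way to go.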
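The difference between the quoted and unquoted forms can be checked with a short sketch. It runs in a scratch directory containing only a.txt and b.txt; all names here are illustrative:

```bash
#!/usr/bin/env bash
start_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a.txt b.txt

# Unquoted: the inner output *.txt is glob-expanded before the outer echo runs.
unquoted=$(echo $(echo "*.txt"))

# Quoted: the inner output is passed through literally.
quoted=$(echo "$(echo "*.txt")")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Return to where we started and remove the scratch directory.
cd "$start_dir" || exit 1
rm -rf "$workdir"
```

The unquoted form prints the two matching file names, while the quoted form prints the pattern itself.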
The next logical step is to combine command substitution with other shell features like arrays or process substitution for even more sophisticated data handling.