Pipes and process substitution are two of Bash’s most powerful, yet often misunderstood, features for chaining commands together.

Let’s see them in action. Imagine you have a log file, access.log, and you want to find all requests for the /api/users endpoint that happened in the last hour, and then sort them by the IP address of the requester.

# First, a simple pipe:
cat access.log | grep "/api/users" | sort -k 1

# Now, let's add a time filter (assuming timestamps are in the first column and in epoch format):
# Compute the cutoff: the current time minus 3600 seconds (1 hour)
current_epoch=$(date +%s)
one_hour_ago=$((current_epoch - 3600))

# Pass the cutoff to awk with -v and keep only lines whose timestamp is newer than it
cat access.log | grep "/api/users" | awk -v cutoff="$one_hour_ago" '$1 > cutoff' | sort -k 1

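In the first example, cat access.log | grep "/api/users" | sort -k 1, the pipe (|) takes the standard output of the command on its left and feeds it directly into the standard input of the command on its right. cat reads the file, grep keeps only the lines containing /api/users, and sort orders those lines using a key that starts at the first field. Note that -k 1 extends the key to the end of the line; -k1,1 would restrict it to the first field alone. Whether the first field is the timestamp or the IP address depends on your log format, so to sort by requester, point -k at whichever field holds the IP address.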
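To make the stage-by-stage flow concrete, here is a minimal sketch that runs a similar pipeline against a throwaway log. The sample lines, their layout (epoch timestamp, then IP, then path), and the variable names workdir and sorted are assumptions made for illustration, not the format of any real access.log.

```shell
#!/usr/bin/env bash
# Build a small sample log in a temporary directory.
# Hypothetical format: epoch timestamp, client IP, request path.
workdir=$(mktemp -d)
printf '%s\n' \
  '1700000300 10.0.0.7 /api/users' \
  '1700000100 10.0.0.9 /index.html' \
  '1700000200 10.0.0.3 /api/users/42' > "$workdir/access.log"

# grep can read the file itself, so cat is optional here. grep's stdout
# becomes sort's stdin; -k1,1n limits the key to field 1, compared numerically.
sorted=$(grep "/api/users" "$workdir/access.log" | sort -k1,1n)
printf '%s\n' "$sorted"

rm -rf "$workdir"
```

The /index.html line is dropped by grep, and the two remaining lines come out in ascending timestamp order.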

Process substitution, on the other hand, makes a command’s input or output available under a file name that another command can open. It uses <(...) for reading and >(...) for writing. Note that the second example does not actually need it: awk -v cutoff="$one_hour_ago" '$1 > cutoff' receives the cutoff as an ordinary variable through -v and reads the log lines from the pipe, keeping those where the timestamp (first field, $1) is greater than our calculated cutoff time.
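As a rough sketch of the reading form, the snippet below feeds two command outputs to comm, which expects file names rather than a single stream. The helper functions seen_today and seen_yesterday and their IP lists are made up for the example.

```shell
#!/usr/bin/env bash
# Process substitution needs bash (or zsh/ksh); plain POSIX sh does not have it.
# Hypothetical helpers that stand in for any two commands producing lists.
seen_today()     { printf '%s\n' 10.0.0.3 10.0.0.7 10.0.0.9; }
seen_yesterday() { printf '%s\n' 10.0.0.7 10.0.0.1 10.0.0.3; }

# comm wants two sorted files; each <(...) gives one command's output a path.
# -12 suppresses lines unique to either side, leaving only the common ones.
common=$(comm -12 <(seen_today | sort) <(seen_yesterday | sort))
printf '%s\n' "$common"
```

A single pipe could only deliver one of these inputs. The writing form works the same way in reverse, for example tee >(gzip > copy.gz), where copy.gz is a hypothetical output name; the command inside >(...) runs asynchronously, so its output may not be complete the instant the main command returns.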

The key insight here is that pipes connect streams of data, while process substitution connects commands by presenting their I/O as files. The awk command in the second example expects to read from standard input by default. However, if you wanted to compare the output of a command with a file, you’d use process substitution. For instance, if you had a list of forbidden IPs in forbidden_ips.txt and wanted to see which IP addresses in your access.log are in that list:

# Here, we use process substitution to make the output of 'cut' look like a file to 'grep'
grep -Fwf <(cut -d' ' -f1 access.log) forbidden_ips.txt

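Here, <(cut -d' ' -f1 access.log) runs cut -d' ' -f1 access.log (which extracts the first field from each line of access.log, assumed in this example to be the IP address), and Bash replaces <(...) with a path, typically something like /dev/fd/63 or a named pipe, from which that output can be read. No regular temporary file is written. grep -Fwf then uses that path as its pattern file: -f reads the patterns from it, -F treats them as fixed strings, and -w matches whole words only. The result is every line of forbidden_ips.txt that matches an IP seen in the log.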
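The following sketch shows both halves of that explanation: what the substituted path looks like, and the forbidden-IP lookup run against sample data. The file contents and the variable names fd_path, workdir, and hits are assumptions for illustration; the sort -u step is an optional addition that hands grep each IP only once.

```shell
#!/usr/bin/env bash
# Print the path Bash substitutes for <(...). On Linux this is typically a
# /dev/fd/N entry; systems without /dev/fd fall back to a named pipe.
fd_path=$(echo <(true))
echo "$fd_path"

# Sample data: the first field of the log is assumed to be the client IP.
workdir=$(mktemp -d)
printf '%s\n' '10.0.0.7 /api/users' '10.0.0.3 /index.html' '10.0.0.7 /login' \
  > "$workdir/access.log"
printf '%s\n' 10.0.0.7 192.168.1.50 > "$workdir/forbidden_ips.txt"

# The de-duplicated IPs from the log act as grep's pattern "file".
hits=$(grep -Fwf <(cut -d' ' -f1 "$workdir/access.log" | sort -u) \
  "$workdir/forbidden_ips.txt")
printf '%s\n' "$hits"

rm -rf "$workdir"
```

Only 10.0.0.7 appears in both the log and the forbidden list, so it is the only line grep prints.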

The most surprising thing about pipes and process substitution is that neither one writes the intermediate data to a temporary file on disk. A pipe is a kernel object with a small in-memory buffer, and process substitution is built on the same mechanism, exposed through a /dev/fd entry or a named pipe. Each stage starts consuming data as soon as the previous stage produces it, which suits real-time processing and leaves no intermediate files to clean up. It’s a subtle distinction, but it means you can chain many commands together without paying for disk I/O on the data passed between them.
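One way to see this streaming behavior is to pipe an endless producer into a consumer that stops early. This is a small sketch; the variable name first_three is arbitrary.

```shell
#!/usr/bin/env bash
# 'yes' would print "y" forever. Because the pipe streams data instead of
# storing it all first, head can take three lines and exit; yes then stops
# when it tries to write to the closed pipe.
first_three=$(yes | head -n 3)
printf '%s\n' "$first_three"
```

If the pipe had to collect all of its input before passing anything on, this command would never finish.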

The exact levers you control are the standard input, output, and error streams of each process. By understanding how these streams are connected (or treated as files), you can construct complex data pipelines. For example, you can redirect standard error to standard output (2>&1) to capture all output, or redirect output to /dev/null to discard it.
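The redirections mentioned above can be sketched as follows. emit is a hypothetical helper that writes one line to each stream, so the effect of every redirection is visible in what gets captured.

```shell
#!/usr/bin/env bash
# Hypothetical helper: one line to stdout, one line to stderr.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

# 2>&1 points stderr at wherever stdout currently goes, so both are captured.
both=$(emit 2>&1)

# Sending stderr to /dev/null discards it; only stdout is captured.
only_out=$(emit 2>/dev/null)

# Order matters: stderr is first pointed at the capture, and only then is
# stdout sent to /dev/null, so just the stderr line survives.
only_err=$(emit 2>&1 >/dev/null)

printf '%s\n' "$both" "$only_out" "$only_err"
```

Redirections are processed left to right, which is why 2>&1 >/dev/null and >/dev/null 2>&1 behave differently: the second form discards both streams.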

The next concept you’ll likely explore is how to manage multiple background processes and their communication, often involving tools like xargs or more advanced shell scripting techniques.
