Bash’s << operator introduces a here-document (heredoc), a way to feed a multi-line string to a command’s standard input without most of the usual shell-escaping headaches.
Let’s see it in action. Imagine you need to send a complex SQL query to psql:
DB_USER="myuser"
DB_NAME="mydb"
SQL_QUERY=$(cat <<EOF
SELECT
u.name,
COUNT(o.id) AS order_count
FROM
users u
LEFT JOIN
orders o ON u.id = o.user_id
WHERE
u.signup_date >= '2023-01-01'
GROUP BY
u.name
ORDER BY
order_count DESC
LIMIT 10;
EOF
)
psql -U "$DB_USER" -d "$DB_NAME" -c "$SQL_QUERY"
Here, <<EOF tells the shell to collect every following line up to, but not including, a line containing only EOF, and to supply that text to cat as its standard input. cat simply copies it to its standard output, which $(...) captures into the SQL_QUERY variable. Command substitution strips trailing newlines, so the variable ends with the semicolon after LIMIT 10.
The core problem heredocs solve is managing strings that contain characters with special meaning to the shell. Inside a heredoc, single quotes, double quotes, and whitespace are always taken literally, and with a quoted delimiter (covered below) so are dollar signs, backticks, and backslashes. Without heredocs, you’d be stuck with a mess of backslashes and escaped quotes, or a series of echo commands grouped together.
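As a small illustration of the escaping problem, the sketch below builds the same string twice: once inside double quotes with backslash escapes, and once with a heredoc. The variable names and the sentence are made up for the example.

```shell
# The same text, written with escapes and with a heredoc.
escaped="He said \"it's done\" and left."

# Quoted delimiter: every character in the body is literal.
heredoc=$(cat <<'EOF'
He said "it's done" and left.
EOF
)

printf '%s\n' "$escaped"
printf '%s\n' "$heredoc"
```

Both printf calls print the identical line; the heredoc version simply needs no escaping.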
Internally, the shell notes the heredoc delimiter (EOF in the example) and then reads all subsequent lines until it finds one containing only that delimiter. Crucially, before it passes the content to the command, it performs parameter (variable) expansion, command substitution ($(...) or backticks), and arithmetic expansion ($((...))) on the heredoc content, unless the delimiter is quoted. In the unquoted case a backslash still escapes $, `, and \, so \$HOME stays a literal $HOME.
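A minimal sketch of those expansions with an unquoted delimiter follows. The names name and greeting are hypothetical and exist only for this example.

```shell
name="world"

# Unquoted delimiter: the shell expands the body before cat sees it.
greeting=$(cat <<EOF
Hello, $name
Sum: $((2 + 3))
Upper: $(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')
Escaped: \$name
EOF
)

printf '%s\n' "$greeting"
# Prints:
# Hello, world
# Sum: 5
# Upper: WORLD
# Escaped: $name
```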
This brings us to a critical control lever: quoting the delimiter. If you write <<'EOF', the shell will not perform any expansions within the heredoc. This is incredibly useful when you need to pass literal strings, like configuration files or scripts, that might contain characters that would otherwise be interpreted by the shell.
Consider this example, where we need to embed a script that itself contains dollar signs:
SCRIPT_DIR="/opt/my_app/scripts"
# Notice the single quotes around the delimiter
INIT_SCRIPT=$(cat <<'END_SCRIPT'
#!/bin/bash
# This script initializes the application
echo "Setting up directories..."
mkdir -p "${SCRIPT_DIR}/logs" "${SCRIPT_DIR}/data"
echo "Done."
# The following variable should NOT be expanded by the outer shell
APP_PORT=8080
echo "Application will run on port $APP_PORT"
END_SCRIPT
)
echo "$INIT_SCRIPT"
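In this case, ${SCRIPT_DIR} and $APP_PORT within the heredoc are kept as literal text because the quoted delimiter 'END_SCRIPT' suppresses expansion. If we had used END_SCRIPT without quotes, ${SCRIPT_DIR} would have been expanded to /opt/my_app/scripts, and $APP_PORT would have been replaced by whatever APP_PORT holds in the outer shell. The assignment APP_PORT=8080 is only text inside the heredoc and never runs in the outer shell, so that value is most likely an empty string, and the generated script would announce a port with no number. Note that with the quoted delimiter, the generated script has to get SCRIPT_DIR from its own environment when it runs.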
Whether expansion happens is decided once, when the shell parses the opening delimiter. If the delimiter is unquoted, expansion occurs. If any part of it is quoted (with single quotes, double quotes, or a backslash), expansion is suppressed for the entire heredoc; there is no way to switch it on for only part of the body. The closing delimiter is always written without the quotes.
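The following sketch compares the four delimiter forms side by side. The variable names are hypothetical; the point is only that every quoted form behaves the same way.

```shell
value="expanded"

# Unquoted delimiter: $value is expanded.
unquoted=$(cat <<EOF
$value
EOF
)

# Single-quoted delimiter: body is literal.
single=$(cat <<'EOF'
$value
EOF
)

# Double-quoted delimiter: body is also literal.
double=$(cat <<"EOF"
$value
EOF
)

# Backslash-quoted delimiter: body is literal as well.
backslash=$(cat <<\EOF
$value
EOF
)

printf '%s | %s | %s | %s\n' "$unquoted" "$single" "$double" "$backslash"
# Prints: expanded | $value | $value | $value
```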
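A common pitfall is forgetting that the closing delimiter must appear on a line by itself, with no leading or trailing whitespace. A line like " EOF" (leading space) or "EOF " (trailing space) will not terminate the heredoc. The shell then keeps reading the rest of the script as heredoc content, and Bash typically warns that the here-document was delimited by end-of-file. The one exception is the <<- form, which strips leading tab characters (not spaces) from the body and from the delimiter line.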
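The pitfall is easy to reproduce. In this sketch, the second line of the body is an indented EOF (one leading space), so it is treated as ordinary content and the heredoc only ends at the unindented EOF. The variable name output is hypothetical.

```shell
# The line " EOF" has a leading space, so it does not close the heredoc.
output=$(cat <<EOF
first line
 EOF
second line
EOF
)

printf '%s\n' "$output"
# Prints:
# first line
#  EOF
# second line
```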
The next concept you’ll likely encounter is how heredocs interact with pipes and redirection when used with commands that don’t directly accept standard input, or when you need to send distinct heredocs to different parts of a pipeline.