Querying a database from a bash script is easier than you might think. The most surprising part is how little explicit configuration the command-line clients for common databases like PostgreSQL and MySQL need in order to connect.
Let’s see it in action. Imagine you have a PostgreSQL database running locally on the default port (5432) and you want to fetch a list of users from a users table.
#!/bin/bash
DB_USER="myuser"
DB_NAME="mydb"
DB_HOST="localhost"
DB_PORT="5432"
# Using psql command-line client
psql -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" -c "SELECT username, email FROM users WHERE active = TRUE;"
If you run this script (after setting myuser and mydb appropriately and ensuring the users table exists with username, email, and active columns), you’ll get output like this:
 username |       email
----------+-------------------
 alice    | alice@example.com
 bob      | bob@example.com
(2 rows)
This works because tools like psql (for PostgreSQL) and mysql (for MySQL) are designed to be invoked directly from the command line, accepting connection parameters and SQL queries as arguments. They handle the underlying network communication, authentication, and result formatting, while the database server parses and executes the query.
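For comparison, a roughly equivalent MySQL invocation might look like the sketch below. The connection values and the function name list_active_users_mysql are placeholders rather than part of any real setup, the same hypothetical users table is assumed, and the password is assumed to come from somewhere other than the command line (for example an option file).

```shell
#!/bin/bash
# Placeholder connection settings (assumptions; adjust for your environment).
DB_USER="myuser"
DB_NAME="mydb"
DB_HOST="localhost"
DB_PORT="3306"   # MySQL's default port

# Hypothetical helper: run the same query through the mysql client.
# -e executes one statement and exits. Note the uppercase -P for the port;
# lowercase -p is the password option.
list_active_users_mysql() {
    mysql -h "$DB_HOST" -P "$DB_PORT" -u "$DB_USER" -D "$DB_NAME" \
        -e "SELECT username, email FROM users WHERE active = TRUE;"
}
```

Calling list_active_users_mysql at the end of the script runs the query; the flag names differ from psql, but the shape of the command is the same.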
The fundamental problem these tools solve is bridging the gap between a shell environment, which is primarily text-based and sequential, and a relational database, which is stateful, structured, and handles complex queries. Bash scripts are excellent for automation, orchestration, and quick data manipulation, but they lack native SQL capabilities. Database clients like psql and mysql act as translators, allowing your bash script to send SQL commands and receive structured results that can then be further processed by bash.
Internally, when you run psql -c "SELECT ...", the psql executable establishes a TCP connection to the specified host and port. It then authenticates as the given user. If the server asks for a password, psql can take it from the PGPASSWORD environment variable, from a password file, or from an interactive prompt; prompts don’t suit unattended scripts, and hard-coding the password in the script itself is generally discouraged for security reasons. Once connected, psql sends the SQL query string to the database server. The server processes the query, retrieves the data, and sends it back to psql, which formats it into a human-readable table and prints it to standard output. That output is exactly what your bash script sees.
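Because the result arrives on standard output and psql signals failure through a non-zero exit status, a script can capture the first and check the second. Here is a minimal sketch, assuming the same placeholder connection settings as above; count_active_users is a hypothetical helper name.

```shell
#!/bin/bash
# Placeholder connection settings (assumptions; adjust for your environment).
DB_USER="myuser"
DB_NAME="mydb"
DB_HOST="localhost"
DB_PORT="5432"

# Hypothetical helper: print the number of active users, or fail loudly.
count_active_users() {
    local count
    # -X skips ~/.psqlrc; -At prints the bare value without header or padding.
    if ! count=$(psql -X -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" \
            -At -c "SELECT count(*) FROM users WHERE active = TRUE;"); then
        echo "query failed" >&2
        return 1
    fi
    echo "$count"
}
```

A caller can then write something like active=$(count_active_users) || exit 1 and rely on the variable holding only the number.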
The key levers you control are the connection parameters (-h, -p, -U, -d) and the SQL query itself (-c). For more complex scripts, you might redirect input from a file containing multiple SQL statements, or use psql’s -t (tuples only) flag to drop the header and row-count footer so the output is easier to parse in bash. On its own, -t still pads values for the aligned table layout, so pair it with -A (unaligned output). For example, psql -At -c "SELECT username FROM users;" prints just the usernames, one per line.
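When a query returns more than one column, unaligned output with a known field separator (psql’s -F option, which defaults to | in unaligned mode) is easier to split than the padded table. The sketch below assumes no value contains the separator character; format_users is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print "username <email>" for each row read from stdin.
# Expects psql's unaligned, tuples-only output with '|' as separator, e.g.
#   psql -X -At -F '|' -c "SELECT username, email FROM users WHERE active = TRUE;"
format_users() {
    local username email
    while IFS='|' read -r username email; do
        [ -n "$username" ] || continue   # skip blank lines
        printf '%s <%s>\n' "$username" "$email"
    done
}
```

Usage would look like psql -X -At -F '|' -c "SELECT username, email FROM users;" | format_users, with the connection flags added as before. If values may contain the separator, a quoted format such as psql’s --csv output (available in newer psql versions) is likely a safer choice.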
When a database requires authentication, environment variables are often the most script-friendly way to supply credentials. For PostgreSQL, set PGPASSWORD="your_password" before running psql; for MySQL, the equivalent is MYSQL_PWD="your_password", although MySQL has deprecated that variable as of version 8.0. Be mindful that environment variables are inherited by child processes and, on some systems, can be read by other users, so they are a security risk if not managed carefully. A more robust solution for sensitive credentials is a dedicated secrets management tool or a database-specific mechanism such as a .pgpass file for PostgreSQL.
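As an illustration of the .pgpass approach, the sketch below appends one entry in the hostname:port:database:username:password format and restricts the file to its owner, since the client ignores a password file that group or others can read. The helper name add_pgpass_entry is hypothetical, and the sketch assumes none of the values contain a colon or backslash (those would need escaping).

```shell
#!/bin/bash
# Hypothetical helper: append one entry to a PostgreSQL password file.
# Line format: hostname:port:database:username:password
add_pgpass_entry() {
    local file="$1" host="$2" port="$3" db="$4" user="$5" password="$6"
    ( umask 077; touch "$file" )   # create with owner-only permissions if missing
    chmod 600 "$file"              # tighten permissions on a pre-existing file
    printf '%s:%s:%s:%s:%s\n' "$host" "$port" "$db" "$user" "$password" >> "$file"
}
```

For example, add_pgpass_entry "$HOME/.pgpass" localhost 5432 mydb myuser 'your_password' writes the entry to the default location; the PGPASSFILE environment variable can point psql at a different file. After that, psql can connect without PGPASSWORD being set.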
The real power comes when you combine this with bash’s text processing capabilities. You can pipe the output of a psql or mysql command into tools like grep, awk, or sed, or loop through the results for further actions. For instance, to process each username returned by the query:
#!/bin/bash
DB_USER="myuser"
DB_NAME="mydb"
DB_HOST="localhost"
DB_PORT="5432"
# Get only the usernames, one per line, and loop through them
# (-A: unaligned output, -t: tuples only, so there is no padding, header, or footer)
psql -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" -At -c "SELECT username FROM users WHERE active = TRUE;" | while IFS= read -r username; do
    echo "Processing user: $username"
    # Add more commands here to do something with each username
done
This pattern, where you extract data from the database and then iterate over it within your bash script, is incredibly common and powerful for automating administrative tasks.
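One caveat with the piped form is that bash runs the loop in a subshell, so variables set inside it are gone once the loop ends, and a failed query goes unnoticed. A possible variation captures the output first and then feeds the loop from a here-document. This is a sketch under the same placeholder connection settings; fetch_active_usernames and count_processed_users are hypothetical names.

```shell
#!/bin/bash
# Placeholder connection settings (assumptions; adjust for your environment).
DB_USER="myuser"
DB_NAME="mydb"
DB_HOST="localhost"
DB_PORT="5432"

# Hypothetical helper wrapping the query so it can be reused.
fetch_active_usernames() {
    psql -X -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" \
        -At -c "SELECT username FROM users WHERE active = TRUE;"
}

# Hypothetical helper: process each user and print how many were handled.
count_processed_users() {
    local users username processed=0
    # Capture the output first so a failed query can stop the function early.
    users=$(fetch_active_usernames) || return 1
    # The here-document keeps the loop in the current shell, so the counter
    # still holds its value after the loop.
    while IFS= read -r username; do
        [ -n "$username" ] || continue   # skip blank lines
        echo "Processing user: $username" >&2
        processed=$((processed + 1))
    done <<EOF
$users
EOF
    echo "$processed"
}
```

The trade-off is that the whole result set is held in a shell variable, which is fine for a list of usernames but not for very large results.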
A lesser-known feature is that most database client tools can execute an entire SQL script at once. With psql, you can pass the file with the -f option (psql -h host -U user -d db -f /path/to/your/script.sql) or redirect standard input (psql -h host -U user -d db < /path/to/your/script.sql); the mysql client accepts the same kind of input redirection. This is far more efficient for executing multiple statements than calling psql -c repeatedly in a loop, because it opens a single connection and runs every statement over it instead of reconnecting and re-authenticating for each one.
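When running a script file unattended, it usually helps to make psql stop at the first failing statement. The sketch below uses psql’s -f option together with the ON_ERROR_STOP variable; the connection settings are the same placeholders as before, and run_sql_file is a hypothetical helper name.

```shell
#!/bin/bash
# Placeholder connection settings (assumptions; adjust for your environment).
DB_USER="myuser"
DB_NAME="mydb"
DB_HOST="localhost"
DB_PORT="5432"

# Hypothetical helper: run every statement in a SQL file over one connection.
run_sql_file() {
    local sql_file="$1"
    # -X ignores ~/.psqlrc; ON_ERROR_STOP=1 makes psql abort at the first
    # failing statement instead of carrying on with the rest of the file.
    psql -X -v ON_ERROR_STOP=1 \
        -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" \
        -f "$sql_file"
}
```

A call such as run_sql_file /path/to/your/script.sql || echo "script failed" >&2 then reacts to the exit status; with ON_ERROR_STOP set, psql exits with status 3 when a statement in the script fails.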
Once you’ve mastered querying, the next logical step is to consider how to handle data insertion and updates from your bash scripts, and the associated challenges of error handling and transaction management.