Bash’s built-in [[ ... ]] conditional construct is where you’ll most often encounter regular expression pattern matching.

#!/bin/bash

filename="report_2023-10-27.txt"

if [[ "$filename" =~ ^report_[0-9]{4}-[0-9]{2}-[0-9]{2}\.txt$ ]]; then
  echo "Filename matches the expected pattern."
else
  echo "Filename does not match the expected pattern."
fi

This script checks whether the filename variable conforms to a specific date-stamped format. The =~ operator is the key here, signaling that the right-hand side is a regular expression. The pattern ^report_[0-9]{4}-[0-9]{2}-[0-9]{2}\.txt$ breaks down as follows: ^ anchors the match to the start of the string, report_ matches the literal string, [0-9]{4} matches exactly four digits (the year), each of the two -[0-9]{2} groups matches a hyphen followed by exactly two digits (the month, then the day), and \.txt matches the literal .txt (the backslash escapes the dot, which otherwise means "any character"). Finally, $ anchors the match to the end of the string, so the entire filename must match. Note that the pattern only checks the shape of the name: it does not verify that the digits form a real calendar date.
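To see what the anchors and the escaped dot contribute, here is a minimal sketch that runs the same pattern against a few invented filenames. The helper name is_report_name and the sample names are assumptions made for illustration; they are not part of the original script.

```bash
#!/bin/bash

# Hypothetical helper: succeeds (exit status 0) when the argument
# has the shape report_YYYY-MM-DD.txt.
is_report_name() {
  [[ "$1" =~ ^report_[0-9]{4}-[0-9]{2}-[0-9]{2}\.txt$ ]]
}

# Invented sample names, each probing one part of the pattern.
for name in \
  "report_2023-10-27.txt" \
  "report_2023-10-27.txt.bak" \
  "old_report_2023-10-27.txt" \
  "report_2023-10-27xtxt"
do
  if is_report_name "$name"; then
    echo "match:    $name"
  else
    echo "no match: $name"
  fi
done
```

Only the first name should match. The second fails because of the $ anchor, the third because of the ^ anchor, and the fourth because the escaped dot requires a literal period.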

Bash’s =~ operator uses POSIX Extended Regular Expressions (ERE), the same dialect as grep -E. Compared with the Basic Regular Expressions (BRE) that grep uses by default (grep -G), ERE lets you write + for one or more occurrences, ? for zero or one, and | for alternation without backslash escapes, directly within [[ ... ]].
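As a small illustration of those ERE operators, the sketch below validates a version-like string. The function name classify_version and the version format itself are assumptions chosen for the example.

```bash
#!/bin/bash

# Hypothetical helper: classify a version-like string using ERE operators.
#   v?                    optional leading "v"            (?)
#   [0-9]+                one or more digits              (+)
#   (\.[0-9]+)+           one or more ".N" components     (+ on a group)
#   (-(alpha|beta|rc))?   optional pre-release tag        (| and ?)
classify_version() {
  local re='^v?[0-9]+(\.[0-9]+)+(-(alpha|beta|rc))?$'
  if [[ "$1" =~ $re ]]; then
    echo "valid"
  else
    echo "invalid"
  fi
}

classify_version "v1.2.3"     # valid
classify_version "2.0-beta"   # valid
classify_version "v1"         # invalid: needs at least one ".N" component
classify_version "1.2-final"  # invalid: "final" is not an allowed tag
```

The pattern is kept in a single-quoted variable so the shell passes the backslashes and parentheses through untouched.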

Let’s look at another example, capturing parts of the matched string. The BASH_REMATCH array variable is populated when a match occurs.

#!/bin/bash

log_line="INFO: User 'alice' logged in from 192.168.1.100"
pattern="^([A-Z]+): User '([a-z]+)' logged in from ([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)$"

if [[ "$log_line" =~ $pattern ]]; then
  log_level="${BASH_REMATCH[1]}"
  username="${BASH_REMATCH[2]}"
  ip_address="${BASH_REMATCH[3]}"

  echo "Log Level: $log_level"
  echo "Username: $username"
  echo "IP Address: $ip_address"
else
  echo "Log line did not match pattern."
fi

Here, the parentheses () in the pattern define capturing groups. BASH_REMATCH[0] will contain the entire matched string. BASH_REMATCH[1] will contain the content of the first capturing group (the log level), BASH_REMATCH[2] the second (the username), and BASH_REMATCH[3] the third (the IP address). This is incredibly useful for parsing structured text data directly within your scripts without needing external tools.
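The difference between BASH_REMATCH[0] and the numbered groups is easiest to see with an unanchored pattern, where the overall match is only part of the input. The following is a sketch with an invented input line and variable names:

```bash
#!/bin/bash

# Invented sample input; the pattern is deliberately not anchored.
text="prefix WARN: disk at 91% suffix"
re='([A-Z]+): disk at ([0-9]+)%'

if [[ "$text" =~ $re ]]; then
  # Copy the values out right away: a later successful =~ replaces BASH_REMATCH.
  whole="${BASH_REMATCH[0]}"    # the text matched by the whole pattern
  level="${BASH_REMATCH[1]}"    # first capturing group
  percent="${BASH_REMATCH[2]}"  # second capturing group

  echo "Whole match: $whole"
  echo "Level:       $level"
  echo "Percent:     $percent"
fi
```

Here the whole match is "WARN: disk at 91%", without the surrounding "prefix" and "suffix" text, while the two groups hold "WARN" and "91".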

The key to effective regex in Bash is understanding that a pattern is only interpreted as a regular expression when it appears on the right side of =~ within [[ ... ]]. The standard [ ... ] and test commands have no regex operator. There the pattern is just an ordinary shell word, so it is compared literally and your special characters will not have their intended meaning.

For example, this will likely not work as expected:

# This is incorrect for regex matching
if [ "$filename" == ^report_[0-9]{4}-[0-9]{2}-[0-9]{2}\.txt$ ]; then
  echo "This probably won't print."
fi

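Inside [ ... ], the == operator (a Bash extension; the portable spelling is =) is a plain string equality test, so the filename is compared against the literal text of the pattern and the test fails. Because the pattern is unquoted, the shell may also attempt pathname expansion on it before the comparison happens. Glob-style pattern matching with == is only available inside [[ ... ]], and while globs can be powerful for filenames, they lack the expressiveness of regex for complex text parsing.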
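The three kinds of comparison can be put side by side. This is a sketch reusing the filename from the earlier example; the result variable names are invented for illustration.

```bash
#!/bin/bash

filename="report_2023-10-27.txt"

# 1. [ ... ] with = : plain string equality. The right-hand side is quoted
#    so the shell cannot expand the * against files in the current directory.
if [ "$filename" = "report_*.txt" ]; then
  string_test="equal"
else
  string_test="not equal"
fi

# 2. [[ ... ]] with == : glob-style pattern matching (unquoted right-hand side).
if [[ "$filename" == report_*.txt ]]; then
  glob_test="match"
else
  glob_test="no match"
fi

# 3. [[ ... ]] with =~ : extended regular expression matching.
if [[ "$filename" =~ ^report_.*\.txt$ ]]; then
  regex_test="match"
else
  regex_test="no match"
fi

echo "[ = ]     string equality: $string_test"
echo "[[ == ]]  glob:            $glob_test"
echo "[[ =~ ]]  regex:           $regex_test"
```

The string test reports "not equal" because the filename is not literally report_*.txt, while the glob and regex tests both report a match.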

When dealing with complex or dynamic regex patterns, it’s usually best to store them in a variable first, as shown in the log_line example. This improves readability and sidesteps shell-quoting problems with characters such as spaces and parentheses. Expand that variable unquoted on the right-hand side of =~ (write $pattern, not "$pattern"): since Bash 3.2, any quoted part of the pattern is matched as a literal string rather than as a regular expression. Inside [[ ... ]], Bash performs neither word splitting nor pathname expansion on variable expansions, so quoting the left-hand side is not strictly required, but it is harmless and a good habit to keep.
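The effect of quoting the pattern on the right-hand side can be checked with a short experiment. This sketch assumes Bash 3.2 or later with default settings (in particular, without the compat31 shell option); the variable names are invented.

```bash
#!/bin/bash

re='^a.c$'    # as a regex: "a", any single character, "c"
input="abc"

# Unquoted expansion: the contents of $re are used as a regular expression.
if [[ "$input" =~ $re ]]; then
  unquoted_result="match"
else
  unquoted_result="no match"
fi

# Quoted expansion: the contents of $re are matched as literal text,
# so "abc" would have to contain the characters ^a.c$ verbatim.
if [[ "$input" =~ "$re" ]]; then
  quoted_result="match"
else
  quoted_result="no match"
fi

echo "unquoted \$re:  $unquoted_result"
echo "quoted \"\$re\": $quoted_result"
```

With these inputs the unquoted form matches and the quoted form does not, which is why a pattern variable is normally left unquoted after =~.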

A common surprise for people coming from Perl or Python is that Bash’s regex implementation doesn’t support lookarounds (positive or negative lookahead/lookbehind assertions), because POSIX ERE does not define them. This means you can’t directly match a pattern only if it’s followed by or preceded by another pattern without consuming those characters. You’ll often have to resort to capturing groups and then filtering or re-matching if you need to express such conditions.
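Two common workarounds can be sketched as follows. The sample text, the variable names, and the helper is_kept_txt are all invented for illustration, and the Perl-style patterns in the comments are only there to show what is being emulated.

```bash
#!/bin/bash

# Emulating a positive lookahead such as [0-9]+(?= ms):
# match the trailing context too, but capture only the part you want.
text="timeout after 250 ms, retries 3"
re='([0-9]+) ms'
if [[ "$text" =~ $re ]]; then
  duration="${BASH_REMATCH[1]}"   # the digits, without the " ms" suffix
  echo "Duration: $duration"
fi

# Emulating a negative lookahead such as ^(?!tmp_).*\.txt$:
# combine one positive test with one negated test.
is_kept_txt() {
  local ends_txt='\.txt$'
  local starts_tmp='^tmp_'
  [[ "$1" =~ $ends_txt ]] && ! [[ "$1" =~ $starts_tmp ]]
}

for f in "notes.txt" "tmp_notes.txt" "notes.md"; do
  if is_kept_txt "$f"; then
    echo "keep: $f"
  else
    echo "skip: $f"
  fi
done
```

The first part extracts 250 and ignores the 3, which is not followed by " ms". The second part keeps only notes.txt. Splitting a condition into two tests works cleanly here because both patterns are anchored; unanchored conditions may need more care.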

The next step in mastering Bash text processing is understanding how to combine these pattern-matching capabilities with other shell tools like sed and awk for more advanced transformations and data manipulation.
