MySQL’s generated columns can be indexed, which can significantly speed up queries that filter on computed values.
Let’s say you have a table products and you want to store the full product name, which is a concatenation of brand and model.
CREATE TABLE products (
    id INT AUTO_INCREMENT PRIMARY KEY,
    brand VARCHAR(50),
    model VARCHAR(50),
    full_name VARCHAR(101) AS (CONCAT(brand, ' ', model)) STORED
);
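Here, full_name is a generated column. The STORED keyword means the value is computed when the row is inserted or updated and physically stored in the table, so it can be indexed like any regular column. A VIRTUAL column (the default when neither keyword is given) is computed when read and takes no space in the row. InnoDB has supported secondary indexes on VIRTUAL columns since MySQL 5.7, but a VIRTUAL column cannot be part of the primary key.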
Now, imagine you frequently search for products by their full_name. Without an index, MySQL would have to scan the entire table.
SELECT * FROM products WHERE full_name = 'Acme Widget';
This is where indexing the generated column shines.
CREATE INDEX idx_full_name ON products (full_name);
With this index, MySQL can now use it to quickly locate rows where full_name matches 'Acme Widget', drastically reducing query time on large tables.
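One way to check whether the index is actually used is EXPLAIN. This is a minimal sketch against the products table above; the exact plan depends on the MySQL version, table size, and statistics, so treat the expectation in the comment as an assumption rather than guaranteed output.

```sql
-- Ask the optimizer how it would execute the lookup.
-- If the index is chosen, the "key" column of the plan should name idx_full_name
-- instead of showing a full table scan (type = ALL).
EXPLAIN
SELECT * FROM products WHERE full_name = 'Acme Widget';
```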
The magic behind STORED generated columns is that they behave like regular columns for indexing purposes. MySQL treats the pre-computed, stored value just like any other data in the table. When you insert or update a row, MySQL automatically calculates and stores the full_name. When you query, it can jump directly to the relevant index entry.
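The automatic maintenance described above can be sketched as follows, assuming the products table starts out empty. Note that a generated column cannot be assigned a value directly; only the base columns are written.

```sql
-- Only the base columns are supplied; full_name is computed by MySQL.
INSERT INTO products (brand, model) VALUES ('Acme', 'Widget');

SELECT id, full_name FROM products;   -- full_name is 'Acme Widget'

-- Updating a base column recomputes the generated value and its index entry.
UPDATE products SET model = 'Gadget' WHERE brand = 'Acme';

SELECT id, full_name FROM products;   -- full_name is now 'Acme Gadget'

-- Expected to fail: an explicit value for a generated column is rejected.
INSERT INTO products (brand, model, full_name)
VALUES ('Acme', 'Widget', 'Something else');
```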
Consider a more complex scenario: calculating a discount price.
CREATE TABLE orders (
    order_id INT AUTO_INCREMENT PRIMARY KEY,
    item_price DECIMAL(10, 2),
    discount_percentage DECIMAL(5, 2) DEFAULT 0.00,
    final_price DECIMAL(10, 2) AS (item_price * (1 - discount_percentage / 100)) STORED
);
CREATE INDEX idx_final_price ON orders (final_price);
Now, if you want to find all orders with a final_price below a certain threshold, you can use the index.
SELECT * FROM orders WHERE final_price < 50.00;
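To make the arithmetic concrete, here is a small worked example with illustrative sample rows (the values are made up for this sketch and assume the orders table is otherwise empty).

```sql
INSERT INTO orders (item_price, discount_percentage) VALUES
    (80.00, 25.00),   -- 80.00 * (1 - 0.25) = 60.00
    (60.00, 20.00),   -- 60.00 * (1 - 0.20) = 48.00
    (45.00,  0.00);   -- no discount, final_price stays 45.00

-- Matches the rows whose final_price is 48.00 and 45.00; the 60.00 row is excluded.
SELECT order_id, item_price, discount_percentage, final_price
FROM orders
WHERE final_price < 50.00;
```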
The STORED attribute keeps the computed value in the row itself. For VIRTUAL columns, the calculation happens at read time unless the column is indexed. InnoDB (MySQL 5.7+) supports secondary indexes on VIRTUAL columns; the computed values are then materialized in the index records, so writes still pay for the computation and the index maintenance. STORED columns are indexed exactly like regular columns, at the cost of extra row storage and the computation on every write.
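For comparison, a minimal sketch of the VIRTUAL variant on InnoDB (MySQL 5.7 or later). The table name orders_virtual and index name idx_final_price_v are hypothetical, chosen only so the example does not collide with the orders table above.

```sql
-- Same shape as orders, but final_price is not stored in the row.
CREATE TABLE orders_virtual (
    order_id INT AUTO_INCREMENT PRIMARY KEY,
    item_price DECIMAL(10, 2),
    discount_percentage DECIMAL(5, 2) DEFAULT 0.00,
    final_price DECIMAL(10, 2) AS (item_price * (1 - discount_percentage / 100)) VIRTUAL
) ENGINE=InnoDB;

-- A secondary index on the virtual column; the computed values live in the index.
CREATE INDEX idx_final_price_v ON orders_virtual (final_price);

SELECT order_id FROM orders_virtual WHERE final_price < 50.00;
```

Whether this or the STORED version performs better depends on the read/write mix, so it is worth measuring both on representative data.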
The choice between STORED and VIRTUAL depends on your workload. If the generated column is read frequently, STORED avoids recomputing the value on each read. If it’s rarely read or only used for occasional checks, VIRTUAL saves row storage. Both can carry a secondary index on InnoDB; STORED is the safer choice when the column must be part of the primary key or when the table uses a storage engine other than InnoDB, which may not support indexes on VIRTUAL columns.
The computed value in a generated column can be a simple concatenation, a mathematical operation, or even call certain built-in functions. The important part for indexing is that the result is a deterministic value that can be stored.
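The determinism requirement can be illustrated with a hypothetical table (customers_demo and its columns are invented for this sketch). Deterministic built-ins such as LOWER() and YEAR() are accepted, while a nondeterministic function such as NOW() is expected to be rejected.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE customers_demo (
    id INT AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(255),
    signup_at DATETIME,
    -- Deterministic built-in functions are allowed in the expression.
    email_lower VARCHAR(255) AS (LOWER(email)) STORED,
    signup_year SMALLINT AS (YEAR(signup_at)) STORED
);

CREATE INDEX idx_email_lower ON customers_demo (email_lower);

-- Expected to fail: NOW() returns a different value on each call,
-- so MySQL does not allow it in a generated column expression.
ALTER TABLE customers_demo
    ADD COLUMN age_days INT AS (DATEDIFF(NOW(), signup_at)) STORED;
```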
A common pitfall is omitting the STORED keyword when you need a physically stored value. If neither keyword is given, MySQL creates the column as VIRTUAL. On InnoDB an ordinary secondary index on that column still works, but the column cannot be part of the primary key, and index creation may fail on other storage engines.
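To check how a column was actually defined, one option is to query information_schema. This sketch assumes the products table lives in the currently selected database.

```sql
-- EXTRA contains 'STORED GENERATED' or 'VIRTUAL GENERATED' for generated columns;
-- GENERATION_EXPRESSION holds the expression and is empty for regular columns.
SELECT COLUMN_NAME, EXTRA, GENERATION_EXPRESSION
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'products'
  AND GENERATION_EXPRESSION <> '';
```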
You can also create multi-column indexes that combine generated columns with regular columns.
CREATE TABLE user_activity (
    user_id INT,
    activity_timestamp DATETIME,
    log_message VARCHAR(255),
    activity_date DATE AS (DATE(activity_timestamp)) STORED
);
CREATE INDEX idx_user_date_msg ON user_activity (user_id, activity_date, log_message(50));
This allows for efficient queries filtering by user and then by the extracted date, and even by a prefix of the log message.
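Queries that benefit from this index filter on its leading columns from left to right. The user ID, date, and message prefix below are placeholder values for illustration.

```sql
-- Uses the first two index columns: user_id, then activity_date.
SELECT activity_timestamp, log_message
FROM user_activity
WHERE user_id = 42
  AND activity_date = '2024-01-15';

-- Adds a prefix match on log_message, which the 50-character prefix
-- in the index can help narrow down.
SELECT activity_timestamp, log_message
FROM user_activity
WHERE user_id = 42
  AND activity_date = '2024-01-15'
  AND log_message LIKE 'login failed%';
```

A query that filters on activity_date alone, without user_id, would generally not be able to use this index efficiently, because user_id is the leading column.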
The next step is understanding how generated columns can interact with foreign keys or be part of unique constraints.