MySQL’s query optimizer is usually pretty smart, but sometimes it gets it wrong and picks a terrible execution plan. Index hints are your way of telling it, "Hey, I know better, use this index."
Let’s see it in action. Imagine we have a users table:
CREATE TABLE users (
id INT AUTO_INCREMENT PRIMARY KEY,
username VARCHAR(50) NOT NULL,
email VARCHAR(100) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_username ON users (username);
CREATE INDEX idx_email ON users (email);
Now, suppose we have a query that’s performing poorly:
SELECT * FROM users WHERE username = 'alice';
If the optimizer decides not to use idx_username, for example because stale statistics make a full table scan look cheaper, the query will be slow. (It cannot use idx_email here, because the query does not filter on email.) We know idx_username is the right choice.
We can steer it with an index hint:
SELECT * FROM users
USE INDEX (idx_username)
WHERE username = 'alice';
Or, if we want to go further and say "use this index, and treat a full table scan as a last resort":
SELECT * FROM users
FORCE INDEX (idx_username)
WHERE username = 'alice';
USE INDEX restricts the optimizer to the listed indexes when it looks for rows, but it can still decide that a full table scan is cheaper and skip them. FORCE INDEX works the same way, except that a table scan is treated as very expensive, so MySQL falls back to one only when it has no way to use a listed index. This is a much stronger directive.
The actual execution plan can be seen using EXPLAIN:
EXPLAIN SELECT * FROM users WHERE username = 'alice';
Without hints, the key column in the EXPLAIN output might show NULL, which means a full table scan. With FORCE INDEX (idx_username), the key column should show idx_username. With USE INDEX (idx_username) it usually will, though a table scan is still allowed.
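To compare the two plans side by side, something like the following can be run against the users table above. It is only a sketch: the plan you get depends on your data and statistics, so no output is shown here.

```sql
-- Plan the optimizer picks on its own
EXPLAIN SELECT * FROM users WHERE username = 'alice';

-- Plan with the hint applied; compare the key and rows columns
EXPLAIN SELECT * FROM users
FORCE INDEX (idx_username)
WHERE username = 'alice';
```

If both statements report the same key, the hint is not changing anything and can be dropped.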
There are a few types of hints:
USE INDEX: Restricts the optimizer to the listed indexes; a table scan is still allowed.
FORCE INDEX: Same as USE INDEX, but a table scan is used only as a last resort.
IGNORE INDEX: Tells the optimizer not to use the listed indexes.
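A hint can name several indexes as a comma-separated list, and a composite index is hinted by its name just like any other. For example, if you have a composite index on (category_id, price) and a query filters on those columns, you might want to hint idx_category_price.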
CREATE TABLE products (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100),
category_id INT,
price DECIMAL(10, 2)
);
CREATE INDEX idx_category_price ON products (category_id, price);
CREATE INDEX idx_name ON products (name);
-- This query might be slow if optimizer doesn't pick idx_category_price
SELECT * FROM products WHERE category_id = 5 AND price > 100.00;
-- Forcing the composite index
SELECT * FROM products
FORCE INDEX (idx_category_price)
WHERE category_id = 5 AND price > 100.00;
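A hint can also list more than one index, in which case the optimizer picks among the listed ones. A minimal sketch against the products table above (the name filter is made up for illustration):

```sql
-- Restrict the optimizer to these two indexes; it chooses between them,
-- or falls back to a table scan, since this is USE INDEX rather than FORCE INDEX
SELECT * FROM products
USE INDEX (idx_category_price, idx_name)
WHERE category_id = 5 AND name LIKE 'Widget%';
```

Listing several candidates is less brittle than pinning one index, because the optimizer can still adapt if the data distribution shifts.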
The optimizer’s cost model is based on statistics. If those statistics are stale or inaccurate, it can lead to bad choices. For instance, if cardinality estimates are off, it might pick an index that appears more selective than it actually is for a given query. An index hint overrides the choice that comes out of that flawed estimate; it does not fix the estimate itself.
Consider a scenario with a compound index on (user_id, order_date). If you query for orders within a specific date range for a single user, the optimizer should ideally use this index. However, if it sees a very large number of orders for that user_id and thinks scanning the index is more expensive than a full table scan (perhaps due to incorrect cardinality estimates for the date range), it might opt for the wrong plan.
CREATE TABLE orders (
order_id INT AUTO_INCREMENT PRIMARY KEY,
user_id INT,
order_date DATE,
total_amount DECIMAL(10, 2)
);
CREATE INDEX idx_user_date ON orders (user_id, order_date);
CREATE INDEX idx_user ON orders (user_id); -- Less specific
-- Suppose this query is slow, optimizer chose idx_user or a table scan
SELECT * FROM orders
WHERE user_id = 12345 AND order_date BETWEEN '2023-01-01' AND '2023-12-31';
-- Force the compound index
SELECT * FROM orders
FORCE INDEX (idx_user_date)
WHERE user_id = 12345 AND order_date BETWEEN '2023-01-01' AND '2023-12-31';
When you use FORCE INDEX, you are essentially telling MySQL, "Trust me, I know that idx_user_date is the most efficient way to satisfy this WHERE clause, even if your internal calculations suggest otherwise." This is powerful when you’ve done your own performance testing and profiling and identified a clear winner that the optimizer isn’t picking.
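One way to do that testing, assuming MySQL 8.0.18 or later where EXPLAIN ANALYZE is available, is to run both versions and compare the measured times. This is a sketch; EXPLAIN ANALYZE actually executes the query, so run it somewhere that can take the load.

```sql
-- Measured plan without a hint
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 12345 AND order_date BETWEEN '2023-01-01' AND '2023-12-31';

-- Measured plan with the compound index forced
EXPLAIN ANALYZE
SELECT * FROM orders
FORCE INDEX (idx_user_date)
WHERE user_id = 12345 AND order_date BETWEEN '2023-01-01' AND '2023-12-31';
```

Run each statement more than once so that a cold buffer pool does not skew the comparison.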
The IGNORE INDEX hint is useful when you have multiple indexes that could be used, but you know one of them is detrimental to performance for a specific query. For example, if you have an index on (colA) and a composite index on (colA, colB), and a query filters only on colA, you may not want it to use the (colA, colB) index (perhaps because its entries are wider, so reading it touches more pages than the single-column index). In that case you can ignore it:
SELECT * FROM my_table
IGNORE INDEX (idx_colA_colB)
WHERE colA = 'some_value';
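The same idea applied to the orders table from earlier, as a sketch: if the optimizer keeps picking the single-column idx_user, ignoring it leaves idx_user_date as a candidate without pinning the query to one index.

```sql
-- Rule out idx_user only; the optimizer is still free to choose
-- among the remaining indexes or a table scan
SELECT * FROM orders
IGNORE INDEX (idx_user)
WHERE user_id = 12345 AND order_date BETWEEN '2023-01-01' AND '2023-12-31';
```

Because IGNORE INDEX removes one option rather than dictating the plan, it tends to age better than FORCE INDEX when new indexes are added later.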
The most common reason for needing index hints is when the query optimizer’s statistics are outdated or inaccurate. Running ANALYZE TABLE my_table; can sometimes fix this by updating the statistics, making the optimizer more likely to choose the correct index without hints. However, if statistics are inherently difficult to maintain or the cost model has limitations for complex queries, hints become a necessary fallback.
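Before reaching for a hint, it is worth checking whether refreshed statistics fix the plan on their own. A minimal sketch of that check, using the orders table from earlier in place of my_table:

```sql
-- Inspect the current cardinality estimates for each index
SHOW INDEX FROM orders;

-- Refresh the statistics
ANALYZE TABLE orders;

-- Check whether the unhinted query now picks the expected index
EXPLAIN SELECT * FROM orders
WHERE user_id = 12345 AND order_date BETWEEN '2023-01-01' AND '2023-12-31';
```

If the key column now shows idx_user_date, the hint is unnecessary for this query.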
One subtle point is how hints interact with JOIN clauses. You can specify hints for each table involved in a join independently.
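SELECT
o.order_id,
u.username
FROM
orders o USE INDEX (idx_user_date)
JOIN
users u FORCE INDEX (idx_username) ON o.user_id = u.id
WHERE
u.username = 'alice'
AND o.order_date = '2023-10-26';
This tells MySQL to find the user through idx_username and then look up that user’s orders through idx_user_date. Each hint applies only to its own table. Keep in mind that a hint also rules out the primary key unless you list it: with FORCE INDEX (idx_username) on users, MySQL cannot use PRIMARY to resolve u.id, which is why the query filters on username.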
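To confirm that per-table hints took effect, EXPLAIN the join and read one row per table. The sketch below uses a join that filters on both username and order_date; the filter values are illustrative.

```sql
-- Each table gets its own row in the output; check the key column of each
EXPLAIN
SELECT o.order_id, u.username
FROM orders o USE INDEX (idx_user_date)
JOIN users u FORCE INDEX (idx_username) ON o.user_id = u.id
WHERE u.username = 'alice'
  AND o.order_date = '2023-10-26';
```

The order of the rows also shows which table MySQL reads first, which helps when a hint is honored but the join order is still not what you expected.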
The next thing you’ll likely run into is how to handle situations where the optimizer still gets it wrong even with hints, or how to manage hints across many queries and schema changes without creating a maintenance nightmare.