⚡ Lesson 14: Indexes & Query Optimization
Your queries work — but are they fast? When a table has 10 rows, everything feels instant. When it has 10 million, a missing index can turn a sub-second lookup into a 30-second crawl. Indexes are the single most important performance tool in your MySQL toolkit. In this lesson, you'll learn how they work, when to create them, and how to use EXPLAIN to diagnose slow queries like a pro.
🎯 Learning Objectives
By the end of this lesson, you will be able to:
- Explain what an index is and why it matters for performance
- Create single-column and composite (multi-column) indexes
- Use
EXPLAINto analyze how MySQL executes a query - Identify when indexes help and when they hurt
- Apply best practices for index design
- Recognize and fix common performance pitfalls
Estimated Time: 45 minutes
Prerequisites: JOINs (Lesson 13)
📑 In This Lesson
The Problem: Full Table Scans
Without an index, MySQL has one strategy for finding data: read every single row in the table and check if it matches your condition. This is called a full table scan.
-- Find a customer by email
SELECT * FROM customers WHERE email = 'alice@example.com';
With 10 rows, this is fine — MySQL checks all 10 and responds instantly. But with 5 million rows, MySQL reads all 5 million to find one match. That's like reading every page of an encyclopedia to find one topic instead of using the index in the back.
What Is an Index?
A database index is a separate data structure (usually a B-tree) that stores a sorted copy of one or more columns along with pointers back to the full rows. Think of it like the index at the back of a textbook — instead of flipping through every page, you look up "JOINs" in the index, find "page 247," and go straight there.
How It Speeds Things Up
Instead of scanning every row, MySQL navigates the B-tree in O(log n) time. For a table with 1 million rows, a full scan checks 1,000,000 rows. A B-tree lookup checks roughly 20. That's the difference between waiting 5 seconds and waiting 5 milliseconds.
💡 Primary Keys Are Already Indexed
Every table's PRIMARY KEY is automatically indexed. That's why SELECT * FROM customers WHERE id = 42 is always fast — MySQL uses the primary key index (called the clustered index) to jump directly to row 42. You never need to create an index on a primary key.
The Trade-Off
Indexes aren't free. Every index:
- Uses disk space — the sorted copy takes up storage
- Slows down writes — every INSERT, UPDATE, or DELETE must also update each relevant index
- Needs maintenance — MySQL keeps indexes in sync automatically, but more indexes = more work per write
The goal is to index columns you search on frequently without indexing everything blindly.
Creating Indexes
Syntax
-- Create an index on a single column
CREATE INDEX index_name ON table_name (column_name);
-- Example: index the email column on customers
CREATE INDEX idx_customers_email ON customers (email);
Naming Convention
A widely used pattern is idx_tablename_columnname:
idx_customers_emailidx_orders_customer_ididx_books_author_id
The name is for your reference — MySQL doesn't care what you call it, but consistent naming makes maintenance much easier.
Index on a Foreign Key
Foreign key columns are prime candidates for indexes because you JOIN and filter on them constantly:
-- Index the foreign key columns we use in JOINs
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
CREATE INDEX idx_books_author_id ON books (author_id);
CREATE INDEX idx_order_items_order_id ON order_items (order_id);
CREATE INDEX idx_order_items_book_id ON order_items (book_id);
💡 MySQL May Auto-Index Foreign Keys
When you define a FOREIGN KEY constraint, MySQL (InnoDB) automatically creates an index on the foreign key column if one doesn't already exist. So you may already have these indexes! Use SHOW INDEX FROM table_name to check.
Creating an Index During Table Creation
You can also define indexes inside a CREATE TABLE statement:
CREATE TABLE customers (
id INT AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) NOT NULL,
city VARCHAR(100),
INDEX idx_customers_email (email),
INDEX idx_customers_city (city)
);
Composite (Multi-Column) Indexes
A composite index covers two or more columns. It's useful when you frequently filter or sort by a specific combination of columns.
-- Index on last_name + first_name (together)
CREATE INDEX idx_customers_name ON customers (last_name, first_name);
Column Order Matters — The "Left Prefix" Rule
A composite index on (last_name, first_name) can satisfy queries that filter on:
| Query Filters On | Uses Index? | Why? |
|---|---|---|
WHERE last_name = 'Smith' |
✅ Yes | Matches the leftmost column |
WHERE last_name = 'Smith' AND first_name = 'Bob' |
✅ Yes | Matches both columns (full index) |
WHERE first_name = 'Bob' |
❌ No | Skips the leftmost column — can't use index |
Think of it like a phone book sorted by last name, then first name. You can quickly find all the "Smiths" (left column). You can quickly find "Bob Smith" (both columns). But you can't efficiently find all the "Bobs" without scanning every entry — the book isn't sorted by first name alone.
⚠️ The Left Prefix Rule
A composite index on (A, B, C) can serve queries filtering on: A, (A, B), or (A, B, C). It cannot serve queries that skip A entirely (like filtering only on B or C). Always put the most-filtered column first.
Real-World Example
-- If you frequently run queries like:
SELECT * FROM orders
WHERE customer_id = 5 AND order_date >= '2026-01-01';
-- Create a composite index matching those columns:
CREATE INDEX idx_orders_customer_date
ON orders (customer_id, order_date);
This index helps MySQL quickly find all orders for customer 5, then narrow down by date — without scanning the whole table.
Unique Indexes
A unique index enforces uniqueness on a column (or combination of columns) while also providing the performance benefits of a regular index. It's essentially a constraint and an index in one.
-- Ensure no two customers share the same email
CREATE UNIQUE INDEX idx_customers_email_unique
ON customers (email);
-- Alternative: defined inside CREATE TABLE
CREATE TABLE customers (
id INT AUTO_INCREMENT PRIMARY KEY,
email VARCHAR(100) NOT NULL,
UNIQUE INDEX idx_customers_email_unique (email)
);
If you try to INSERT a duplicate email, MySQL rejects it with an error — just like a UNIQUE constraint. The difference is that you're also getting index performance for lookups on that column.
💡 UNIQUE Constraint vs UNIQUE INDEX
In MySQL, UNIQUE constraints are implemented as unique indexes internally. Writing UNIQUE (email) or UNIQUE INDEX idx_email (email) in a CREATE TABLE statement produces the same result. They're interchangeable.
EXPLAIN — Your Query Microscope
How do you know if MySQL is actually using your index? Put EXPLAIN in front of any SELECT statement and MySQL shows you its execution plan — exactly how it intends to find your data.
EXPLAIN SELECT * FROM customers WHERE email = 'alice@example.com';
Example Output (simplified):
+----+------+----------------------+----------------------+---------+-------+------+-------+
| id | type | possible_keys | key | key_len | ref | rows | Extra |
+----+------+----------------------+----------------------+---------+-------+------+-------+
| 1 | ref | idx_customers_email | idx_customers_email | 402 | const | 1 | NULL |
+----+------+----------------------+----------------------+---------+-------+------+-------+
Key Columns in EXPLAIN Output
| Column | What It Tells You | What to Look For |
|---|---|---|
type |
How MySQL accesses the table | const, ref, range = good. ALL = full table scan (bad) |
possible_keys |
Which indexes MySQL could use | Should list relevant indexes. NULL = no usable indexes |
key |
Which index MySQL actually chose | Should show your index name. NULL = no index used |
rows |
Estimated number of rows MySQL will examine | Lower is better. Compare to total table rows |
Extra |
Additional info about the strategy | Using index = great. Using filesort = extra work for sorting |
The type Column — From Best to Worst
| Type | Meaning | Speed |
|---|---|---|
system / const |
At most one matching row (primary key or unique lookup) | ⚡ Fastest |
eq_ref |
One row per join (unique index in a JOIN) | ⚡ Excellent |
ref |
Multiple matching rows via non-unique index | ✅ Good |
range |
Index range scan (BETWEEN, <, >, IN) | ✅ Good |
index |
Full index scan (reads entire index, not table) | ⚠️ Okay |
ALL |
Full table scan — reads every row | 🐌 Slowest |
Before and After: Seeing the Impact
-- BEFORE index: full table scan
EXPLAIN SELECT * FROM customers WHERE city = 'Las Vegas';
-- type: ALL, rows: 50000 (checks every row)
-- Create the index
CREATE INDEX idx_customers_city ON customers (city);
-- AFTER index: index lookup
EXPLAIN SELECT * FROM customers WHERE city = 'Las Vegas';
-- type: ref, key: idx_customers_city, rows: 342 (checks only matches)
✅ The EXPLAIN Habit
Get in the habit of running EXPLAIN on any query you think might be slow. Look for: type: ALL (full table scan), key: NULL (no index used), or a high rows count. These are your optimization targets.
When to Index (and When Not To)
✅ DO Create Indexes On:
| Scenario | Why |
|---|---|
Columns in WHERE clauses |
These are your filter conditions — the most common use |
| Foreign key columns | Used in JOINs constantly (MySQL may auto-create these) |
Columns in ORDER BY |
Avoids expensive "filesort" operations |
Columns in GROUP BY |
Speeds up grouping and aggregation |
| Columns used in JOIN conditions | Helps MySQL match rows between tables |
| High-cardinality columns (many unique values) | Email, username, phone — index narrows the search dramatically |
❌ DON'T Create Indexes On:
| Scenario | Why Not |
|---|---|
| Tiny tables (under ~1,000 rows) | Full scan is already fast — index overhead isn't worth it |
| Low-cardinality columns (few unique values) | A boolean is_active column with only TRUE/FALSE — index can't narrow much |
| Columns you rarely filter or sort on | No queries use them = wasted space and write overhead |
| Tables with heavy INSERT/UPDATE/DELETE | Every write updates every index — too many indexes slow writes |
| Columns with lots of NULLs | Indexes on mostly-NULL columns provide little benefit |
💡 The Sweet Spot
Most tables do well with indexes on: the primary key (automatic), all foreign keys (often automatic), and 1–3 additional columns that appear in your most frequent WHERE, ORDER BY, or JOIN conditions. Start there and add more only when EXPLAIN shows a problem.
Index Killers — What Prevents Index Use
You created an index, but MySQL isn't using it? These common patterns prevent MySQL from using indexes:
1. Functions on Indexed Columns
-- ❌ BAD: Function wraps the indexed column — index can't be used
SELECT * FROM orders WHERE YEAR(order_date) = 2026;
-- ✅ GOOD: Rewrite as a range — index works
SELECT * FROM orders
WHERE order_date >= '2026-01-01' AND order_date < '2027-01-01';
2. Leading Wildcards in LIKE
-- ❌ BAD: Leading wildcard — full scan required
SELECT * FROM customers WHERE email LIKE '%@gmail.com';
-- ✅ GOOD: Trailing wildcard — index works
SELECT * FROM customers WHERE email LIKE 'alice%';
Leading % means "starts with anything" — MySQL can't use the sorted index because it doesn't know where to start looking. Trailing % is fine because the index is sorted by the beginning of the string.
3. Implicit Type Conversion
-- ❌ BAD: phone is VARCHAR, but you're comparing to an integer
SELECT * FROM customers WHERE phone = 5551234;
-- ✅ GOOD: Compare with matching type
SELECT * FROM customers WHERE phone = '5551234';
When MySQL has to convert types, it may wrap the column in a conversion function internally, breaking index usage.
4. OR Conditions on Different Columns
-- ❌ Potentially slow: OR across different columns
SELECT * FROM customers WHERE city = 'Las Vegas' OR last_name = 'Smith';
-- ✅ Better: Use UNION if both columns are indexed
SELECT * FROM customers WHERE city = 'Las Vegas'
UNION
SELECT * FROM customers WHERE last_name = 'Smith';
5. Negation
-- ❌ Often can't use index:
SELECT * FROM customers WHERE city != 'Las Vegas';
SELECT * FROM customers WHERE city NOT IN ('Las Vegas', 'Reno');
-- MySQL must scan to find all the NON-matches
⚠️ When in Doubt, EXPLAIN
Not sure if your query uses an index? Don't guess — run EXPLAIN and check the key and type columns. That's the definitive answer.
Managing Indexes
View Indexes on a Table
-- See all indexes on a table
SHOW INDEX FROM customers;
-- Shorter alternative
SHOW INDEXES FROM orders;
Drop an Index
-- Remove an index you no longer need
DROP INDEX idx_customers_city ON customers;
-- Alternative syntax
ALTER TABLE customers DROP INDEX idx_customers_city;
Rename / Rebuild
MySQL doesn't have a RENAME INDEX command (prior to 5.7). In older versions, drop and recreate. In 5.7+, you can use:
-- MySQL 5.7+
ALTER TABLE customers RENAME INDEX idx_old_name TO idx_new_name;
💡 Check Before You Create
Before creating a new index, run SHOW INDEX FROM table_name to see what already exists. Duplicate indexes waste space and slow down writes for no benefit.
General Query Optimization Tips
Indexes are the biggest lever, but these habits also help:
1. Select Only What You Need
-- ❌ BAD: Fetches all columns (even ones you don't use)
SELECT * FROM orders;
-- ✅ GOOD: Fetch only what you need
SELECT id, order_date, customer_id FROM orders;
Less data to read, transfer, and process. This matters at scale.
2. Use LIMIT for Large Results
-- Don't pull 100,000 rows if you need the top 10
SELECT title, price FROM books ORDER BY price DESC LIMIT 10;
3. Avoid SELECT DISTINCT When Unnecessary
-- DISTINCT forces MySQL to sort/deduplicate — expensive on big tables
-- Only use it when you truly have duplicates to remove
SELECT DISTINCT city FROM customers; -- Fine: you want unique cities
SELECT DISTINCT * FROM orders; -- Usually a design smell
4. Use EXISTS Instead of IN for Subqueries
-- ❌ IN can be slow with large subquery results
SELECT * FROM customers
WHERE id IN (SELECT customer_id FROM orders);
-- ✅ EXISTS short-circuits — stops as soon as it finds a match
SELECT * FROM customers c
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);
5. Keep JOINs Efficient
-- Ensure JOIN columns are indexed and use matching types
-- Index on orders.customer_id helps this JOIN enormously
SELECT c.first_name, o.order_date
FROM customers c
INNER JOIN orders o ON c.id = o.customer_id;
✅ Performance Mindset
Don't optimize prematurely — write correct queries first, then use EXPLAIN to find actual bottlenecks. Index the columns that EXPLAIN tells you need it, not every column "just in case."
Exercises
🏋️ Exercise 1: Create Indexes
Write SQL statements to:
- Create an index on the
citycolumn of thecustomerstable - Create a composite index on
(customer_id, order_date)on theorderstable - Create a unique index on the
emailcolumn of thecustomerstable
✅ Solution
-- 1
CREATE INDEX idx_customers_city ON customers (city);
-- 2
CREATE INDEX idx_orders_customer_date ON orders (customer_id, order_date);
-- 3
CREATE UNIQUE INDEX idx_customers_email_unique ON customers (email);
🏋️ Exercise 2: EXPLAIN Analysis
For each query below, predict whether MySQL will use an index or perform a full table scan. Then verify by running EXPLAIN.
SELECT * FROM customers WHERE id = 5;SELECT * FROM customers WHERE city = 'Denver';(assume no index on city)SELECT * FROM orders WHERE customer_id = 3;(assume foreign key exists)SELECT * FROM customers WHERE UPPER(email) = 'ALICE@EXAMPLE.COM';
✅ Solution
-- 1: Uses PRIMARY key index → type: const
EXPLAIN SELECT * FROM customers WHERE id = 5;
-- 2: No index on city → type: ALL (full table scan)
EXPLAIN SELECT * FROM customers WHERE city = 'Denver';
-- 3: Foreign key auto-index → type: ref, key: customer_id index
EXPLAIN SELECT * FROM orders WHERE customer_id = 3;
-- 4: UPPER() wraps the column → index on email CAN'T be used → type: ALL
-- Fix: WHERE email = 'alice@example.com' (MySQL collation is
-- case-insensitive by default, so UPPER is unnecessary!)
EXPLAIN SELECT * FROM customers WHERE UPPER(email) = 'ALICE@EXAMPLE.COM';
🏋️ Exercise 3: Fix the Slow Queries
Each query below has a performance problem. Rewrite it to be index-friendly:
SELECT * FROM orders WHERE YEAR(order_date) = 2026;SELECT * FROM customers WHERE email LIKE '%smith%';SELECT * FROM customers WHERE phone = 5551234;(phone is VARCHAR)
✅ Solution
-- 1: Replace YEAR() with a range
SELECT * FROM orders
WHERE order_date >= '2026-01-01' AND order_date < '2027-01-01';
-- 2: Leading wildcard can't use index — if you must search
-- mid-string, consider a FULLTEXT index instead:
-- CREATE FULLTEXT INDEX idx_email_ft ON customers (email);
-- SELECT * FROM customers WHERE MATCH(email) AGAINST('smith');
-- Otherwise, restructure the search (e.g., add a domain column)
-- 3: Match the column type — compare VARCHAR to a string
SELECT * FROM customers WHERE phone = '5551234';
🏋️ Exercise 4: Index Design Challenge
You have an employees table with columns: id, first_name, last_name, department, hire_date, salary, is_active.
The most common queries are:
WHERE department = 'Engineering' AND is_active = TRUEWHERE last_name = 'Smith'WHERE hire_date BETWEEN '2025-01-01' AND '2025-12-31'ORDER BY salary DESC LIMIT 10
What indexes would you create? Justify each choice.
✅ Solution
-- For query 1: composite index — department first (higher cardinality)
CREATE INDEX idx_emp_dept_active ON employees (department, is_active);
-- For query 2: single column index
CREATE INDEX idx_emp_last_name ON employees (last_name);
-- For query 3: index on hire_date for range scan
CREATE INDEX idx_emp_hire_date ON employees (hire_date);
-- For query 4: index on salary for sorted access
CREATE INDEX idx_emp_salary ON employees (salary);
-- Note: is_active alone (boolean) wouldn't be a good index —
-- too few unique values. But as the second column in a
-- composite index with department, it helps narrow results
-- after department is matched.
🎯 Quick Quiz
Question 1: What happens without an index on a WHERE column?
Question 2: For a composite index on (A, B, C), which WHERE clause can use it?
Question 3: In EXPLAIN output, which type value indicates the worst performance?
Question 4: Why does WHERE YEAR(order_date) = 2026 prevent index usage?
Summary
🎉 Key Takeaways
- An index is a sorted data structure (B-tree) that lets MySQL find rows without scanning the entire table
- Primary keys are automatically indexed; foreign keys usually are too
CREATE INDEX idx_name ON table (column)adds a single-column index- Composite indexes cover multiple columns — column order matters (left prefix rule)
- Unique indexes enforce uniqueness while providing index performance
EXPLAINshows the query execution plan — use it to verify index usage- Look for
type: ALL(bad) andkey: NULL(no index used) in EXPLAIN output - Index killers: functions on columns, leading wildcards, type mismatches, OR across columns
- Index columns in WHERE, JOIN ON, ORDER BY, and GROUP BY clauses
- Don't over-index — each index slows down writes and uses disk space
- Use
SHOW INDEX FROM tableto see existing indexes
Quick Reference
| Command | Purpose |
|---|---|
CREATE INDEX idx ON tbl (col) |
Create a single-column index |
CREATE INDEX idx ON tbl (a, b) |
Create a composite index |
CREATE UNIQUE INDEX idx ON tbl (col) |
Create a unique index |
DROP INDEX idx ON tbl |
Remove an index |
SHOW INDEX FROM tbl |
View all indexes on a table |
EXPLAIN SELECT ... |
Show query execution plan |
📚 Additional Resources
- MySQL Docs — Optimization and Indexes
- MySQL Docs — Understanding the Query Execution Plan
- MySQL Docs — CREATE INDEX Statement
- MySQL Docs — Multiple-Column Indexes
🚀 What's Next?
Now that you can make queries fast, it's time to make complex queries simple. In Lesson 15: Views, you'll learn to save complex multi-table JOINs as reusable virtual tables — giving your colleagues (and your future self) clean, readable access to data without rewriting 10-line queries every time.