🧱 Lesson 5: Data Types & Constraints
Every column in a table needs a data type — it tells MySQL what kind of information goes there and how much space to reserve. Constraints add rules that keep your data clean. Choosing wisely here means fewer bugs, faster queries, and happier databases.
🎯 Learning Objectives
By the end of this lesson, you will be able to:
- Choose the right numeric type for whole numbers, decimals, and money
- Choose the right string type for short text, long text, and fixed-length codes
- Store dates and times correctly with DATE, DATETIME, and TIMESTAMP
- Understand how MySQL handles BOOLEAN values
- Apply constraints:
PRIMARY KEY,NOT NULL,UNIQUE,DEFAULT,AUTO_INCREMENT, andCHECK - Know when to use
UNSIGNED,ENUM, andSET
Estimated Time: 45 minutes
Prerequisites: Creating databases and tables (Lesson 4)
📑 In This Lesson
Why Data Types Matter
You might wonder: why not just store everything as text? Three reasons:
e.g., 'hello' into an INT column"] C --> F["Numbers are compared faster
than text; dates can be sorted"] D --> G["INT uses 4 bytes
VARCHAR(10) uses up to 11 bytes
TEXT uses up to 65KB"]
Think of data types like containers: you wouldn't pour soup into a paper bag or store a feather in a shipping container. The right type holds the right data efficiently.
💡 Rule of Thumb: Always pick the smallest type that can comfortably hold your data. Smaller types = less disk space, faster queries, and more rows in memory.
Numeric Types
MySQL has several numeric types, divided into integers (whole numbers) and decimals (numbers with fractional parts).
Integer Types
| Type | Bytes | Signed Range | Unsigned Range | Use Case |
|---|---|---|---|---|
TINYINT |
1 | -128 to 127 | 0 to 255 | Age, rating (1-5), status codes |
SMALLINT |
2 | -32,768 to 32,767 | 0 to 65,535 | Year, small counts |
MEDIUMINT |
3 | -8.3M to 8.3M | 0 to 16.7M | Medium-sized IDs |
INT |
4 | -2.1B to 2.1B | 0 to 4.2B | Most IDs, counts, general use |
BIGINT |
8 | -9.2 quintillion to 9.2 quintillion | 0 to 18.4 quintillion | Huge IDs, social media post IDs |
-- Integer examples
CREATE TABLE integer_demo (
id INT AUTO_INCREMENT PRIMARY KEY,
age TINYINT UNSIGNED, -- 0-255, perfect for ages
year_born SMALLINT, -- up to 32,767
page_count INT UNSIGNED, -- no negative pages!
tweet_id BIGINT UNSIGNED -- Twitter IDs are huge numbers
);
UNSIGNED — Positive Only
Adding UNSIGNED after an integer type means it only stores positive numbers (and zero), but doubles the maximum value. Use it when negative values don't make sense — ages, counts, page numbers, IDs.
-- SIGNED (default): -128 to 127
-- UNSIGNED: 0 to 255
age TINYINT UNSIGNED -- Nobody has a negative age!
Decimal / Fractional Types
| Type | Precision | Use Case |
|---|---|---|
DECIMAL(M, D) |
Exact — M total digits, D after decimal | Money, financial calculations, anything requiring exact values |
FLOAT |
~7 significant digits (approximate) | Scientific measurements, coordinates (when slight imprecision is OK) |
DOUBLE |
~15 significant digits (approximate) | Higher-precision scientific calculations |
-- DECIMAL examples
price DECIMAL(8, 2), -- Up to 999,999.99
tax_rate DECIMAL(5, 4), -- Up to 9.9999 (e.g., 0.0825)
gpa DECIMAL(3, 2), -- Up to 9.99 (e.g., 3.85)
⚠️ Never Use FLOAT for Money
FLOAT and DOUBLE are approximate types — they can produce rounding errors. For example, 0.1 + 0.2 might equal 0.30000000000000004 in floating point. This is catastrophic for financial calculations. Always use DECIMAL for money.
💡 Reading DECIMAL(M, D)
M = total number of digits (both sides of the decimal point). D = digits after the decimal point. So DECIMAL(8, 2) means up to 6 digits before the decimal and 2 after it: 999999.99. The maximum value of M is 65, and the maximum of D is 30.
String Types
MySQL has many string types, but you'll use three most of the time: CHAR, VARCHAR, and TEXT.
| Type | Max Length | Storage | Use Case |
|---|---|---|---|
CHAR(N) |
255 characters | Always N bytes (padded with spaces) | Fixed-length: country codes, state abbreviations, hashes |
VARCHAR(N) |
65,535 characters | Actual length + 1-2 bytes | Variable-length: names, emails, titles, URLs |
TINYTEXT |
255 characters | Actual length + 1 byte | Short text (rarely used — VARCHAR is better) |
TEXT |
~65,000 characters | Actual length + 2 bytes | Blog posts, descriptions, long-form content |
MEDIUMTEXT |
~16 million characters | Actual length + 3 bytes | Very long documents, articles |
LONGTEXT |
~4 billion characters | Actual length + 4 bytes | Huge data (rarely needed) |
CHAR vs. VARCHAR
The key difference is how they store data:
-- CHAR — fixed length, padded with spaces
country_code CHAR(2), -- 'US', 'PH', 'JP' — always 2 chars
state_code CHAR(2), -- 'NV', 'CA', 'TX'
md5_hash CHAR(32), -- MD5 hashes are always 32 hex characters
-- VARCHAR — variable length, stores only what's needed
first_name VARCHAR(50), -- 'Ray' stores as 3 + 1 bytes
email VARCHAR(100), -- 'ray@email.com' stores as 13 + 1 bytes
title VARCHAR(200), -- Book titles vary widely in length
✅ When to Use CHAR vs. VARCHAR
Use CHAR when every value has the exact same length (country codes, state abbreviations, UUIDs, hashes). Use VARCHAR for everything else — it's the most common string type by far.
TEXT Types
Use TEXT for long-form content that doesn't need a length limit:
-- TEXT for long content
bio TEXT, -- User biography
description TEXT, -- Product description
article_body MEDIUMTEXT, -- Full blog article
💡 VARCHAR vs. TEXT
Both can store text, but they differ in important ways:
VARCHARcan have aDEFAULTvalue;TEXTcannot (in most cases)VARCHARcan be part of an index fully;TEXTrequires a prefix lengthVARCHARis stored inline with the row; largeTEXTvalues are stored separately
Rule: If you know the maximum length and it's under ~1000 characters, use VARCHAR. If the content is truly unbounded (blog posts, comments, descriptions), use TEXT.
Date & Time Types
MySQL has dedicated types for dates and times — never store them as strings!
| Type | Format | Range | Use Case |
|---|---|---|---|
DATE |
YYYY-MM-DD |
1000-01-01 to 9999-12-31 | Birthdays, publication dates, deadlines |
TIME |
HH:MM:SS |
-838:59:59 to 838:59:59 | Duration, time of day |
DATETIME |
YYYY-MM-DD HH:MM:SS |
1000-01-01 to 9999-12-31 | Order timestamps, event start times |
TIMESTAMP |
YYYY-MM-DD HH:MM:SS |
1970-01-01 to 2038-01-19 | Created/updated timestamps, audit trails |
YEAR |
YYYY |
1901 to 2155 | Publication year, model year |
-- Date and time examples
CREATE TABLE events (
id INT AUTO_INCREMENT PRIMARY KEY,
event_name VARCHAR(100) NOT NULL,
event_date DATE NOT NULL, -- '2026-04-16'
start_time TIME, -- '14:30:00'
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
ON UPDATE CURRENT_TIMESTAMP
);
DATETIME vs. TIMESTAMP
They look the same but behave differently:
| Feature | DATETIME | TIMESTAMP |
|---|---|---|
| Storage | 8 bytes | 4 bytes |
| Range | 1000–9999 | 1970–2038 |
| Time zone | Stores as-is (no conversion) | Converts to UTC on storage, back to local on retrieval |
| Best for | Event dates, birthdates, historical dates | Record creation/modification tracking |
✅ Auto-Tracking Timestamps
A very common pattern is automatically recording when a row was created and last updated:
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
ON UPDATE CURRENT_TIMESTAMP
created_at is set once when the row is inserted. updated_at is automatically refreshed every time the row changes. You'll see this pattern in almost every production database.
⚠️ Never Store Dates as Strings
It's tempting to use VARCHAR(10) for dates like '04/16/2026', but this breaks sorting, comparisons, and date arithmetic. MySQL can't calculate "30 days from now" on a string. Always use DATE, DATETIME, or TIMESTAMP.
BOOLEAN, ENUM & SET
BOOLEAN
MySQL doesn't have a true boolean type — BOOLEAN is an alias for TINYINT(1):
-- These are equivalent:
is_active BOOLEAN DEFAULT TRUE,
is_active TINYINT(1) DEFAULT 1,
MySQL stores TRUE as 1 and FALSE as 0. You can use either in queries:
-- Both work:
SELECT * FROM books WHERE in_stock = TRUE;
SELECT * FROM books WHERE in_stock = 1;
ENUM — Pick One from a List
ENUM restricts a column to a predefined set of values. Only one value can be chosen:
CREATE TABLE shirts (
id INT AUTO_INCREMENT PRIMARY KEY,
size ENUM('XS', 'S', 'M', 'L', 'XL', 'XXL') NOT NULL,
color ENUM('red', 'blue', 'green', 'black', 'white') NOT NULL
);
-- Valid:
INSERT INTO shirts (size, color) VALUES ('M', 'blue');
-- Invalid — MySQL will reject or warn:
INSERT INTO shirts (size, color) VALUES ('XXXL', 'purple');
💡 ENUM Pros and Cons
Pros: Enforces valid values, very compact storage (1-2 bytes regardless of string length), self-documenting.
Cons: Adding or removing values requires ALTER TABLE, which can be slow on large tables. If the list changes frequently, a lookup table (a separate table with the valid values) is often a better choice.
SET — Pick Multiple from a List
SET is like ENUM but allows multiple values per row:
CREATE TABLE user_preferences (
id INT AUTO_INCREMENT PRIMARY KEY,
username VARCHAR(50) NOT NULL,
interests SET('sports', 'music', 'tech', 'gaming', 'cooking')
);
-- One interest:
INSERT INTO user_preferences (username, interests)
VALUES ('alice', 'music');
-- Multiple interests (comma-separated, no spaces):
INSERT INTO user_preferences (username, interests)
VALUES ('bob', 'tech,gaming,music');
📖 In Practice
SET is rarely used in modern databases. For many-to-many relationships like "a user has multiple interests," a junction table (covered in Lesson 11) is the standard and more flexible approach.
Constraints
Constraints are rules enforced by MySQL at the database level. They prevent invalid data from ever entering your tables — acting as a safety net regardless of which application or user is inserting data.
Unique identifier for each row"] A --> C["NOT NULL
Column must have a value"] A --> D["UNIQUE
No duplicate values"] A --> E["DEFAULT
Fallback value if none given"] A --> F["AUTO_INCREMENT
Auto-generates sequential IDs"] A --> G["CHECK
Custom validation rule"] A --> H["FOREIGN KEY
Links to another table"]
PRIMARY KEY
Every table should have a primary key — it uniquely identifies each row. A primary key is automatically NOT NULL and UNIQUE.
-- Column-level (most common for single-column keys)
CREATE TABLE products (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) NOT NULL
);
-- Table-level (required for composite keys)
CREATE TABLE order_items (
order_id INT,
product_id INT,
quantity INT NOT NULL,
PRIMARY KEY (order_id, product_id)
);
💡 Composite Primary Keys
A composite primary key uses two or more columns together to identify a row. In the order_items example, neither order_id nor product_id alone is unique — but the combination of both is. This is common in junction tables for many-to-many relationships.
NOT NULL
Prevents a column from accepting NULL (empty/unknown) values. MySQL will reject any insert or update that tries to set a NOT NULL column to NULL.
-- first_name and email are required; city is optional
first_name VARCHAR(50) NOT NULL,
email VARCHAR(100) NOT NULL,
city VARCHAR(50), -- allows NULL (default behavior)
💡 Tip: Ask yourself: "Does this column always need a value for the row to make sense?" If yes, addNOT NULL. A customer without a name doesn't make sense — sofirst_nameshould beNOT NULL. A customer without a phone number might be fine — sophonecan allow NULL.
UNIQUE
Ensures no two rows have the same value in a column. Unlike PRIMARY KEY, a table can have multiple UNIQUE columns, and UNIQUE columns can contain NULL values (but only one NULL per column).
-- Each customer must have a unique email
email VARCHAR(100) NOT NULL UNIQUE,
-- ISBN numbers are unique across all books
isbn VARCHAR(13) UNIQUE,
-- You can also name the constraint
CONSTRAINT uq_email UNIQUE (email)
DEFAULT
Provides a value when one isn't specified during an INSERT:
in_stock BOOLEAN DEFAULT TRUE,
role VARCHAR(20) DEFAULT 'customer',
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
quantity INT DEFAULT 0,
-- If we insert without specifying 'in_stock':
INSERT INTO books (title, author, price)
VALUES ('Dune', 'Frank Herbert', 12.99);
-- in_stock will automatically be TRUE (1)
AUTO_INCREMENT
Automatically generates sequential integer values (1, 2, 3, ...). Used almost exclusively for primary keys:
id INT AUTO_INCREMENT PRIMARY KEY
Key behaviors:
- You don't include
idin yourINSERT— MySQL assigns it automatically - If you delete row 5, the number 5 is not reused — the next insert gets 6
- You can check the current value:
SELECT LAST_INSERT_ID(); - Only one
AUTO_INCREMENTcolumn per table, and it must be indexed (usually asPRIMARY KEY)
CHECK (MySQL 8.0.16+)
Enforces a custom condition on column values:
CREATE TABLE products (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
price DECIMAL(8, 2) NOT NULL,
quantity INT NOT NULL DEFAULT 0,
-- Prices must be positive
CONSTRAINT chk_price CHECK (price > 0),
-- Quantity can't be negative
CONSTRAINT chk_quantity CHECK (quantity >= 0)
);
-- This will FAIL:
INSERT INTO products (name, price, quantity)
VALUES ('Widget', -5.00, 10);
-- ERROR: Check constraint 'chk_price' is violated.
-- More CHECK examples
age TINYINT UNSIGNED CHECK (age >= 0 AND age <= 150),
rating DECIMAL(3,2) CHECK (rating >= 0 AND rating <= 5),
status VARCHAR(20) CHECK (status IN ('active', 'inactive', 'pending')),
⚠️ CHECK Constraint History
Before MySQL 8.0.16, CHECK constraints were accepted in SQL syntax but silently ignored. If you're using MySQL 8.0.16 or later (which you should be), they work as expected. Always verify your MySQL version: SELECT VERSION();
FOREIGN KEY (Preview)
Foreign keys link a column in one table to the primary key of another table. This enforces referential integrity — you can't add an order for a customer that doesn't exist. We'll cover foreign keys in depth in Lesson 12.
-- Quick preview — full explanation in Lesson 12
CREATE TABLE orders (
id INT AUTO_INCREMENT PRIMARY KEY,
customer_id INT NOT NULL,
order_date DATETIME DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (customer_id) REFERENCES customers(id)
);
Constraints Quick Reference
| Constraint | What It Enforces | Can a Table Have Multiple? |
|---|---|---|
PRIMARY KEY |
Uniquely identifies each row (NOT NULL + UNIQUE) | No — one per table |
NOT NULL |
Column must always have a value | Yes |
UNIQUE |
No duplicate values in the column | Yes |
DEFAULT |
Provides a fallback value on INSERT | Yes |
AUTO_INCREMENT |
Auto-generates sequential integer values | No — one per table |
CHECK |
Validates against a custom condition | Yes |
FOREIGN KEY |
References a row in another table | Yes |
Choosing the Right Type — A Practical Guide
When designing a table, use this flowchart to pick the right data type:
based on range needed"] B -->|"Decimal / Money"| G["DECIMAL(M,D)
NEVER use FLOAT for money"] C -->|"Fixed length ≤255"| H["CHAR(N)
Codes, hashes, abbreviations"] C -->|"Variable length ≤65K"| I["VARCHAR(N)
Names, emails, titles"] C -->|"Long text"| J["TEXT or MEDIUMTEXT
Articles, descriptions"] D -->|"Date only"| K["DATE"] D -->|"Date + time"| L["DATETIME or TIMESTAMP"] D -->|"Time only"| M["TIME"] E --> N["BOOLEAN
(stored as TINYINT)"]
Common Column Patterns
Here are types you'll use again and again in real projects:
| Column Purpose | Recommended Type | Example |
|---|---|---|
| Primary key | INT AUTO_INCREMENT PRIMARY KEY |
id INT AUTO_INCREMENT PRIMARY KEY |
| Person's name | VARCHAR(50-100) |
first_name VARCHAR(50) NOT NULL |
| Email address | VARCHAR(100) NOT NULL UNIQUE |
email VARCHAR(100) NOT NULL UNIQUE |
| Phone number | VARCHAR(20) |
phone VARCHAR(20) |
| Money / price | DECIMAL(10, 2) |
price DECIMAL(10, 2) NOT NULL |
| Age | TINYINT UNSIGNED |
age TINYINT UNSIGNED |
| Country code | CHAR(2) |
country_code CHAR(2) |
| URL | VARCHAR(2083) |
website VARCHAR(2083) |
| Blog post body | TEXT or MEDIUMTEXT |
body TEXT NOT NULL |
| Active/inactive flag | BOOLEAN DEFAULT TRUE |
is_active BOOLEAN DEFAULT TRUE |
| Creation timestamp | DATETIME DEFAULT CURRENT_TIMESTAMP |
created_at DATETIME DEFAULT CURRENT_TIMESTAMP |
| Last-modified timestamp | DATETIME ... ON UPDATE |
updated_at DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP |
💡 Why VARCHAR(20) for Phone Numbers?
Phone numbers look like numbers, but they're actually text! They can start with a 0 or +, contain dashes and spaces, and you never do math on them. '+1 (702) 555-0123' is a string, not a number. Always store phone numbers as VARCHAR.
Exercises
🏋️ Exercise 1: Pick the Right Type
For each piece of data below, choose the most appropriate MySQL data type and explain why:
- A customer's ZIP code (US format: "89002" or "89002-1234")
- The price of a product
- The number of items in stock
- A user's biography
- Whether an account is active or suspended
- A country's two-letter ISO code (like "US" or "PH")
- A student's GPA (0.00 to 4.00)
- The date a book was published
✅ Solution
VARCHAR(10)— ZIP codes can have leading zeros and hyphens; they're text, not numbersDECIMAL(10, 2) NOT NULL— exact precision for money; never FLOATINT UNSIGNED DEFAULT 0— whole number, can't be negativeTEXT— biographies can be long and have no fixed upper limitBOOLEAN DEFAULT TRUE— simple true/false toggleCHAR(2) NOT NULL— always exactly 2 characters, CHAR is perfectDECIMAL(3, 2)— holds 0.00 to 9.99, perfect for GPADATE— date only, no time needed
🏋️ Exercise 2: Build a Table with Constraints
Create an employees table inside the bookstore database with the following requirements:
id— auto-incrementing primary keyfirst_name— up to 50 characters, requiredlast_name— up to 50 characters, requiredemail— up to 100 characters, required, must be uniquehire_date— date only, requiredsalary— up to 999,999.99, required, must be positivedepartment— one of: 'sales', 'inventory', 'management', 'support'is_active— boolean, defaults to truecreated_at— auto-set to current timestamp
✅ Solution
USE bookstore;
CREATE TABLE IF NOT EXISTS employees (
id INT AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) NOT NULL UNIQUE,
hire_date DATE NOT NULL,
salary DECIMAL(8, 2) NOT NULL,
department ENUM('sales', 'inventory', 'management', 'support'),
is_active BOOLEAN DEFAULT TRUE,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT chk_salary CHECK (salary > 0)
);
DESCRIBE employees;
🏋️ Exercise 3: Fix the Bad Table
This table has several design problems. Identify and fix them:
CREATE TABLE Orders (
OrderID VARCHAR(10),
CustomerName VARCHAR(500),
Price FLOAT,
OrderDate VARCHAR(20),
phone INT,
Status VARCHAR(100)
);
💡 Hint
Look for: naming conventions, missing primary key, wrong data types (money, dates, phone numbers), missing constraints, oversized columns.
✅ Solution
CREATE TABLE orders (
id INT AUTO_INCREMENT PRIMARY KEY,
customer_name VARCHAR(100) NOT NULL,
price DECIMAL(10, 2) NOT NULL,
order_date DATETIME DEFAULT CURRENT_TIMESTAMP,
phone VARCHAR(20),
status ENUM('pending', 'processing', 'shipped', 'delivered', 'cancelled')
DEFAULT 'pending',
CONSTRAINT chk_price CHECK (price > 0)
);
Issues fixed:
- Table and column names → lowercase snake_case
- Added
INT AUTO_INCREMENT PRIMARY KEYinstead ofVARCHAR(10) - Reduced
customer_namefrom 500 to a reasonable 100 chars FLOAT→DECIMAL(10, 2)for moneyVARCHAR(20)→DATETIMEfor datesINT→VARCHAR(20)for phone numbersVARCHAR(100)→ENUMfor status (or VARCHAR with CHECK)- Added
NOT NULLandCHECKconstraints
🎯 Quick Quiz
Question 1: Which data type should you use for storing prices?
Question 2: What does UNSIGNED do to an integer type?
Question 3: Why should phone numbers be stored as VARCHAR instead of INT?
Question 4: When should you use CHAR instead of VARCHAR?
Question 5: What constraint automatically sets a column to the next sequential number?
Summary
🎉 Key Takeaways
- Numeric types: Use
INTfor most whole numbers,TINYINT/SMALLINT/BIGINTwhen you need a specific range, andDECIMALfor exact decimals (especially money) - Never use FLOAT for money — it produces rounding errors
- String types:
VARCHARfor most text,CHARfor fixed-length values,TEXTfor long-form content - Date types:
DATEfor dates,DATETIMEfor timestamps,TIMESTAMPfor auto-tracking. Never store dates as strings - BOOLEAN is really
TINYINT(1)—TRUE= 1,FALSE= 0 - ENUM restricts a column to specific values; good for stable, small lists
- Constraints enforce data integrity:
PRIMARY KEY,NOT NULL,UNIQUE,DEFAULT,AUTO_INCREMENT,CHECK UNSIGNEDremoves negative numbers and doubles the positive range- Always pick the smallest type that comfortably holds your data
- Phone numbers, ZIP codes, and other "numeric-looking" identifiers should be stored as
VARCHAR
📚 Additional Resources
- MySQL Docs — Data Types
- MySQL Docs — Integer Types
- MySQL Docs — DECIMAL and NUMERIC
- MySQL Docs — String Types
- MySQL Docs — CHECK Constraints
🚀 What's Next?
Now that you know what types of data go into tables and how to protect them with constraints, it's time to put data in and get it back out. In the next lesson, you'll learn the four fundamental operations every database developer uses daily — INSERT, SELECT, UPDATE, and DELETE — collectively known as CRUD.
🎉 Congratulations!
You now know how to choose the right data type for any column and enforce rules with constraints. This knowledge will guide every table you design from here on out!