CROSS JOIN
Progress Tracking
Log in to save this lesson and continue from where you left off.
SELECT
a.number AS num_a,
b.number AS num_b
FROM transportation_numbers a
CROSS JOIN transportation_numbers b
WHERE a.index <= 3
AND b.index <= 3
ORDER BY a.number, b.number;With 3 rows on each side, you get 3 × 3 = 9 rows. Every number paired with every number.
CROSS JOIN can destroy your query performance. 1,000 rows × 1,000 rows = 1 million rows. 10,000 × 10,000 = 100 million.
Legitimate Use Cases
CROSS JOIN is rare, but when you need it, nothing else works:
1. Building a Date Scaffold
Say you need a report showing every product’s sales for every day, even days with zero sales. You can’t aggregate what isn’t there. So you build a scaffold:
-- Create a row for every date x product combination
SELECT
d.date,
p.product_name
FROM dates d
CROSS JOIN products p
ORDER BY
d.date,
p.product_name;
-- Then LEFT JOIN your actual sales data to this scaffold
-- Days with no sales show up as NULL (or 0 after COALESCE)This is a common pattern in reporting when you need to show zeros explicitly rather than missing rows.
2. Generating All Attribute Combinations
E-commerce catalog: you have sizes (S, M, L, XL) and colors (Red, Blue, Black). You need a SKU for every combination:
SELECT
s.size,
c.color,
CONCAT(s.size_code, '-', c.color_code) AS sku
FROM sizes s
CROSS JOIN colors c;Accidental Cartesian Products
More commonly, you’ll create a Cartesian product by accident. Here’s how it happens:
The Old Comma Syntax
Before ANSI SQL standardized JOIN syntax, people wrote joins like this:
-- Old style: comma-separated tables
SELECT *
FROM orders, customers
WHERE orders.cust_id = customers.id;This works fine. But if you forget the WHERE clause:
-- DISASTER: No WHERE clause = Cartesian product
SELECT *
FROM orders, customers;
-- Returns orders x customers rowsThis is why modern SQL style uses explicit JOIN... ON syntax. It’s harder to accidentally create a Cartesian product when the join condition is required.
Wrong Join Predicates
Even with proper JOIN syntax, a wrong predicate can explode your results:
-- Intended: match orders to customers by ID
SELECT *
FROM orders o
JOIN customers c
ON o.order_date = c.signup_date; -- WRONG!
-- This joins on dates, not customer relationship
-- Multiple orders and customers share dates = row explosionAlways double-check that you’re joining on the actual relationship (foreign key to primary key), not just columns that happen to have compatible types.
Diagnosing Row Explosion
When your query returns way more rows than expected, here’s how to debug.
| id | first_name | last_name | city | address | phone_number |
|---|---|---|---|---|---|
| 8 | John | Joseph | San Francisco | 928-386-8164 | |
| 7 | Jill | Michael | Austin | 813-297-0692 | |
| 4 | William | Daniel | Denver | 813-368-1200 | |
| 5 | Henry | Jackson | Miami | 808-601-7513 | |
| 13 | Emma | Isaac | Miami | 808-690-5201 |
| id | cust_id | order_date | order_details | total_order_cost |
|---|---|---|---|---|
| 1 | 3 | 2019-03-04 | Coat | 100 |
| 2 | 3 | 2019-03-01 | Shoes | 80 |
| 3 | 3 | 2019-03-07 | Skirt | 30 |
| 4 | 7 | 2019-02-01 | Coat | 25 |
| 5 | 7 | 2019-03-10 | Shoes | 80 |
Write a query using UNION ALL to check the row count of each source table, then compare with the JOIN result.
An INNER JOIN should return roughly the same number of rows as your fact table (orders), or fewer. If it returns 10x more, something’s wrong with your join logic.
Maximum of Two Numbers
| number |
|---|
| -2 |
| -1 |
| 0 |
| 1 |
| 2 |
Given a single column of numbers, consider all possible permutations of two numbers with replacement, assuming that pairs of numbers (x,y) and (y,x) are two different permutations. Then, for each permutation, find the maximum of the two numbers. Output three columns: the first number, the second number and the maximum of the two.
Key Takeaways
CROSS JOINintentionally produces all combinations (noONclause)- Result size = rows in table1 × rows in table2 (can be huge)
- Use it for: date scaffolds, attribute combinations, comparison matrices
- Accidental cross joins come from: comma syntax without
WHERE, wrong join predicates - Always sanity-check row counts when results seem off
What’s Next
Sometimes you need to join a table to itself. Finding employee-manager pairs, comparing rows within the same dataset. That’s the self-join, and it’s more useful than it might sound.