Module 3: Working with Multiple Tables35 min

Joining Multiple Tables

Progress Tracking

Log in to save this lesson and continue from where you left off.

Log in
SQL
SELECT
  e.first_name,
  e.last_name,
  p.title AS project_title,
  p.budget
FROM linkedin_employees e
JOIN linkedin_emp_projects ep
  ON e.id = ep.emp_id
JOIN linkedin_projects p
  ON ep.project_id = p.id;

Plan Before You Write

When facing a complex request, don’t start typing immediately. Sketch out the path first:

  • What data is needed? (columns from which tables)
  • How do the tables connect? (foreign key relationships)
  • What’s the logical path? (A → B → C → D)

Drawing boxes with arrows on paper sounds old-school, but it catches mistakes before they’re written.

Follow the chain

Each JOIN must connect to something already in your query. You can’t jump from customers directly to categories if they’re only linked through orders and products.

Join Order: When It Matters

For INNER JOINs, the order doesn’t matter. The optimizer rearranges them for performance. But OUTER JOINs are different: the order affects which rows survive.

SQL
-- All customers, their orders (if any), returns on those orders (if any)
SELECT
  c.name,
  o.id AS order_id,
  r.reason
FROM customers c
LEFT JOIN orders o
  ON c.id = o.customer_id
LEFT JOIN returns r
  ON o.id = r.order_id;
-- Customer with no orders: NULLs for both order and return
-- Customer with order but no return: order data, NULL return
-- Customer with order and return: all data

Each LEFT JOIN preserves rows from the previous result. Customers without orders still appear. Orders without returns still appear.

INNER JOIN can eliminate preserved rows

When mixing INNER and OUTER joins, think carefully about which rows are preserved at each step. An INNER JOIN in the middle can eliminate rows you wanted to keep.

Mixing Join Types

Different JOIN types can be mixed in the same query.

SQL
-- Only customers with orders, but show all returns for those orders
SELECT
  c.name,
  o.id,
  r.return_reason
FROM customers c
INNER JOIN orders o
  ON c.id = o.customer_id
LEFT JOIN returns r
  ON o.id = r.order_id;

The INNER JOIN filters to customers with orders. The LEFT JOIN then preserves all those orders, even ones without returns. If you changed INNER JOIN to LEFT JOIN, customers without any orders would also appear.

The Row Multiplication Problem

Here’s a trap that gets everyone at least once. You join tables, your aggregates are wrong, and you spend an hour figuring out why.

The culprit: many-to-many relationships multiplying rows. Say you join customers to orders (fine so far), then join orders to a tags table where each order can have multiple tags. Suddenly each order appears three times — once per tag — and your SUM(revenue) is 3x too high.

Many-to-many multiplies rows

If an order can have multiple tags, categories, or other attributes, joining that table multiplies your fact rows. Your SUM() will be wrong.

How to catch this — check row counts at each step to find where the explosion happens:

Table: customers
idfirst_namelast_namecityaddressphone_number
8JohnJosephSan Francisco928-386-8164
7JillMichaelAustin813-297-0692
4WilliamDanielDenver813-368-1200
5HenryJacksonMiami808-601-7513
13EmmaIsaacMiami808-690-5201
Table: orders
idcust_idorder_dateorder_detailstotal_order_cost
132019-03-04Coat100
232019-03-01Shoes80
332019-03-07Skirt30
472019-02-01Coat25
572019-03-10Shoes80
1
Debug Row Multiplication

Write a query using UNION ALL to check the row count of each source table, then compare with the JOIN result.

Tables: customers, orders

Common Mistakes

  • Wrong join path — you can’t join products directly to customers if they’re only connected through orders. Follow the foreign key chain.
  • Forgetting that dimension tables can have duplicates — if your customers table has multiple rows per customer (slowly changing dimensions, data quality issues), every join against that table multiplies your rows. Always check for duplicates with GROUP BY id HAVING COUNT(*) > 1 before trusting aggregate results.
  • Mixing OUTER joins without thinking through preservation — LEFT JOIN then INNER JOIN often defeats the LEFT JOIN. Be intentional.
  • Illegible queries without aliases — with 5+ tables, aliases aren’t optional. Pick meaningful ones and use them consistently.

Salary Less Than Twice The Average

Join three tables to compare manager salaries with their team averages.

Table: map_employee_hierarchy
empl_idmanager_empl_id
E849
E850E849
E851E849
E852E850
E853E850
Table: dim_employee
empl_idempl_nameempl_cityempl_dobempl_pinsalary
E849Steven M. JonesHicksville1988-03-29149080000
E850Marilynn M. WaltersNew York1978-12-26962430000
E851Kyle M. MasseyLake Katrine1977-09-22156340000
E852Cody A. MosbyAnaheim1965-03-18488322000
E853David J. MintzHouston1977-01-04800118000
2
Salary Less Than Twice The Average
View solution

Write a query to get the list of managers whose salary is less than twice the average salary of employees reporting to them. For these managers, output their ID, salary and the average salary of employees reporting to them.

Tables: map_employee_hierarchy, dim_employee

Key Takeaways

  • Chain JOINs to connect any number of tables
  • Plan your path before writing — sketch the table relationships
  • JOIN order matters for OUTER joins, not for INNER joins
  • Watch for row multiplication from many-to-many relationships
  • Check row counts at each step when debugging

What’s Next

Congratulations — you’ve completed Module 3. You can now combine data from any number of tables using UNION and all types of JOINs. These skills show up in almost every real-world SQL task. In the next module, you’ll learn subqueries and CTEs, which let you build complex logic step by step.