Module 4: Subqueries & CTEs40 min

CTE Patterns for Common Problems

Progress Tracking

Log in to save this lesson and continue from where you left off.

Log in

Ratios and Percentages

Calculating ratios is one of the most common analytical tasks. It’s also where queries silently break if you’re not careful.

The NULLIF Pattern

Division by zero crashes your query. NULLIF prevents this:

SQL
-- Without protection: crashes if denominator is 0
SELECT numerator / denominator AS ratio;

-- With NULLIF: returns NULL instead of crashing
SELECT numerator / NULLIF(denominator, 0) AS ratio;

NULLIF(x, 0) returns NULL when x equals 0, and returns x otherwise. NULL divided by anything is NULL, so your query continues instead of failing.

Two habits for percentages

The 100.0 (not 100) forces decimal division. Without the decimal, you get integer division and lose precision. NULLIF prevents divide-by-zero. Get in the habit of using both for percentages.

The Full Pattern

Table: employee
idfirst_namelast_nameagesexemployee_titledepartmentsalarytargetbonusemailcityaddressmanager_id
5MaxGeorge26MSalesSales1300200150Max@company.comCalifornia2638 Richards Avenue1
13KattyBond56FManagerManagement1500000300Katty@company.comArizona1
11RicherdGear57MManagerManagement2500000300Richerd@company.comAlabama1
10JenniferDion34FSalesSales1000200150Jennifer@company.comAlabama13
19GeorgeJoe50MManagerManagement1000000300George@company.comFlorida1003 Wyatt Street1
1
Try the Full Ratio Pattern

Calculate what percentage of the company each department represents by headcount. Use NULLIF for safety and 100.0 for decimal division.

Tables: employee

Processed Ticket Rate By Type

Use CTEs to calculate the numerator and denominator separately, then combine safely.

Table: facebook_complaints
complaint_idtypeprocessed
00TRUE
10TRUE
20FALSE
31TRUE
41TRUE
2
Processed Ticket Rate By Type
View solution

Find the processed rate of tickets for each `type`. The processed rate is defined as the number of processed tickets divided by the total number of tickets for that type. Round this result to two decimal places.

Tables: facebook_complaints
Always use NULLIF for ratios

Rates and percentages always need NULLIF(denominator, 0). Make it a habit.

The Pre-Aggregate-and-Join Pattern

Remember correlated subqueries from Lesson 4.1? They run once per row, which can be slow. Here’s the faster alternative.

The Problem: Correlated Subquery

SQL
-- Slow: runs the AVG calculation for every single row
SELECT
  e1.first_name,
  e1.salary,
  e1.department
FROM employees e1
WHERE e1.salary > (
  SELECT AVG(e2.salary)
  FROM employees e2
  WHERE e2.department = e1.department
);

For a table with 10,000 employees across 50 departments, that’s 10,000 separate AVG calculations.

The Solution: Pre-Aggregate with CTE

Table: employee
idfirst_namelast_nameagesexemployee_titledepartmentsalarytargetbonusemailcityaddressmanager_id
5MaxGeorge26MSalesSales1300200150Max@company.comCalifornia2638 Richards Avenue1
13KattyBond56FManagerManagement1500000300Katty@company.comArizona1
11RicherdGear57MManagerManagement2500000300Richerd@company.comAlabama1
10JenniferDion34FSalesSales1000200150Jennifer@company.comAlabama13
19GeorgeJoe50MManagerManagement1000000300George@company.comFlorida1003 Wyatt Street1
3
Try Pre-Aggregate-and-Join

Use a CTE to calculate the average salary per department, then JOIN it back to show each employee alongside their department average. Filter to above-average earners.

Tables: employee

Now we calculate 50 averages (one per department), not 10,000. Same result, dramatically faster.

Highest Target Under Manager

Use a CTE to pre-aggregate, then filter based on the aggregated values.

Table: salesforce_employees
idfirst_namelast_nameagesexemployee_titledepartmentsalarytargetbonusemailcityaddressmanager_id
1AllenWang55FManagerManagement200000600300Allen@company.comCalifornia1069 Ventura Drive1
20SarrahBicky31FSenior SalesSales2000500150Sarrah@company.comFlorida1176 Tyler Avenue19
7NickyBat33FSalesSales1400400100Molly@company.comArizona3461 Preston Street13
31SteveSmith39MSalesSales1500400125steve@company.comTexas13
32DavidWarner38MSalesSales1200400150david@company.comFlorida13
4
Highest Target Under Manager
View solution

Identify the employee(s) working under manager `manager_id=13` who have achieved the highest target. Return each such employee’s first name alongside the target value. The goal is to display the maximum target among all employees under `manager_id=13` and show which employee(s) reached that top value.

Tables: salesforce_employees

Replacing Self-Referencing Queries

When you need to compare rows within the same table (like employees to their managers), the CTE approach is cleaner.

The Correlated Way

SQL
-- Find employees earning more than their manager
SELECT
  e.first_name,
  e.salary
FROM employee e
WHERE e.salary > (
  SELECT m.salary
  FROM employee m
  WHERE m.id = e.manager_id
);

The CTE Way (Clearer and Often Faster)

SQL
WITH manager_salaries AS (
  SELECT
    id,
    salary AS manager_salary
  FROM employee
  WHERE id IN (
    SELECT DISTINCT manager_id
    FROM employee
  )
)
SELECT
  e.first_name,
  e.salary
FROM employee e
JOIN manager_salaries m
  ON e.manager_id = m.id
WHERE e.salary > m.manager_salary;

Or even simpler with a self-join (from Module 3):

SQL
SELECT
  e.first_name,
  e.salary
FROM employee e
JOIN employee m
  ON e.manager_id = m.id
WHERE e.salary > m.salary;

Employee and Manager Salaries

Compare employees to their managers using the approach you find clearest.

Table: employee
idfirst_namelast_nameagesexemployee_titledepartmentsalarytargetbonusemailcityaddressmanager_id
5MaxGeorge26MSalesSales1300200150Max@company.comCalifornia2638 Richards Avenue1
13KattyBond56FManagerManagement1500000300Katty@company.comArizona1
11RicherdGear57MManagerManagement2500000300Richerd@company.comAlabama1
10JenniferDion34FSalesSales1000200150Jennifer@company.comAlabama13
19GeorgeJoe50MManagerManagement1000000300George@company.comFlorida1003 Wyatt Street1
5
Employee and Manager Salaries
View solution

Find employees who are earning more than their managers. Output the employee's first name along with the corresponding salary.

Tables: employee

When Subqueries Still Make Sense

CTEs are preferred, but simple subqueries have their place:

  • Simple scalar lookups: WHERE x = (SELECT MAX(x)...) is perfectly readable
  • Quick EXISTS checks: sometimes clearer than a CTE plus JOIN
  • One-off filters: when a CTE would be overkill
  • Reading legacy code: you’ll encounter subqueries constantly

The goal isn’t to eliminate subqueries. It’s about choosing the right tool for each situation. When queries get complex, CTEs are almost always the better choice.

Dialect Notes

CTEs work across all major SQL dialects with minor variations:

  • PostgreSQL, MySQL 8.0+, SQL Server, Oracle: all support CTEs with WITH
  • Recursive CTEs use WITH RECURSIVE in PostgreSQL/MySQL, just WITH in SQL Server/Oracle
  • Recursive CTEs are advanced content that will be covered in Premium Learning Paths
Table: employee
idfirst_namelast_nameagesexemployee_titledepartmentsalarytargetbonusemailcityaddressmanager_id
5MaxGeorge26MSalesSales1300200150Max@company.comCalifornia2638 Richards Avenue1
13KattyBond56FManagerManagement1500000300Katty@company.comArizona1
11RicherdGear57MManagerManagement2500000300Richerd@company.comAlabama1
10JenniferDion34FSalesSales1000200150Jennifer@company.comAlabama13
19GeorgeJoe50MManagerManagement1000000300George@company.comFlorida1003 Wyatt Street1
6
Try Salary Percentages with Filter

Calculate each department salary as a percentage of the total. Show only departments over 20%.

Tables: employee

Key Takeaways

  • NULLIF(x, 0) prevents divide-by-zero errors; use it for all ratios
  • Use 100.0 (not 100) to force decimal division
  • Pre-aggregate-and-join beats correlated subqueries for performance
  • Calculate once in a CTE instead of once per row
  • Simple subqueries still have their place; choose the clearest approach
  • Debug by running each CTE independently

What’s Next

Congratulations! You’ve completed Module 4: Subqueries and CTEs. You now have powerful tools for breaking complex problems into manageable steps, comparing rows to aggregates, and writing queries that are actually readable six months later.

In the next module, you’ll tackle Window Functions. They’re asked in many data analyst interviews and let you do things that would be incredibly complex with CTEs alone: running totals, rankings, and period-over-period comparisons without self-joins.