Module 2: Aggregating & Grouping Data (35 min)

Combining Aggregation Techniques

Putting It All Together

You’ve learned the pieces: aggregate functions, GROUP BY, HAVING, NULL handling, conditional aggregation. Real analysis usually requires combining several of these in one query.

Let’s walk through building a complex query step by step.

Building Queries Incrementally

Here’s a business question: Find departments with more than 2 employees hired after 2022, showing their headcount and average salary, sorted by average salary.

You don’t need to write it all at once. Build it piece by piece, testing after each step.

Step 1: Headcount by Department

Table: techcorp_workforce

id	first_name	last_name	department	salary	phone_number	joining_date
1	Sarah	Mitchell	HR	95000	555-0101	2021-03-15
2	Michael	Chen	HR	88000	555-0102	2022-06-01
3	Emily	Rodriguez	HR	82500		2021-09-20
4	David	Park	HR	80000	555-0104	2023-01-10
5	Lisa	Thompson	HR	65000		2021-04-05
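
The query for this step is not reproduced in the text, so here is a minimal sketch of what it might look like. It assumes the headcount is a plain COUNT(*) per department; the alias headcount is illustrative, not taken from the lesson.

```sql
-- Step 1 (sketch): one row per department with its employee count.
SELECT
  department,
  COUNT(*) AS headcount   -- counts every row in the group
FROM techcorp_workforce
GROUP BY department;
```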

Run it and check that the output makes sense before moving on to the next step.

Step 2: Only Hires After 2022

Now add the row-level filter.

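Oracle converts a string literal to a date only when it matches the session’s NLS_DATE_FORMAT, so a plain 'YYYY-MM-DD' string is unreliable there: use a DATE literal (DATE 'YYYY-MM-DD') or TO_DATE() with an explicit format. The other three dialects accept the plain string format directly.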
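
A sketch of this step, under two assumptions that are not confirmed by the lesson: joining_date is a date column, and "hired after 2022" means a joining date on or after 2023-01-01. If the intent is "since the start of 2022", change the cutoff accordingly.

```sql
-- Step 2 (sketch): count only recent hires per department.
SELECT
  department,
  COUNT(*) AS headcount
FROM techcorp_workforce
WHERE joining_date >= '2023-01-01'   -- assumed cutoff; in Oracle write DATE '2023-01-01'
GROUP BY department;
```

WHERE runs before grouping, so rows that fail the date test never reach COUNT(*).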

Now you’re only counting recent hires. Check that the numbers changed.

Step 3: Departments with More Than 2 Such Hires
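
A sketch of this step, again assuming a 2023-01-01 cutoff for "after 2022" (an assumption, not stated in the lesson). The group-level threshold goes in HAVING because it tests an aggregate, which WHERE cannot do.

```sql
-- Step 3 (sketch): keep only departments with more than 2 recent hires.
SELECT
  department,
  COUNT(*) AS headcount
FROM techcorp_workforce
WHERE joining_date >= '2023-01-01'   -- assumed cutoff; row-level filter
GROUP BY department
HAVING COUNT(*) > 2;                 -- group-level filter on the aggregate
```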

Only departments meeting the threshold show up now.

Step 4: Add Salary Stats and Sort
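
A sketch of the finished query. The 2023-01-01 cutoff, the rounding of the average, and the descending sort are all assumptions: the business question says "sorted by average salary" without giving a direction.

```sql
-- Step 4 (sketch): headcount and average salary for qualifying departments.
SELECT
  department,
  COUNT(*) AS headcount,
  ROUND(AVG(salary), 0) AS avg_salary   -- AVG skips NULL salaries
FROM techcorp_workforce
WHERE joining_date >= '2023-01-01'      -- assumed cutoff for "after 2022"
GROUP BY department
HAVING COUNT(*) > 2
ORDER BY avg_salary DESC;               -- assumed direction: highest first
```

Both aggregates are computed over the filtered rows only, so avg_salary here is the average among recent hires, not across the whole department.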

Now you’ve answered the full business question. Each step was testable on its own.

Build incrementally

Always build queries incrementally. Run after each change. It’s much easier to fix a bug in the step you just added than to debug a 20-line query that doesn’t work.

Combining GROUP BY with Conditional Aggregation

You can use CASE WHEN inside grouped queries for detailed breakdowns:

SQL

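SELECT
  department,
  COUNT(*) AS total,
  -- Conditional counts: each CASE yields 1 for a matching row, 0 otherwise.
  -- Rows with a NULL salary land in neither bucket but still count in total.
  SUM(CASE WHEN salary < 80000 THEN 1 ELSE 0 END) AS under_80k,
  SUM(CASE WHEN salary >= 80000 THEN 1 ELSE 0 END) AS at_least_80k,
  ROUND(AVG(salary), 0) AS avg_salary
FROM techcorp_workforce
GROUP BY department
ORDER BY total DESC;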

Each department row now shows how many employees earn below 80,000 and how many earn 80,000 or more, alongside the department’s average salary.

Table: playbook_users

user_id	created_at	company_id	language	activated_at	state
11	2013-01-01 04:41:13	1	german	2013-01-01	active
52	2013-01-05 15:30:45	2866	spanish	2013-01-05	active
52	2013-01-05 15:30:45	2866	german	2013-01-05	active
108	2013-01-10 11:04:58	1848	spanish	2013-01-10	active
167	2013-01-16 20:40:24	6709	arabic	2013-01-16	active
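
The same pattern can be tried on playbook_users. The following is only a sketch: the exercise prompt is not reproduced here, and it assumes the state column can hold values other than 'active' (the sample rows show only 'active'). The aliases are illustrative.

```sql
-- Sketch: per-language row counts split by activation state.
SELECT
  language,
  COUNT(*) AS total,   -- counts rows, so a repeated user_id is counted more than once
  SUM(CASE WHEN state = 'active' THEN 1 ELSE 0 END) AS active_count,
  SUM(CASE WHEN state <> 'active' THEN 1 ELSE 0 END) AS other_count   -- NULL state falls in neither bucket
FROM playbook_users
GROUP BY language
ORDER BY total DESC;
```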

The Query Building Checklist

When writing complex aggregation queries, follow this order:

1. Start with SELECT and FROM
2. Add WHERE if you need to filter individual rows
3. Add GROUP BY with the columns you want to segment by
4. Add HAVING if you need to filter based on aggregate values
5. Add ORDER BY to sort your results
6. Test after each step

Top 10 Songs 2010

Filter by year with WHERE, group appropriately, then use ORDER BY and LIMIT.

Table: billboard_top_100_year_end

year	year_rank	group_name	artist	song_name	id
1956	1	Elvis Presley	Elvis Presley	Heartbreak Hotel	1
1956	2	Elvis Presley	Elvis Presley	Don't Be Cruel	2
1956	3	Nelson Riddle	Nelson Riddle	Lisbon Antigua	3
1956	4	Platters	Platters	My Prayer	4
1956	5	Gogi Grant	Gogi Grant	The Wayward Wind	5
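
One possible approach that follows the hint, offered as a sketch rather than the reference solution. It assumes year is stored as a number, that the wanted output is rank, group name, and song name, and that grouping is there to collapse repeated rows for the same song.

```sql
-- Sketch: the ten best-ranked songs of 2010, one row per song.
SELECT
  year_rank,
  group_name,
  song_name
FROM billboard_top_100_year_end
WHERE year = 2010                            -- row-level filter first
GROUP BY year_rank, group_name, song_name    -- collapses duplicate rows
ORDER BY year_rank                           -- rank 1 first
LIMIT 10;                                    -- PostgreSQL/MySQL syntax
```

LIMIT is not available everywhere: SQL Server uses SELECT TOP 10, and Oracle 12c and later use FETCH FIRST 10 ROWS ONLY.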

Department Workforce Analysis

Filter with WHERE before grouping, then use HAVING to filter groups.

Table: techcorp_workforce

id	first_name	last_name	department	salary	phone_number	joining_date
1	Sarah	Mitchell	HR	95000	555-0101	2021-03-15
2	Michael	Chen	HR	88000	555-0102	2022-06-01
3	Emily	Rodriguez	HR	82500		2021-09-20
4	David	Park	HR	80000	555-0104	2023-01-10
5	Lisa	Thompson	HR	65000		2021-04-05

Key Takeaways

You now have a complete toolkit for summarizing and analyzing data:

COUNT, SUM, AVG, MIN, MAX for basic aggregation
GROUP BY to segment your data
HAVING to filter aggregated results
Understanding of how NULLs behave in aggregates
CASE WHEN for conditional aggregation and pivots
The ability to combine all these techniques

What’s Next

So far, all your queries have worked with a single table. But real databases have data spread across multiple tables: customers in one, orders in another, products in a third. Module 3 introduces JOINs, which let you combine data from multiple tables. That’s when things get really powerful.
