Module 2: Aggregating & Grouping Data40 min

GROUP BY Fundamentals

Progress Tracking

Log in to save this lesson and continue from where you left off.

Log in

One Number Per Group

In the last lesson, you got a single total for the whole table. Useful, but limited. Real business questions usually need breakdowns: revenue by month, customers by region, headcount by department.

GROUP BY is how you get there. It splits your data into groups, then applies your aggregate function to each group separately.

Your First GROUP BY

SQL
SELECT
  cust_id,
  COUNT(*) AS order_count
FROM orders
GROUP BY cust_id;
Table: techcorp_workforce
idfirst_namelast_namedepartmentsalaryphone_numberjoining_date
1SarahMitchellHR95000555-01012021-03-15
2MichaelChenHR88000555-01022022-06-01
3EmilyRodriguezHR825002021-09-20
4DavidParkHR80000555-01042023-01-10
5LisaThompsonHR650002021-04-05
1
Count Employees per Department

Count the number of employees in each department from `techcorp_workforce`.

Tables: techcorp_workforce

Instead of one count for everyone, you get one count per department. The database groups all HR employees, counts them, and then does the same for Admin, Account, and so on.

The Golden Rule

Here’s something many beginners find confusing: when you use GROUP BY, every column in your SELECT must either be in your GROUP BY clause or inside an aggregate function. No exceptions.

SQL
-- This works: department is in GROUP BY
SELECT
  department,
  AVG(salary)
FROM techcorp_workforce
GROUP BY department;
SQL
-- This fails: first_name isn’t grouped or aggregated
SELECT
  department,
  first_name,
  AVG(salary)
FROM techcorp_workforce
GROUP BY department;

Why? Think about it: if you’re grouping by department, and HR has four employees, which first name should SQL show? It can’t pick one, so it refuses to run.

Most common GROUP BY error

If your query fails, check that every non-aggregated column is in your GROUP BY.

Sorting Your Groups

GROUP BY results come back in no particular order. Usually, you want them sorted — either alphabetically or by the aggregate value:

SQL
-- Sort by department name
SELECT
  department,
  SUM(salary) AS dept_payroll
FROM techcorp_workforce
GROUP BY department
ORDER BY department;

MacBookPro User Event Count

Group by event name and use COUNT.

Table: playbook_events
user_idoccurred_atevent_typeevent_namelocationdevice
69912014-06-09 18:26:54engagementhome_pageUnited Statesiphone 5
188512014-08-29 13:18:38signup_flowenter_infoRussiaasus chromebook
149982014-07-01 12:47:56engagementloginFrancehp pavilion desktop
81862014-05-23 10:44:16engagementhome_pageItalymacbook pro
96262014-07-31 17:15:14engagementloginRussianexus 7
2
MacBookPro User Event Count
View solution

Count the number of user events performed by MacBookPro users. Output the result along with the event name. Sort the result based on the event count in the descending order.

Tables: playbook_events

Artist Appearance Count

Group by artist and count how many times they appear.

Table: spotify_worldwide_daily_song_ranking
idpositiontracknameartiststreamsurlstream_dateregion
30365152Heart Won't ForgetMatoma28047https://open.spotify.com/track/2of2DM5LqTh7ohmmVXUKsH2017-02-04no
85559160Someone In The Crowd - From "La La Land" SoundtrackEmma Stone17134https://open.spotify.com/track/7xE4vKvjqUTtHyJ9zi0k1q2017-02-26fr
1046089175The GreatestSia10060https://open.spotify.com/track/7xHWNBFm6ObGEQPaUxHuKO2017-03-06cl
35082425UnforgettableFrench Montana46603https://open.spotify.com/track/3B54sVLJ402zGa6Xm4YGNe2017-10-01no
7768221Bad and Boujee (feat. Lil Uzi Vert)Migos1823391https://open.spotify.com/track/4Km5HrUvYTaSUfiSGPJeQR2017-01-27us
3
Artist Appearance Count
View solution

Find how many times each artist appeared on the Spotify ranking list. Output the artist name along with the corresponding number of occurrences. Order records by the number of occurrences in descending order.

Tables: spotify_worldwide_daily_song_ranking

Top Ranked Songs

You’ll need GROUP BY with a WHERE filter here.

Table: spotify_worldwide_daily_song_ranking
idpositiontracknameartiststreamsurlstream_dateregion
30365152Heart Won't ForgetMatoma28047https://open.spotify.com/track/2of2DM5LqTh7ohmmVXUKsH2017-02-04no
85559160Someone In The Crowd - From "La La Land" SoundtrackEmma Stone17134https://open.spotify.com/track/7xE4vKvjqUTtHyJ9zi0k1q2017-02-26fr
1046089175The GreatestSia10060https://open.spotify.com/track/7xHWNBFm6ObGEQPaUxHuKO2017-03-06cl
35082425UnforgettableFrench Montana46603https://open.spotify.com/track/3B54sVLJ402zGa6Xm4YGNe2017-10-01no
7768221Bad and Boujee (feat. Lil Uzi Vert)Migos1823391https://open.spotify.com/track/4Km5HrUvYTaSUfiSGPJeQR2017-01-27us
4
Top Ranked Songs
View solution

Find songs that have ranked in the top position. Output the track name and the number of times it ranked at the top. Sort your records by the number of times the song was in the top position in descending order.

Tables: spotify_worldwide_daily_song_ranking

Key Takeaways

  • GROUP BY splits your data into groups before aggregating
  • Every SELECT column must be in GROUP BY or an aggregate function
  • You can group by multiple columns for finer breakdowns
  • Use ORDER BY to sort your results (it comes after GROUP BY)

What’s Next

GROUP BY creates your groups, but what if you only want some of them? Say, only departments with more than five employees, or only months with revenue over $100K. You can’t use WHERE for this — WHERE filters rows before grouping happens. You need something that filters after. That’s HAVING, and it’s next.