Module 2: Aggregating & Grouping Data35 min

Introduction to Aggregate Functions

Progress Tracking

Log in to save this lesson and continue from where you left off.

Log in

Beyond Individual Rows

In Module 1, you learned to retrieve and filter individual rows. That’s useful, but it’s not how most business questions are phrased.

Nobody asks, “Show me row 47 from the orders table.” They ask:

  • How many customers do we have?
  • What’s our total revenue this quarter?
  • What’s the average order value?

These questions need you to crunch multiple rows down into a single answer.

That’s what aggregate functions do. They take a pile of rows and collapse them into one summary value.

The Five Functions You’ll Use Constantly

SQL gives you five core aggregate functions. You’ll use these in almost every analysis you do.

COUNT: How Many?

COUNT is probably the one you’ll reach for most. It tells you how many rows match your criteria.

SQL
SELECT COUNT(*)
FROM techcorp_workforce;

That’s it. One number back: the total rows in the table.

But COUNT has three flavors, and knowing the difference matters:

SQL
-- Count all rows (including those with NULLs)
SELECT COUNT(*)
FROM techcorp_workforce;

-- Count rows where department isn’t NULL
SELECT COUNT(department)
FROM techcorp_workforce;

-- Count unique departments
SELECT COUNT(DISTINCT department)
FROM techcorp_workforce;

Rows vs Entities: Why DISTINCT Matters

In real data, you often have multiple rows per entity. Say your orders table has 5 rows for customer 101 (five separate orders). COUNT(*) gives you 5 (the number of orders), but COUNT(DISTINCT customer_id) gives you 1 (the number of customers).

Always ask yourself: Am I counting rows or entities? Getting this wrong is one of the most common sources of incorrect numbers in reports.

SUM: Add It Up

SUM does exactly what you’d expect.

Table: techcorp_workforce
idfirst_namelast_namedepartmentsalaryphone_numberjoining_date
1SarahMitchellHR95000555-01012021-03-15
2MichaelChenHR88000555-01022022-06-01
3EmilyRodriguezHR825002021-09-20
4DavidParkHR80000555-01042023-01-10
5LisaThompsonHR650002021-04-05
1
Try SUM

Calculate the total payroll by summing all salaries. Alias it as total_payroll.

Tables: techcorp_workforce

One thing to know: SUM ignores NULLs. If you have salaries of 50000, 60000, and NULL, you get 110000, not an error.

AVG: The Average

AVG calculates the mean:

SQL
SELECT AVG(salary) AS average_salary
FROM techcorp_workforce;
AVG ignores NULLs

AVG also ignores NULLs. If you have values 100, 200, and NULL, the average is 150 (300 divided by 2), not 100 (300 divided by 3). This is usually what you want, but be aware of it.

MIN and MAX: The Extremes

These find the smallest and largest values.

2
Find Salary Extremes

Add the maximum salary alongside the minimum to see both extremes.

Tables: techcorp_workforce

They work on text too — MIN gives you the first alphabetically, MAX gives you the last. And yes, they work on dates: MIN is the earliest, MAX is the most recent.

Using Multiple Aggregates Together

Here’s something nice: you can use as many aggregate functions as you want in a single query.

SQL
SELECT
  COUNT(*) AS total_employees,
  SUM(salary) AS total_payroll,
  AVG(salary) AS avg_salary,
  MIN(salary) AS min_salary,
  MAX(salary) AS max_salary
FROM techcorp_workforce;

One query, five useful numbers. This is the kind of summary you’d put at the top of a report.

3
Summarize Salaries

Calculate the total, average, minimum, and maximum salary. Give each column a clear alias.

Tables: techcorp_workforce

Samantha's and Lisa's Total Sales Revenue

Use SUM to add up values for specific conditions.

Table: sales_performance
salespersonwidget_salessales_revenueid
Jim810405001
Bobby661330502
Samantha1006503003
Taylor984492004
Tom403201505
4
Calculate Samantha's and Lisa's total sales revenue
View solution

What is the total sales revenue of Samantha and Lisa?

Tables: sales_performance

Olympics Events List By Age

Combine MIN, AVG, and MAX to summarize a column in one query.

Table: olympics_athletes_events
idnamesexageheightweightteamnocgamesyearseasoncitysporteventmedal
3520Guillermo J. AmparanMMexicoMEX1924 Summer1924SummerParisAthleticsAthletics Men's 800 metres
35394Henry John FinchettMGreat BritainGBR1924 Summer1924SummerParisGymnasticsGymnastics Men's Rings
21918Georg Frederik Ahrensborg ClausenM28DenmarkDEN1924 Summer1924SummerParisCyclingCycling Men's Road Race Individual
110345Marinus Cornelis Dick SigmondM26NetherlandsNED1924 Summer1924SummerParisFootballFootball Men's Football
54193Thodore Tho JeitzM26LuxembourgLUX1924 Summer1924SummerParisGymnasticsGymnastics Men's Individual All-Around
5
Olympics Events List By Age
View solution

Find the lowest, average, and the highest ages of athletes across all Olympics. HINT: If athlete participated in more than one discipline at one Olympic games, consider it as a separate athlete, no need to remove such edge cases.

Tables: olympics_athletes_events

Key Takeaways

  • COUNT, SUM, AVG, MIN, MAX are your five core aggregate functions
  • All of them except COUNT(*) ignore NULL values
  • COUNT(*) counts all rows; COUNT(column) counts non-NULL values
  • COUNT(DISTINCT column) counts unique values
  • Ask yourself: am I counting rows or entities?
  • You can combine multiple aggregates in one SELECT

What’s Next

Right now, you’re getting one number for the entire table. But what if you need revenue by region? Headcount by department? Average salary by job title? That’s where GROUP BY comes in. It lets you split your data into groups and aggregate each one separately.