Introduction to Aggregate Functions
Progress Tracking
Log in to save this lesson and continue from where you left off.
Beyond Individual Rows
In Module 1, you learned to retrieve and filter individual rows. That’s useful, but it’s not how most business questions are phrased.
Nobody asks, “Show me row 47 from the orders table.” They ask:
- How many customers do we have?
- What’s our total revenue this quarter?
- What’s the average order value?
These questions need you to crunch multiple rows down into a single answer.
That’s what aggregate functions do. They take a pile of rows and collapse them into one summary value.
The Five Functions You’ll Use Constantly
SQL gives you five core aggregate functions. You’ll use these in almost every analysis you do.
COUNT: How Many?
COUNT is probably the one you’ll reach for most. It tells you how many rows match your criteria.
SELECT COUNT(*)
FROM techcorp_workforce;That’s it. One number back: the total rows in the table.
But COUNT has three flavors, and knowing the difference matters:
-- Count all rows (including those with NULLs)
SELECT COUNT(*)
FROM techcorp_workforce;
-- Count rows where department isn’t NULL
SELECT COUNT(department)
FROM techcorp_workforce;
-- Count unique departments
SELECT COUNT(DISTINCT department)
FROM techcorp_workforce;Rows vs Entities: Why DISTINCT Matters
In real data, you often have multiple rows per entity. Say your orders table has 5 rows for customer 101 (five separate orders). COUNT(*) gives you 5 (the number of orders), but COUNT(DISTINCT customer_id) gives you 1 (the number of customers).
Always ask yourself: Am I counting rows or entities? Getting this wrong is one of the most common sources of incorrect numbers in reports.
SUM: Add It Up
SUM does exactly what you’d expect.
| id | first_name | last_name | department | salary | phone_number | joining_date |
|---|---|---|---|---|---|---|
| 1 | Sarah | Mitchell | HR | 95000 | 555-0101 | 2021-03-15 |
| 2 | Michael | Chen | HR | 88000 | 555-0102 | 2022-06-01 |
| 3 | Emily | Rodriguez | HR | 82500 | 2021-09-20 | |
| 4 | David | Park | HR | 80000 | 555-0104 | 2023-01-10 |
| 5 | Lisa | Thompson | HR | 65000 | 2021-04-05 |
Calculate the total payroll by summing all salaries. Alias it as total_payroll.
One thing to know: SUM ignores NULLs. If you have salaries of 50000, 60000, and NULL, you get 110000, not an error.
AVG: The Average
AVG calculates the mean:
SELECT AVG(salary) AS average_salary
FROM techcorp_workforce;AVG also ignores NULLs. If you have values 100, 200, and NULL, the average is 150 (300 divided by 2), not 100 (300 divided by 3). This is usually what you want, but be aware of it.
MIN and MAX: The Extremes
These find the smallest and largest values.
Add the maximum salary alongside the minimum to see both extremes.
They work on text too — MIN gives you the first alphabetically, MAX gives you the last. And yes, they work on dates: MIN is the earliest, MAX is the most recent.
Using Multiple Aggregates Together
Here’s something nice: you can use as many aggregate functions as you want in a single query.
SELECT
COUNT(*) AS total_employees,
SUM(salary) AS total_payroll,
AVG(salary) AS avg_salary,
MIN(salary) AS min_salary,
MAX(salary) AS max_salary
FROM techcorp_workforce;One query, five useful numbers. This is the kind of summary you’d put at the top of a report.
Calculate the total, average, minimum, and maximum salary. Give each column a clear alias.
Samantha's and Lisa's Total Sales Revenue
Use SUM to add up values for specific conditions.
| salesperson | widget_sales | sales_revenue | id |
|---|---|---|---|
| Jim | 810 | 40500 | 1 |
| Bobby | 661 | 33050 | 2 |
| Samantha | 1006 | 50300 | 3 |
| Taylor | 984 | 49200 | 4 |
| Tom | 403 | 20150 | 5 |
What is the total sales revenue of Samantha and Lisa?
Olympics Events List By Age
Combine MIN, AVG, and MAX to summarize a column in one query.
| id | name | sex | age | height | weight | team | noc | games | year | season | city | sport | event | medal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3520 | Guillermo J. Amparan | M | Mexico | MEX | 1924 Summer | 1924 | Summer | Paris | Athletics | Athletics Men's 800 metres | ||||
| 35394 | Henry John Finchett | M | Great Britain | GBR | 1924 Summer | 1924 | Summer | Paris | Gymnastics | Gymnastics Men's Rings | ||||
| 21918 | Georg Frederik Ahrensborg Clausen | M | 28 | Denmark | DEN | 1924 Summer | 1924 | Summer | Paris | Cycling | Cycling Men's Road Race Individual | |||
| 110345 | Marinus Cornelis Dick Sigmond | M | 26 | Netherlands | NED | 1924 Summer | 1924 | Summer | Paris | Football | Football Men's Football | |||
| 54193 | Thodore Tho Jeitz | M | 26 | Luxembourg | LUX | 1924 Summer | 1924 | Summer | Paris | Gymnastics | Gymnastics Men's Individual All-Around |
Find the lowest, average, and the highest ages of athletes across all Olympics. HINT: If athlete participated in more than one discipline at one Olympic games, consider it as a separate athlete, no need to remove such edge cases.
Key Takeaways
COUNT,SUM,AVG,MIN,MAXare your five core aggregate functions- All of them except
COUNT(*)ignoreNULLvalues COUNT(*)counts all rows;COUNT(column)counts non-NULLvaluesCOUNT(DISTINCT column)counts unique values- Ask yourself: am I counting rows or entities?
- You can combine multiple aggregates in one
SELECT
What’s Next
Right now, you’re getting one number for the entire table. But what if you need revenue by region? Headcount by department? Average salary by job title? That’s where GROUP BY comes in. It lets you split your data into groups and aggregate each one separately.