Introduction to Window Functions
Progress Tracking
Log in to save this lesson and continue from where you left off.
The Mental Shift
Window functions are one of the most common topics in data analyst SQL interviews. We’ve seen candidates who knew everything else get stuck because they couldn’t rank items within groups or calculate month-over-month change.
The good news? Once window functions click, they’re not that hard. And they’re genuinely useful. You’ll use them constantly in real analytics work, not just interviews.
The Frustrating Limitation You’ve Probably Hit
Here’s a situation you’ve likely encountered: your manager asks for a report showing each employee’s salary alongside their department’s average. “Simple,” you think. Then you start writing the query and realize… it’s not simple at all.
You know how to get employee salaries:
SELECT
first_name,
department,
salary
FROM employees;And you know how to get department averages:
SELECT
department,
AVG(salary)
FROM employees
GROUP BY department;But combining them? That requires a subquery or a self-join, and it gets ugly fast. You end up with something that works but feels like you’re fighting SQL instead of using it.
Window functions solve this exact problem. They let you calculate aggregates while keeping every row.
The Key Insight: Calculate Without Collapsing
GROUP BY has one job: collapse rows into groups and give you one row per group. That’s great when you want a summary. But sometimes you want both the detail and the summary together.
Window functions give you that. They perform calculations across a “window” of rows, but they don’t collapse anything. Every original row stays in your result.
GROUP BY: You Get One Row Per Group
SELECT
department,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department;Result: 3 rows. One for HR, one for Admin, one for Account.
departmentHR →avg_salary137500departmentAdmin →avg_salary290000departmentAccount →avg_salary116667
Your First Window Function
SELECT
cust_id,
total_order_cost,
AVG(total_order_cost) OVER() AS overall_avg
FROM orders;| id | first_name | last_name | salary | department_id |
|---|---|---|---|---|
| 1 | Todd | Wilson | 110000 | 1006 |
| 1 | Todd | Wilson | 106119 | 1006 |
| 2 | Justin | Simon | 128922 | 1005 |
| 2 | Justin | Simon | 130000 | 1005 |
| 3 | Kelly | Rosario | 42689 | 1002 |
Show each employee with their department average salary using AVG() OVER(PARTITION BY).
See how employees in the same department all show the same dept_avg? The calculation happens across the group, but we keep each person’s individual row.
Window functions add a column to your existing rows. They don't change how many rows you get back.
SELECT
first_name,
department_id,
salary,
AVG(salary) OVER() AS overall_avg,
AVG(salary) OVER (PARTITION BY department_id) AS dept_avg,
salary - AVG(salary) OVER (PARTITION BY department_id) AS vs_dept_avg
FROM ms_employee_salary;An empty OVER() means “calculate this across all rows.” Everyone gets the same overall average, but you see it next to their individual salary.
PARTITION BY: Creating Sub-Windows
PARTITION BY inside the OVER() clause creates groups, similar to GROUP BY. But again, it doesn’t collapse rows.
Think of PARTITION BY as drawing invisible boxes around your data. The calculation happens inside each box, but every row stays visible.
SELECT
first_name,
last_name,
salary,
AVG(salary) OVER() AS overall_avg
FROM ms_employee_salary;Finding Updated Records
Use ROW_NUMBER() to identify which version of each record is the most recent. This is a classic real-world pattern: you have multiple versions of something and need to find the latest.
| id | first_name | last_name | salary | department_id |
|---|---|---|---|---|
| 1 | Todd | Wilson | 110000 | 1006 |
| 1 | Todd | Wilson | 106119 | 1006 |
| 2 | Justin | Simon | 128922 | 1005 |
| 2 | Justin | Simon | 130000 | 1005 |
| 3 | Kelly | Rosario | 42689 | 1002 |
We have a table with employees and their salaries; however, some of the records are old and contain outdated salary information. Since there is no timestamp, assume salary is non-decreasing over time. You can consider the current salary for an employee is the largest salary value among their records. If multiple records share the same maximum salary, return any one of them. Output their id, first name, last name, department ID, and current salary. Order your list by employee ID in ascending order.
Key Takeaways
- Window functions calculate across rows without collapsing them
OVER()is what makes something a window functionPARTITION BYcreates groups within the window (likeGROUP BY, but keeps all rows)- Ask yourself: “Collapse rows or keep them?” That tells you which to use
- This concept comes up frequently in interviews.
What’s Next
Now that you get the concept, let’s look at ranking functions. ROW_NUMBER, RANK, and DENSE_RANK are the most common window functions you’ll see in interviews. They’re straightforward once you understand the difference between them.