Module 6: Windows Functions30 min

Introduction to Window Functions

Progress Tracking

Log in to save this lesson and continue from where you left off.

Log in

The Mental Shift

Window functions are one of the most common topics in data analyst SQL interviews. We’ve seen candidates who knew everything else get stuck because they couldn’t rank items within groups or calculate month-over-month change.

The good news? Once window functions click, they’re not that hard. And they’re genuinely useful. You’ll use them constantly in real analytics work, not just interviews.

The Frustrating Limitation You’ve Probably Hit

Here’s a situation you’ve likely encountered: your manager asks for a report showing each employee’s salary alongside their department’s average. “Simple,” you think. Then you start writing the query and realize… it’s not simple at all.

You know how to get employee salaries:

PostgreSQL
SELECT 
    first_name,
    department,
    salary
FROM employees;

And you know how to get department averages:

PostgreSQL
SELECT 
    department,
    AVG(salary)
FROM employees
GROUP BY department;

But combining them? That requires a subquery or a self-join, and it gets ugly fast. You end up with something that works but feels like you’re fighting SQL instead of using it.

Window functions solve this exact problem. They let you calculate aggregates while keeping every row.

The Key Insight: Calculate Without Collapsing

GROUP BY has one job: collapse rows into groups and give you one row per group. That’s great when you want a summary. But sometimes you want both the detail and the summary together.

Window functions give you that. They perform calculations across a “window” of rows, but they don’t collapse anything. Every original row stays in your result.

GROUP BY: You Get One Row Per Group

PostgreSQL
SELECT 
    department,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY department;

Result: 3 rows. One for HR, one for Admin, one for Account.

  • department HR → avg_salary 137500
  • department Admin → avg_salary 290000
  • department Account → avg_salary 116667

Your First Window Function

SQL
SELECT
  cust_id,
  total_order_cost,
  AVG(total_order_cost) OVER() AS overall_avg
FROM orders;
Table: ms_employee_salary
idfirst_namelast_namesalarydepartment_id
1ToddWilson1100001006
1ToddWilson1061191006
2JustinSimon1289221005
2JustinSimon1300001005
3KellyRosario426891002
1
Add Department Average

Show each employee with their department average salary using AVG() OVER(PARTITION BY).

Tables: ms_employee_salary

See how employees in the same department all show the same dept_avg? The calculation happens across the group, but we keep each person’s individual row.

The mental shift

Window functions add a column to your existing rows. They don't change how many rows you get back.

SQL
SELECT
  first_name,
  department_id,
  salary,
  AVG(salary) OVER() AS overall_avg,
  AVG(salary) OVER (PARTITION BY department_id) AS dept_avg,
  salary - AVG(salary) OVER (PARTITION BY department_id) AS vs_dept_avg
FROM ms_employee_salary;

An empty OVER() means “calculate this across all rows.” Everyone gets the same overall average, but you see it next to their individual salary.

PARTITION BY: Creating Sub-Windows

PARTITION BY inside the OVER() clause creates groups, similar to GROUP BY. But again, it doesn’t collapse rows.

Think of PARTITION BY as drawing invisible boxes around your data. The calculation happens inside each box, but every row stays visible.

SQL
SELECT
  first_name,
  last_name,
  salary,
  AVG(salary) OVER() AS overall_avg
FROM ms_employee_salary;

Finding Updated Records

Use ROW_NUMBER() to identify which version of each record is the most recent. This is a classic real-world pattern: you have multiple versions of something and need to find the latest.

Table: ms_employee_salary
idfirst_namelast_namesalarydepartment_id
1ToddWilson1100001006
1ToddWilson1061191006
2JustinSimon1289221005
2JustinSimon1300001005
3KellyRosario426891002
2
Finding Updated Records
View solution

We have a table with employees and their salaries; however, some of the records are old and contain outdated salary information. Since there is no timestamp, assume salary is non-decreasing over time. You can consider the current salary for an employee is the largest salary value among their records. If multiple records share the same maximum salary, return any one of them. Output their id, first name, last name, department ID, and current salary. Order your list by employee ID in ascending order.

Tables: ms_employee_salary

Key Takeaways

  • Window functions calculate across rows without collapsing them
  • OVER() is what makes something a window function
  • PARTITION BY creates groups within the window (like GROUP BY, but keeps all rows)
  • Ask yourself: “Collapse rows or keep them?” That tells you which to use
  • This concept comes up frequently in interviews.

What’s Next

Now that you get the concept, let’s look at ranking functions. ROW_NUMBER, RANK, and DENSE_RANK are the most common window functions you’ll see in interviews. They’re straightforward once you understand the difference between them.