Module 6: Windows Functions40 min

PARTITION BY and Ranking Functions

Progress Tracking

Log in to save this lesson and continue from where you left off.

Log in

Ranking Within Groups

  • “Find the top 3 products by revenue in each category.” 
  • “Get the second-highest salary in each department.” 
  • “Rank customers by total spend.”

We’ve lost count of how many times ranking questions appear in interviews. They’re popular because they test whether you actually understand window functions or just memorized some syntax. Let’s make sure you understand them.

Three Functions, Three Behaviors

SQL gives you three ranking functions. They all assign numbers based on ORDER BY, but they handle ties differently. Choosing the wrong one is a common mistake.

  • ROW_NUMBER() assigns a unique number for every row. When ties happen, order is arbitrary (1, 2, 3, 4…).
  • RANK() gives the same number for ties, then skips (1, 1, 3, 4, 4, 6…).
  • DENSE_RANK() gives the same number for ties, no skip (1, 1, 2, 3, 3, 4…).

Seeing the Difference

Let me show you with actual data. We’ll rank employees by salary:

  • salary 100000 → ROW_NUMBER = 1, RANK = 1, DENSE_RANK = 1
  • salary 100000 → ROW_NUMBER = 2, RANK = 1, DENSE_RANK = 1
  • salary 90000 → ROW_NUMBER = 3, RANK = 3, DENSE_RANK = 2
  • salary 80000 → ROW_NUMBER = 4, RANK = 4, DENSE_RANK = 3
  • salary 80000 → ROW_NUMBER = 5, RANK = 4, DENSE_RANK = 3
  • salary 70000 → ROW_NUMBER = 6, RANK = 6, DENSE_RANK = 4

Two people earn 100k. ROW_NUMBER gives them 1 and 2 (arbitrary). RANK gives them both 1, then skips to 3. DENSE_RANK gives them both 1, then goes to 2.

Why RANK skips numbers

The "skip" in RANK means the next number reflects how many people are actually ahead of you. In sports rankings, if two people tie for 2nd, the next person is 4th, not 3rd.

Your First Ranking Function

SQL
SELECT
  cust_id,
  total_order_cost,
  ROW_NUMBER() OVER(
    PARTITION BY cust_id
    ORDER BY total_order_cost DESC
  ) AS rn
FROM orders;
Table: ms_employee_salary
idfirst_namelast_namesalarydepartment_id
1ToddWilson1100001006
1ToddWilson1061191006
2JustinSimon1289221005
2JustinSimon1300001005
3KellyRosario426891002
1
Rank Employees by Salary

Rank employees within each department by salary (highest first) using ROW_NUMBER().

Tables: ms_employee_salary

If two people have the same salary, who gets 1 and who gets 2? It’s arbitrary unless you add more columns to ORDER BY. This can bite you if you’re not careful.

The “Top N Per Group” Pattern

This is the big one. Add PARTITION BY to restart numbering for each group:

PostgreSQL
SELECT 
    first_name,
    department,
    salary,
    ROW_NUMBER() OVER (
        PARTITION BY department
        ORDER BY salary DESC
    ) AS dept_rank
FROM employees;

Now HR gets its own 1, 2, 3… Admin gets its own 1, 2, 3… Perfect for finding top N in each group.

RANK vs DENSE_RANK: When Ties Matter

If you’re asked for “top 3 by revenue” and two items tie for 2nd place:

  • RANK: positions 1, 2, 2, 4 → You get items at positions 1 and 2, but skip 3
  • DENSE_RANK: positions 1, 2, 2, 3 → You get items at positions 1, 2, and 3

Which is “right” depends on the business question. Usually, when someone asks for “top 3,” they mean DENSE_RANK behavior: they want three tiers, even if ties mean more than three items.

Interview tip: ask about ties

If the question is ambiguous about ties, ask for clarification. It shows you're thinking carefully. If you don't get clarification, default to DENSE_RANK.

SQL
WITH ranked AS (
  SELECT
    first_name,
    department,
    salary,
    DENSE_RANK() OVER (
      PARTITION BY department
      ORDER BY salary DESC
    ) AS dept_rank
  FROM employee
)
SELECT *
FROM ranked
WHERE dept_rank <= 2;

You can’t filter on window functions directly in WHERE. You have to wrap it in a CTE or subquery first. This trips people up in interviews all the time.

WHERE runs before window functions

WHERE runs before the SELECT clause, and window functions are evaluated in SELECT. The window function doesn't exist yet when WHERE runs. That's why you need a CTE or subquery to filter on window function results.

Common Mistakes

Forgetting ORDER BY

PostgreSQL
-- Wrong: ranking without ORDER BY is meaningless
ROW_NUMBER() OVER (PARTITION BY department);

-- Right: always specify what you're ranking BY
ROW_NUMBER() OVER (
    PARTITION BY department 
    ORDER BY salary DESC
);

Filtering in WHERE

PostgreSQL
-- Wrong: window functions can’t be in WHERE
SELECT *
FROM employees
WHERE ROW_NUMBER() OVER (…) = 1;

-- Right: use CTE, then filter
WITH ranked AS (
    SELECT …,
           ROW_NUMBER() OVER (…) AS rn
    FROM employees
)
SELECT *
FROM ranked
WHERE rn = 1;
SQL
SELECT
  first_name,
  salary,
  ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num,
  RANK() OVER (ORDER BY salary DESC) AS rank_val,
  DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank_val
FROM employee;

Ranking Most Active Guests

Find top performers within groups. You’ll need ranking with aggregation here.

Table: airbnb_contacts
id_guestid_hostid_listingts_contact_atts_reply_atts_accepted_atts_booking_atds_checkinds_checkoutn_guestsn_messages
86b39b70-965b-479d-a0b0-719b195acea21dfb22ec-c20e-4bf9-b161-1607afa25c5ad668de42-122a-45cd-b91f-91a70895f9022014-04-18 09:32:232014-04-18 09:39:062014-12-312015-01-0275
14f943bb-74e9-458b-be55-203dc72206883347390d-8670-4870-9dab-da30f370014114c47fb8-e831-4044-9674-9b3fd04991932014-10-06 06:55:452014-10-06 10:06:382014-10-06 10:06:382014-10-06 10:06:382014-11-032014-11-0728
425aa1ed-82ab-4ecf-b62f-d61e1848706d02cafb86-5445-45cc-80f2-405291578356c5a4a913-a094-4a9d-82e2-0b2d4f9d9eeb2014-10-04 05:02:392014-10-04 23:10:012014-11-022014-11-0922
bb490ede-8a70-4d61-a2e8-625855a393e2f49c3095-58de-4b8d-9d5b-3bfceceb47d827f4b429-d544-464f-b4b5-3c09fd5992e72014-08-31 11:46:112014-08-31 16:48:282014-11-032014-11-0725
b2fda15a-89bb-4e6e-ae81-8b21598e248271f1d49e-2ff4-4d72-b8e6-fd4c67feaa7495fb78ca-8e6e-436a-9830-949d995ad14f2014-10-08 15:07:562014-10-08 15:32:122014-10-08 15:32:122014-10-08 22:21:412014-11-062014-11-09210
2
Ranking Most Active Guests
View solution

Identify the most engaged guests by ranking them according to their overall messaging activity. The most active guest, meaning the one who has exchanged the most messages with hosts, should have the highest rank. If two or more guests have the same number of messages, they should have the same rank. Importantly, the ranking shouldn't skip any numbers, even if many guests share the same rank. Present your results in a clear format, showing the rank, guest identifier, and total number of messages for each guest, ordered from the most to least active.

Tables: airbnb_contacts

Combine ranking with a CTE to filter for top results.

Table: yelp_business
business_idnameneighborhoodaddresscitystatepostal_codelatitudelongitudestarsreview_countis_opencategories
G5ERFWvPfHy7IDAUYlWL2AAll Colors Mobile Bumper Repair7137 N 28th AvePhoenixAZ8505133.45-112.07141Auto Detailing;Automotive
0jDvRJS-z9zdMgOUXgr6rASunfare811 W Deer Valley RdPhoenixAZ8502733.68-112.085271Personal Chefs;Food;Gluten-Free;Food Delivery Services;Event Planning & Services;Restaurants
6HmDqeNNZtHMK0t2glF_ggDry Clean VegasSoutheast2550 Windmill Ln, Ste 100Las VegasNV8912336.04-115.12141Dry Cleaning & Laundry;Laundry Services;Local Services;Dry Cleaning
pbt3SBcEmxCfZPdnmU9tNAThe Cuyahoga Room740 Munroe Falls AveCuyahoga FallsOH4422141.14-81.47130Wedding Planning;Caterers;Event Planning & Services;Venues & Event Spaces
CX8pfLn7Bk9o2-8yDMp_2wThe UPS Store4815 E Carefree Hwy, Ste 108Cave CreekAZ8533133.8-111.981.551Notaries;Printing Services;Local Services;Shipping Centers
3
Top Businesses With Most Reviews
View solution

Find the top 5 businesses with most reviews. Assume that each row has a unique business_id such that the total reviews for each business is listed on each row. Output the business name along with the total number of reviews and order your results by the total reviews in descending order. If there are ties in review counts, businesses with the same number of reviews receive the same rank, and subsequent ranks are skipped accordingly (e.g., if two businesses tie for rank 4, the next business receives rank 6, skipping rank 5).

Tables: yelp_business

Find the second-highest salary. Classic DENSE_RANK() with CTE pattern.

Table: employee
idfirst_namelast_nameagesexemployee_titledepartmentsalarytargetbonusemailcityaddressmanager_id
5MaxGeorge26MSalesSales1300200150Max@company.comCalifornia2638 Richards Avenue1
13KattyBond56FManagerManagement1500000300Katty@company.comArizona1
11RicherdGear57MManagerManagement2500000300Richerd@company.comAlabama1
10JenniferDion34FSalesSales1000200150Jennifer@company.comAlabama13
19GeorgeJoe50MManagerManagement1000000300George@company.comFlorida1003 Wyatt Street1
4
Second Highest Salary
View solution

Find the second highest salary of employees.

Tables: employee

Key Takeaways

  • ROW_NUMBER: unique numbers, arbitrary for ties
  • RANK: ties get same number, then skip (1, 1, 3…)
  • DENSE_RANK: ties get same number, no skip (1, 1, 2…)
  • Always include ORDER BY in ranking functions
  • Filter on ranks using a CTE or subquery, not WHERE

What’s Next

Ranking tells you position. But what about comparing to the previous row? What was last month’s revenue? How does this quarter compare to last? That’s LAG and LEAD, and they’re up next.