Ranking
Progress Tracking
Log in to save this lesson and continue from where you left off.
The .rank() Method
Ranking within groups is a frequent window function pattern in interviews: "top 3 products per category", "highest earner per department", "most recent order per customer". The .rank() method’s method parameter gives you three different ranking behaviors — each maps to a different SQL window function. Knowing when to use which is crucial.
.rank() assigns ranks. The method parameter controls how ties are handled:
# method="first" = ROW_NUMBER (no ties, by position)
df["row_num"] = df.groupby("dept")["salary"].rank(
method="first", ascending=False
)
# method="min" = RANK (ties get same rank, gaps after)
df["rank"] = df.groupby("dept")["salary"].rank(
method="min", ascending=False
)
# method="dense" = DENSE_RANK (ties same, no gaps)
df["dense_rank"] = df.groupby("dept")["salary"].rank(
method="dense", ascending=False
)Comparing All Three Methods
df = ms_employee_salary.copy()
df["row_num"] = df.groupby("department_id")["salary"].rank(
method="first", ascending=False
)
df["rank"] = df.groupby("department_id")["salary"].rank(
method="min", ascending=False
)
df["dense"] = df.groupby("department_id")["salary"].rank(
method="dense", ascending=False
)
df.sort_values(["department_id", "salary"], ascending=[True, False])The Top-N Per Group Pattern
Rank employees by salary within each department, then filter to the top 2.
Building It Yourself
Find the highest-paid employee in each department. Only one per department.
For the top 1 per group, .groupby().apply(lambda g: g.nlargest(1, "salary")) also works. But .rank() + filter is more flexible and closer to SQL’s ROW_NUMBER() pattern.
Ranking Most Active Guests
| guest_id | nationality | gender | age |
|---|---|---|---|
| 0 | Mali | M | 21 |
| 1 | China | F | 23 |
| 2 | Mali | F | 27 |
| 3 | Australia | F | 24 |
| 4 | Luxembourg | M | 19 |
Identify the most engaged guests by ranking them according to their overall messaging activity. The most active guest, meaning the one who has exchanged the most messages with hosts, should have the highest rank. If two or more guests have the same number of messages, they should have the same rank. Importantly, the ranking shouldn't skip any numbers, even if many guests share the same rank. Present your results in a clear format, showing the rank, guest identifier, and total number of messages for each guest, ordered from the most to least active.
Top Businesses With Most Reviews
| business_id | name | neighborhood | address | city | state | postal_code | latitude | longitude | stars | review_count | is_open | categories |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| G5ERFWvPfHy7IDAUYlWL2A | All Colors Mobile Bumper Repair | 7137 N 28th Ave | Phoenix | AZ | 85051 | 33.45 | -112.07 | 1 | 4 | 1 | Auto Detailing;Automotive | |
| 0jDvRJS-z9zdMgOUXgr6rA | Sunfare | 811 W Deer Valley Rd | Phoenix | AZ | 85027 | 33.68 | -112.08 | 5 | 27 | 1 | Personal Chefs;Food;Gluten-Free;Food Delivery Services;Event Planning & Services;Restaurants | |
| 6HmDqeNNZtHMK0t2glF_gg | Dry Clean Vegas | Southeast | 2550 Windmill Ln, Ste 100 | Las Vegas | NV | 89123 | 36.04 | -115.12 | 1 | 4 | 1 | Dry Cleaning & Laundry;Laundry Services;Local Services;Dry Cleaning |
| pbt3SBcEmxCfZPdnmU9tNA | The Cuyahoga Room | 740 Munroe Falls Ave | Cuyahoga Falls | OH | 44221 | 41.14 | -81.47 | 1 | 3 | 0 | Wedding Planning;Caterers;Event Planning & Services;Venues & Event Spaces | |
| CX8pfLn7Bk9o2-8yDMp_2w | The UPS Store | 4815 E Carefree Hwy, Ste 108 | Cave Creek | AZ | 85331 | 33.8 | -111.98 | 1.5 | 5 | 1 | Notaries;Printing Services;Local Services;Shipping Centers |
Find the top 5 businesses with most reviews. Assume that each row has a unique business_id such that the total reviews for each business is listed on each row. Output the business name along with the total number of reviews and order your results by the total reviews in descending order. If there are ties in review counts, businesses with the same number of reviews receive the same rank, and subsequent ranks are skipped accordingly (e.g., if two businesses tie for rank 4, the next business receives rank 6, skipping rank 5).
Second Highest Salary
| id | first_name | last_name | age | sex | employee_title | department | salary | target | bonus | city | address | manager_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | Max | George | 26 | M | Sales | Sales | 1300 | 200 | 150 | Max@company.com | California | 2638 Richards Avenue | 1 |
| 13 | Katty | Bond | 56 | F | Manager | Management | 150000 | 0 | 300 | Katty@company.com | Arizona | 1 | |
| 11 | Richerd | Gear | 57 | M | Manager | Management | 250000 | 0 | 300 | Richerd@company.com | Alabama | 1 | |
| 10 | Jennifer | Dion | 34 | F | Sales | Sales | 1000 | 200 | 150 | Jennifer@company.com | Alabama | 13 | |
| 19 | George | Joe | 50 | M | Manager | Management | 100000 | 0 | 300 | George@company.com | Florida | 1003 Wyatt Street | 1 |
Find the second highest salary of employees.
Key Takeaways
method="first"= ROW_NUMBER (unique ranks).method="min"= RANK (ties share rank, gaps after).method="dense"= DENSE_RANK (ties share rank, no gaps).- Top-N per group: rank → filter. A key interview pattern.
ascending=Falsefor highest-first ranking.
What’s Next
Ranking tells you position. But what about comparing to the previous or next row? That’s .shift(), the pandas equivalent of LAG/LEAD.