Module 2: Aggregation & Grouping20 min

Filtering Groups

Progress Tracking

Log in to save this lesson and continue from where you left off.

Log in

The Problem

You know how to filter rows before grouping — that’s just boolean indexing before .groupby(). But what if you want to filter based on the aggregated result? For example: only departments with more than 5 employees, or only months with revenue over $100K.

You can’t do this with a regular filter before grouping, because the count doesn’t exist until after you group. You need to filter the grouped result.

Filtering After Grouping

Python
order_counts = (
    orders.groupby("cust_id")["id"]
    .count()
    .reset_index(name="order_count")
)
order_counts[order_counts["order_count"] >= 3]
Table: techcorp_workforce
idfirst_namelast_namedepartmentsalaryphone_numberjoining_date
1SarahMitchellHR95000555-01012021-03-15
2MichaelChenHR88000555-01022022-06-01
3EmilyRodriguezHR825002021-09-20
4DavidParkHR80000555-01042023-01-10
5LisaThompsonHR650002021-04-05
1
Find Large Departments

Group `techcorp_workforce` by department, count employees, and keep only departments with 3 or more.

Tables: techcorp_workforce

That's it. Group, aggregate into a new DataFrame, then filter it like any other DataFrame. No special syntax needed.

This Is Pandas' HAVING

In SQL, you'd use HAVING COUNT(*) > 5 to filter groups. In pandas, you just filter the grouped result with boolean indexing — the same technique you already know.

This distinction matters:

  • Before grouping — filter individual rows (e.g., only employees hired after 2022)
  • After grouping — filter groups by their aggregated values (e.g., only departments with 5+ people)

You can combine both:

2
Filter Before and After

Filter to recent hires first, then find departments with more than 2 of those hires.

Tables: techcorp_workforce

Multiple Conditions on Groups

3
Multiple Group Filters

Find departments with more than 2 employees and an average salary over 75,000.

Tables: techcorp_workforce

Departments With 5 Employees

Table: employee
idfirst_namelast_nameagesexemployee_titledepartmentsalarytargetbonusemailcityaddressmanager_id
5MaxGeorge26MSalesSales1300200150Max@company.comCalifornia2638 Richards Avenue1
13KattyBond56FManagerManagement1500000300Katty@company.comArizona1
11RicherdGear57MManagerManagement2500000300Richerd@company.comAlabama1
10JenniferDion34FSalesSales1000200150Jennifer@company.comAlabama13
19GeorgeJoe50MManagerManagement1000000300George@company.comFlorida1003 Wyatt Street1
4
Departments With 5 Employees
View solution

Find departments with at more than or equal 5 employees.

Tables: employee

Key Takeaways

  • Filter before grouping to narrow which rows enter the groups.
  • Filter after grouping to keep only groups meeting a threshold.
  • No special syntax — just boolean indexing on the grouped result.
  • Combine both: filter rows → group → aggregate → filter groups.

What’s Next

Aggregations silently skip NaN values. That’s usually fine, but sometimes it produces misleading numbers. The next lesson covers how NaN behaves in aggregations and how to handle it.