Module 4: Multi-Step Analysis25 min

Chaining Analysis Steps

Progress Tracking

Log in to save this lesson and continue from where you left off.

Log in

Each variable is a checkpoint you can inspect. When something’s wrong, print any intermediate result to find the problem.

Name Your Variables Well

customer_totals tells you what’s in it. temp1 tells you nothing. Good variable names are documentation.

Adding Context with Merge

1
Enrich with Customer Names

The starter finds above-average spenders. Add a merge to show customer names alongside their totals.

Tables: orders, customers

Ratios and Percentages

Percentage calculations follow a template: compute the per-group number, the total, then divide.

Table: employee
idfirst_namelast_nameagesexemployee_titledepartmentsalarytargetbonusemailcityaddressmanager_id
5MaxGeorge26MSalesSales1300200150Max@company.comCalifornia2638 Richards Avenue1
13KattyBond56FManagerManagement1500000300Katty@company.comArizona1
11RicherdGear57MManagerManagement2500000300Richerd@company.comAlabama1
10JenniferDion34FSalesSales1000200150Jennifer@company.comAlabama13
19GeorgeJoe50MManagerManagement1000000300George@company.comFlorida1003 Wyatt Street1
2
Department Headcount Percentages

Calculate what percentage of the company each department represents by headcount.

Tables: employee

Customers with Large Orders

Table: online_store_customers
customer_idcustomer_name
1Alice Johnson
2Bob Smith
3Carol Williams
4David Brown
5Emma Davis
Table: online_store_orders
order_idcustomer_idamountstatus
1011150paid
1021200paid
103175paid
1042250paid
1053180paid
3
Customers with Large Orders
View solution

The marketing team wants to identify high-value customers for a premium loyalty program. Find all customers who have placed at least one order over $100. Return `customer ID` and `name`.

Tables: online_store_customers, online_store_orders

Employee and Manager Salaries

Table: employee
idfirst_namelast_nameagesexemployee_titledepartmentsalarytargetbonusemailcityaddressmanager_id
5MaxGeorge26MSalesSales1300200150Max@company.comCalifornia2638 Richards Avenue1
13KattyBond56FManagerManagement1500000300Katty@company.comArizona1
11RicherdGear57MManagerManagement2500000300Richerd@company.comAlabama1
10JenniferDion34FSalesSales1000200150Jennifer@company.comAlabama13
19GeorgeJoe50MManagerManagement1000000300George@company.comFlorida1003 Wyatt Street1
4
Employee and Manager Salaries
View solution

Find employees who are earning more than their managers. Output the employee's first name along with the corresponding salary.

Tables: employee

Key Takeaways

  • Name each step as a variable — this is pandas’ equivalent of CTEs.
  • Each variable is a checkpoint you can inspect for debugging.
  • Chain: filter → group → aggregate → merge → filter again.
  • For ratios: compute numerator and denominator separately, then divide.

What’s Next

The final lesson covers common patterns you’ll use regularly: safe division, replacing slow loops, and knowing when a one-liner is better than multiple steps.