Patterns for Common Problems
Progress Tracking
Log in to save this lesson and continue from where you left off.
Safe Division
Division by zero is a common source of production bugs. Your analysis works perfectly on test data, then crashes at 2 AM when one department has zero employees. Always protect division — it’s a habit that separates hobby code from production code.
# Dangerous: crashes if denominator is 0
ratio = numerator / denominator
# Safe: returns NaN instead of crashing
denominator = denominator.replace(0, float("nan"))
ratio = numerator / denominatorFor Series division, use .replace(0, float("nan")) on the denominator. For scalar division, check if total > 0 before dividing.
Pre-Aggregate to Avoid Slow Loops
If you find yourself writing a loop over groups, stop. .transform() or groupby-then-merge is almost always faster and cleaner.
# Slow: Python loop over departments
for dept in employee['department'].unique():
dept_avg = employee[employee['department'] == dept]['salary'].mean()
...
# Fast: one-line transform
employee["dept_avg"] = (
employee.groupby("department")["salary"].transform("mean")
)Transform Then Filter
The starter computes department averages. Add a filter for above-average earners and select the relevant columns.
When a One-Liner is Better
Not everything needs multiple steps. Simple lookups are cleaner as one-liners:
# One-liner: perfectly readable
employee[employee["salary"] == employee["salary"].max()]
# Also works, but more code than needed: same result, more code
max_sal = employee["salary"].max()
employee[employee["salary"] == max_sal]Use intermediate variables when: the logic has 3+ steps, you need to reuse a result, or the one-liner is hard to read.
Ratio Calculation
Calculate each department’s total salary as a percentage of the company total. Show only departments over 20%.
Highest Target Under Manager
| id | first_name | last_name | age | sex | employee_title | department | salary | target | bonus | city | address | manager_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Allen | Wang | 55 | F | Manager | Management | 200000 | 600 | 300 | Allen@company.com | California | 1069 Ventura Drive | 1 |
| 20 | Sarrah | Bicky | 31 | F | Senior Sales | Sales | 2000 | 500 | 150 | Sarrah@company.com | Florida | 1176 Tyler Avenue | 19 |
| 7 | Nicky | Bat | 33 | F | Sales | Sales | 1400 | 400 | 100 | Molly@company.com | Arizona | 3461 Preston Street | 13 |
| 31 | Steve | Smith | 39 | M | Sales | Sales | 1500 | 400 | 125 | steve@company.com | Texas | 13 | |
| 32 | David | Warner | 38 | M | Sales | Sales | 1200 | 400 | 150 | david@company.com | Florida | 13 |
Identify the employee(s) working under manager `manager_id=13` who have achieved the highest target. Return each such employee’s first name alongside the target value. The goal is to display the maximum target among all employees under `manager_id=13` and show which employee(s) reached that top value.
Income By Title and Gender
| id | first_name | last_name | age | sex | employee_title | department | salary | target | city | address | manager_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | Max | George | 26 | M | Sales | Sales | 1300 | 200 | Max@company.com | California | 2638 Richards Avenue | 1 |
| 13 | Katty | Bond | 56 | F | Manager | Management | 150000 | 0 | Katty@company.com | Arizona | 1 | |
| 11 | Richerd | Gear | 57 | M | Manager | Management | 250000 | 0 | Richerd@company.com | Alabama | 1 | |
| 10 | Jennifer | Dion | 34 | F | Sales | Sales | 1000 | 200 | Jennifer@company.com | Alabama | 13 | |
| 19 | George | Joe | 50 | M | Manager | Management | 100000 | 0 | George@company.com | Florida | 1003 Wyatt Street | 1 |
| worker_ref_id | bonus |
|---|---|
| 1 | 5000 |
| 2 | 3000 |
| 3 | 4000 |
| 1 | 4500 |
| 2 | 3500 |
Find the average total compensation based on employee titles and gender. Total compensation is calculated by adding both the salary and bonus of each employee. However, not every employee receives a bonus so disregard employees without bonuses in your calculation. Employee can receive more than one bonus. Output the employee title, gender (i.e., sex), along with the average total compensation.
Top Cool Votes
| business_name | review_id | user_id | stars | review_date | review_text | funny | useful | cool |
|---|---|---|---|---|---|---|---|---|
| AutohausAZ | C4TSIEcazRay0qIRPeMAFg | jlbPUcCRiXlMtarzi9sW5w | 5 | 2011-06-27 | Autohaus is my main source for parts for an old Mercedes that has been in the family since new that I am restoring. The old beater is truly a labor of | 1 | 2 | 1 |
| Citizen Public House | ZZ0paqUsSX-VJbfodTp1cQ | EeCWSGwMAPzwe_c1Aumd1w | 4 | 2013-03-18 | First time in PHX. Friend recommended. Friendly waitstaff. They were understaffed and extremely busy, but we never knew. The short ribs were tende | 0 | 0 | 0 |
| Otto Pizza & Pastry | pF6W5JOPBK6kOXTB58cYrw | JG1Gd2mN2Qk7UpCqAUI-BQ | 5 | 2013-03-14 | LOVE THIS PIZZA! This is now one of my favorite pizza places in phoenix area. My husband i walked into this cute family owned business and were greet | 0 | 0 | 0 |
| Giant Hamburgers | QBddRcflAcXwE2qhsLVv7w | T90ybanuLhAr0_s99GDeeg | 3 | 2009-03-27 | ok, so I tried this place out based on other reviews. I had the cheeseburger with everything, some fries, and a chocolate shake. The burger was okay, | 0 | 1 | 1 |
| Tammie Coe Cakes | Y8UMm_Ng9oEpJbIygoGbZQ | MWt24-6bfv_OHLKhwMQ0Tw | 3 | 2008-08-25 | Overrated. The treats are tasty but certainly not the best I have ever had. I would have rated this a two star but the cakes and cookies are REALLY pr | 1 | 3 | 2 |
Find the `review_text` that received the highest number of `cool` votes. Output the business name along with the review text with the highest number of `cool` votes.
Key Takeaways
- Protect division:
.replace(0, float("nan"))orif total > 0. - Replace loops with
.transform()or groupby-then-merge. - Use intermediate variables for 3+ steps; one-liners for simple lookups.
- Every variable is a checkpoint for debugging.
What’s Next
You can now break complex analyses into clean, debuggable steps. Next: custom logic with .apply() — for when built-in methods can’t do what you need.