Module 1: DataFrame Fundamentals•50 min

Combining Conditions

Progress Tracking

Multiple Conditions

A single boolean filter gets you pretty far, but real questions usually need more:

Find employees in HR who earn over 80,000
Show me orders from March or April
Get everyone except the Admin department

That’s where &, |, and ~ come in. They let you combine multiple conditions into one filter.

& (AND): Both Conditions Must Be True

Python

orders[
    (orders["order_details"] == "Coat") &
    (orders["total_order_cost"] > 100)
]

Table: techcorp_workforce

id	first_name	last_name	department	salary	phone_number	joining_date
1	Sarah	Mitchell	HR	95000	555-0101	2021-03-15
2	Michael	Chen	HR	88000	555-0102	2022-06-01
3	Emily	Rodriguez	HR	82500		2021-09-20
4	David	Park	HR	80000	555-0104	2023-01-10
5	Lisa	Thompson	HR	65000		2021-04-05

Tables: techcorp_workforce

Both conditions must be true for a row to be included.

Parentheses Are Required

In pandas, each condition must be wrapped in parentheses when combining with & or |. Without them, Python’s operator precedence causes errors. This is the most common mistake with combined filters.

Chaining Multiple Conditions

You can chain as many & conditions as you need:

Tables: techcorp_workforce

Each & adds another filter, narrowing your results.

Churro Activity Date

Table: los_angeles_restaurant_health_inspections

serial_number	activity_date	facility_name	score	grade	service_code	service_description	employee_id	facility_address	facility_city	facility_id	facility_state	facility_zip	owner_id	owner_name	pe_description	program_element_pe	program_name	program_status	record_id
DAQHRSETQ	2017-06-08	MARGARITAS CAFE	93	A	1	ROUTINE INSPECTION	EE0000006	5026 S CRENSHAW BLVD	LOS ANGELES	FA0023656	CA	90043	OW0004133	BAZAN, ASCENCION	RESTAURANT (61-150) SEATS HIGH RISK	1638	MARGARITAS CAFE	ACTIVE	PR0011718
DA2GQRJOS	2017-03-07	LAS MOLENDERAS	97	A	1	ROUTINE INSPECTION	EE0000997	2635 WHITTIER BLVD	LOS ANGELES	FA0160416	CA	90023	OW0125379	MARISOL FEREGRINO	RESTAURANT (0-30) SEATS HIGH RISK	1632	LAS MOLENDERAS	INACTIVE	PR0148504
DAMQTA46T	2016-03-22	SANDRA'S TAMALES	93	A	1	ROUTINE INSPECTION	EE0001049	5390 WHITTIER BLVD	LOS ANGELES	FA0171769	CA	90022-4032	OW0178828	SANDRA'S TAMALES INC.	RESTAURANT (0-30) SEATS MODERATE RISK	1631	SANDRA'S TAMALES	ACTIVE	PR0164225
DAXMBTIRZ	2018-02-12	CAFE GRATITUDE	97	A	1	ROUTINE INSPECTION	EE0000828	639 N LARCHMONT BLVD STE #102	LOS ANGELES	FA0058921	CA	90004	OW0005704	CAFE GRATITUDE LARCHMONT LLC	RESTAURANT (61-150) SEATS HIGH RISK	1638	CAFE GRATITUDE	ACTIVE	PR0019854
DAK8TBMS0	2015-09-10	THE WAFFLE	90	A	1	ROUTINE INSPECTION	EE0000709	6255 W SUNSET BLVD STE #105	LOS ANGELES	FA0051830	CA	90028	OW0035796	THE WAFFLE, LLC	RESTAURANT (61-150) SEATS HIGH RISK	1638	THE WAFFLE	ACTIVE	PR0010922

Tables: los_angeles_restaurant_health_inspections

| (OR): Either Condition Can Be True

Python

orders[
    (orders["total_order_cost"] > 500) |
    (orders["order_details"] == "Coat")
]

Tables: techcorp_workforce

A row only needs to match one condition to be included. HR employees are in, Admin employees are in, Engineering employees are out.

Use .isin() for Multiple Values on the Same Column

When checking the same column for multiple values, .isin() (covered later in this lesson) is often cleaner than chaining | conditions. For example: df[df["department"].isin(["HR", "Admin"])].

Lyft Driver Wages

Table: lyft_drivers

index	start_date	yearly_salary
0	2018-04-02	48303
1	2018-05-30	67973
2	2015-04-05	56685
3	2015-01-08	51320
4	2017-03-09	67507

Tables: lyft_drivers

Matching a List of Values with .isin()

Chaining | for the same column gets ugly fast. .isin() replaces it with a single clean expression.

Instead of chaining multiple == comparisons with |, use .isin():

Python

# Instead of this:
df[(df["dept"] == "HR") | (df["dept"] == "Admin")]

# Do this:
df[df["dept"].isin(["HR", "Admin"])]

Matching Multiple Values

Tables: techcorp_workforce

Python

# Exclude with ~
techcorp_workforce[
    ~techcorp_workforce["department"].isin(["HR", "Admin"])
]

Use .isin() Instead of Chaining |

.isin() is cleaner and faster than chaining multiple | comparisons. It also works with any iterable: lists, sets, or another column’s values.

Athletes Over 40 With Bronze or Silver

Table: olympics_athletes_events

id	name	sex	age	team	noc	games	year	season	city	sport	event
3520	Guillermo J. Amparan	M		Mexico	MEX	1924 Summer	1924	Summer	Paris	Athletics	Athletics Men's 800 metres
35394	Henry John Finchett	M		Great Britain	GBR	1924 Summer	1924	Summer	Paris	Gymnastics	Gymnastics Men's Rings
21918	Georg Frederik Ahrensborg Clausen	M	28	Denmark	DEN	1924 Summer	1924	Summer	Paris	Cycling	Cycling Men's Road Race Individual
110345	Marinus Cornelis Dick Sigmond	M	26	Netherlands	NED	1924 Summer	1924	Summer	Paris	Football	Football Men's Football
54193	Thodore Tho Jeitz	M	26	Luxembourg	LUX	1924 Summer	1924	Summer	Paris	Gymnastics	Gymnastics Men's Individual All-Around

Tables: olympics_athletes_events

Range Filtering with .between()

Python

orders[orders["total_order_cost"].between(50, 200)]

Tables: techcorp_workforce

.between() Is Inclusive

.between(80000, 120000) includes both endpoints. It’s equivalent to (df["salary"] >= 80000) & (df["salary"] <= 120000).

Python

# .between() with dates
techcorp_workforce[techcorp_workforce["joining_date"].between("2022-01-01", "2022-12-31")]

Date Ranges

.between() works with dates too — pass date strings in YYYY-MM-DD format.

Tables: orders

~ (NOT): Exclude Matches

Negation is surprisingly useful. Instead of listing every department you want, sometimes it’s easier to say "everything except Admin." ~ flips True to False and False to True — giving you the opposite of any condition.

~ inverts a boolean condition. It gives you everything that doesn’t match.

Python

# Everyone except Admin
techcorp_workforce[~(techcorp_workforce["department"] == "Admin")]

This is equivalent to using !=:

Python

techcorp_workforce[techcorp_workforce["department"] != "Admin"]

Both work. ~ is especially useful combined with .isin() and .str.contains(), as you’ve already seen.

Negating a Condition

Tables: techcorp_workforce

Combining & and |

Mixing AND and OR is where most bugs happen. The issue is operator precedence: & binds tighter than |, just like multiplication before addition. If you don’t add parentheses, you’ll get silently wrong results — the query runs fine, but the logic isn’t what you intended.

Let’s say you want employees in HR or Engineering, but only those earning over 80,000. You might write:

Python

# This doesn't do what you think!
techcorp_workforce[
    (techcorp_workforce["department"] == "HR") |
    (techcorp_workforce["department"] == "Engineering") &
    (techcorp_workforce["salary"] > 80000)
]

Because & is evaluated first, Python reads this as:

Python

# What Python actually evaluates:
# HR (any salary) OR (Engineering AND salary > 80000)

So you get ALL HR employees (regardless of salary) plus Engineering employees earning over 80,000. Not what you wanted.

Use Parentheses to Control Order

Tables: techcorp_workforce

When in Doubt, Use Parentheses

Parentheses make your intent clear and prevent bugs. Even if they’re not strictly necessary, they help whoever reads your code next — including future you.

High Earners in Support Departments

Table: techcorp_workforce

id	first_name	last_name	department	salary	phone_number	joining_date
1	Sarah	Mitchell	HR	95000	555-0101	2021-03-15
2	Michael	Chen	HR	88000	555-0102	2022-06-01
3	Emily	Rodriguez	HR	82500		2021-09-20
4	David	Park	HR	80000	555-0104	2023-01-10
5	Lisa	Thompson	HR	65000		2021-04-05

Tables: techcorp_workforce

Keeping It Readable

Production code gets read far more than it gets written. A filter that takes 30 seconds to decipher costs time whenever someone revisits it. These two patterns — named variables and line breaks — make your filters self-documenting.

Break conditions into variables

Python

is_hr = techcorp_workforce["department"] == "HR"
is_admin = techcorp_workforce["department"] == "Admin"
high_earner = techcorp_workforce["salary"] > 80000

techcorp_workforce[(is_hr | is_admin) & high_earner]

Naming your conditions makes complex filters much easier to read and debug.

Use backslash for line breaks

Python

techcorp_workforce[
    (techcorp_workforce["department"] == "HR") &
    (techcorp_workforce["salary"] > 80000) &
    (techcorp_workforce["joining_date"] > "2022-01-01")
]

Putting each condition on its own line with consistent indentation keeps things scannable.

Common Mistakes

Missing parentheses around conditions

Python

# Wrong: missing parentheses around conditions
techcorp_workforce[
    techcorp_workforce["department"] == "HR" &
    techcorp_workforce["salary"] > 80000
]

# Correct: each condition wrapped in parentheses
techcorp_workforce[
    (techcorp_workforce["department"] == "HR") &
    (techcorp_workforce["salary"] > 80000)
]

Without parentheses, & binds to "HR" and techcorp_workforce instead of the full conditions. This is the number one cause of errors with combined filters.

Using and/or instead of &/|

Python

# Wrong: Python keywords don't work on Series
techcorp_workforce[
    (techcorp_workforce["department"] == "HR") and
    (techcorp_workforce["salary"] > 80000)
]

# Correct: use & and | for element-wise operations
techcorp_workforce[
    (techcorp_workforce["department"] == "HR") &
    (techcorp_workforce["salary"] > 80000)
]

Python’s and/or keywords work on single True/False values. For filtering DataFrames, you need &/|, which work element-wise across entire columns.

Key Takeaways

& requires all conditions to be true (AND).
| requires at least one condition to be true (OR).
~ inverts a condition (NOT).
Every condition must be wrapped in parentheses when combining with & or |.
& is evaluated before | — use extra parentheses to control the order.
Use & and |, never Python’s and and or, when filtering DataFrames.
.isin([list]) matches multiple values; ~ inverts any filter.
.between(low, high) filters inclusive ranges — works with numbers and dates.

What’s Next

Missing values are one of the most common headaches in data analysis. In the next lesson, you'll learn what NaN really means, why comparisons break, and how to handle it correctly.

Next upContinue →

Working with Missing Data

40 min