Module 3: Combining DataFrames5 min

Outer Merge

Progress Tracking

Log in to save this lesson and continue from where you left off.

Log in

An outer merge keeps everything from both sides. Unmatched rows get NaN where the other side has no data. This is useful for gap analysis: “which products have reviews but no sales, and which have sales but no reviews?” In practice, outer merges are rarer than inner or left — but when you need them, nothing else works.

Keeping Everything from Both Sides

how="outer" is like running left and right merge simultaneously. You get all rows from both DataFrames: matches where they exist, NaN where they don’t.

indicator=True makes outer merge especially powerful. The _merge column instantly tells you: both (matched), left_only (customer with no orders), right_only (order with no valid customer).

Interview Pattern

“Find all unmatched records from both tables.” Use outer merge with indicator=True, then filter for left_only and right_only.

Find Unmatched Records

Table: customers
idfirst_namelast_namecityaddressphone_number
8JohnJosephSan Francisco928-386-8164
7JillMichaelAustin813-297-0692
4WilliamDanielDenver813-368-1200
5HenryJacksonMiami808-601-7513
13EmmaIsaacMiami808-690-5201
Table: orders
idcust_idorder_dateorder_detailstotal_order_cost
132019-03-04Coat100
232019-03-01Shoes80
332019-03-07Skirt30
472019-02-01Coat25
572019-03-10Shoes80
1
Find Unmatched Records

Outer merge customers with orders using the indicator, then filter to rows that exist in only one table.

Tables: customers, orders

Key Takeaways

  • how="outer" returns all rows from both DataFrames.
  • indicator=True labels each row as both, left_only, or right_only.
  • Use it for data reconciliation and gap analysis.

What’s Next

All the merges so far require matching keys. But what if you want every possible combination? That’s cross merge.