Cross Merge
A cross merge pairs every row from the left with every row from the right. If left has 5 rows and right has 10, you get 50 rows. This sounds useless until you need it: generating all possible product-store combinations for inventory planning, creating a grid of dates and categories for a report template, or building a comparison matrix. It’s rare but powerful.
small = transportation_numbers[transportation_numbers.index < 3]
pd.merge(small, small, how="cross")
With 3 rows on each side, you get 3 × 3 = 9 rows.
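For a version that runs on its own, here is a minimal sketch using hypothetical `products` and `stores` frames in place of the lesson's `transportation_numbers`. It assumes pandas 1.2 or later, where `how="cross"` was introduced.

```python
import pandas as pd

# Hypothetical stand-in data, not the lesson's dataset.
products = pd.DataFrame({"product": ["apple", "bread", "milk"]})
stores = pd.DataFrame({"store": ["north", "south"]})

# Every product is paired with every store: 3 x 2 = 6 rows.
combos = pd.merge(products, stores, how="cross")
print(combos)
print(combos.shape)  # (6, 2)
```

No `on=` key is needed (pandas rejects one for a cross merge), because the rows are paired by position in the product rather than by matching values.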
how="cross" can destroy your performance: 1,000 rows × 1,000 rows = 1 million rows. Only use it on small DataFrames, or follow it with a filter, keeping in mind that the full product is still built in memory before the filter runs.
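One way to guard against an accidental blow-up is to check the product of the row counts before merging. The sketch below assumes a hypothetical helper, `safe_cross`, with an arbitrary `max_rows` limit; neither is part of pandas.

```python
import pandas as pd

def safe_cross(left, right, max_rows=1_000_000):
    """Hypothetical helper: refuse a cross merge whose result would exceed max_rows."""
    expected = len(left) * len(right)  # result size is always rows(left) * rows(right)
    if expected > max_rows:
        raise ValueError(
            f"cross merge would produce {expected:,} rows (limit {max_rows:,})"
        )
    return pd.merge(left, right, how="cross")

left = pd.DataFrame({"x": range(1_000)})
right = pd.DataFrame({"y": range(1_000)})

try:
    safe_cross(left, right, max_rows=100_000)
except ValueError as err:
    print(err)  # the merge is never attempted
```

The check costs nothing because the result size is known from the input lengths alone.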
Legitimate Use Cases
Cross merge is rare but useful for:
- Date scaffolds: every date × every product (then left merge actual sales)
- Attribute combinations: every size × every color for SKU generation
- Comparison matrices: every pair of items for similarity scoring
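The date scaffold pattern can be sketched as follows. The `dates`, `products`, and `sales` frames and their column names are hypothetical, made up for illustration.

```python
import pandas as pd

dates = pd.DataFrame({"date": pd.date_range("2024-01-01", periods=3, freq="D")})
products = pd.DataFrame({"product": ["A", "B"]})

# Hypothetical sales: only two of the six date/product pairs had any activity.
sales = pd.DataFrame({
    "date": dates["date"].iloc[[0, 2]].to_numpy(),  # Jan 1 and Jan 3
    "product": ["A", "B"],
    "units": [5, 2],
})

# Step 1: build the scaffold, every date x every product (3 x 2 = 6 rows).
scaffold = pd.merge(dates, products, how="cross")

# Step 2: left merge the actual sales so pairs with no sales are kept.
report = scaffold.merge(sales, on=["date", "product"], how="left")

# Step 3: missing sales come through as NaN; treat them as zero units.
report["units"] = report["units"].fillna(0).astype(int)
print(report)
```

Without the scaffold, days with no sales would simply be absent from the report instead of showing up as zeros.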
Maximum of Two Numbers
| number |
|---|
| -2 |
| -1 |
| 0 |
| 1 |
| 2 |
Given a single column of numbers, consider all possible permutations of two numbers with replacement, assuming that pairs of numbers (x,y) and (y,x) are two different permutations. Then, for each permutation, find the maximum of the two numbers. Output three columns: the first number, the second number and the maximum of the two.
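One possible approach, sketched under the assumption that the table is loaded into a DataFrame called `numbers`. The frame name, the suffixes `_1` and `_2`, and the output column name `maximum` are illustrative choices, not requirements of the exercise.

```python
import pandas as pd

numbers = pd.DataFrame({"number": [-2, -1, 0, 1, 2]})

# Cross merge the column with itself: 5 x 5 = 25 ordered pairs,
# so (x, y) and (y, x) appear as separate rows and (x, x) is included.
pairs = pd.merge(numbers, numbers, how="cross", suffixes=("_1", "_2"))

# Row-wise maximum of the two numbers.
pairs["maximum"] = pairs[["number_1", "number_2"]].max(axis=1)
print(pairs)
```

Because both sides have a column named `number`, pandas needs suffixes to tell them apart. The defaults are `_x` and `_y`.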
Key Takeaways
- pd.merge(df_a, df_b, how="cross") produces all combinations.
- Result size = rows in A × rows in B.
- Use for scaffolds and attribute combinations, never on large DataFrames without filtering.
What’s Next
Sometimes you need to merge a DataFrame with itself. Finding employee-manager pairs, comparing rows within the same dataset — that’s the self-merge.