Module 1: DataFrame Fundamentals30 min

Sorting and Limiting Results

Progress Tracking

Log in to save this lesson and continue from where you left off.

Log in

"Show me the top 5" and "who was hired most recently" — these are among the most common requests in analytics. Sorting and limiting are the tools behind every leaderboard, every "top N" report, and every "most recent" query. They’re simple individually, but the real power is in chaining them: filter first, then sort, then take the top N.

Sorting with .sort_values()

Python
# Ascending (default) — lowest first
orders.sort_values("total_order_cost")

# Descending — highest first
orders.sort_values("total_order_cost", ascending=False)
Table: techcorp_workforce
idfirst_namelast_namedepartmentsalaryphone_numberjoining_date
1SarahMitchellHR95000555-01012021-03-15
2MichaelChenHR88000555-01022022-06-01
3EmilyRodriguezHR825002021-09-20
4DavidParkHR80000555-01042023-01-10
5LisaThompsonHR650002021-04-05
1
Sort by Salary

Sort `techcorp_workforce` by `salary` from lowest to highest.

Tables: techcorp_workforce
Sorting and the StrataScratch Platform

The StrataScratch platform checks solutions by comparing rows, not their order. Even when a question says 'sort by salary descending,' your answer passes with any row order. Still, sort your output and scan it — it's the easiest way to spot mistakes in your logic.

Sorting Dates

Sorting by date works as long as the column is a proper datetime type — if dates sort oddly, convert with pd.to_datetime() first.

2
Find the Most Recently Hired

Sort employees to show the most recently hired first.

Tables: techcorp_workforce

Sorting by Multiple Columns

Python
orders.sort_values(
    ["cust_id", "total_order_cost"],
    ascending=[True, False]
)
3
Sort by Department Then Salary

Sort `techcorp_workforce` by `department` ascending and `salary` descending.

Tables: techcorp_workforce
Each Column Gets Its Own Sort Direction

Pass a list to ascending matching the column order: ascending=[True, False] means first column A–Z, second column highest-first.

Filtering Then Sorting

Table: orders
idcust_idorder_dateorder_detailstotal_order_cost
132019-03-04Coat100
232019-03-01Shoes80
332019-03-07Skirt30
472019-02-01Coat25
572019-03-10Shoes80
4
Filter and Sort Orders

Find all orders with a total cost over 50 and sort them from most to least expensive.

Tables: orders

Sorting by Computed Values

Table: techcorp_workforce
idfirst_namelast_namedepartmentsalaryphone_numberjoining_date
1SarahMitchellHR95000555-01012021-03-15
2MichaelChenHR88000555-01022022-06-01
3EmilyRodriguezHR825002021-09-20
4DavidParkHR80000555-01042023-01-10
5LisaThompsonHR650002021-04-05
5
Sort by Name Length

Create a column with the length of each employee’s first name, then sort by it (longest first).

Tables: techcorp_workforce
Sort Without Creating a Column

For one-off sorting: df.sort_values(key=lambda c: c.str.len()). But creating a named column is usually clearer.

Limiting Results

Once your data is sorted the way you want, you usually don’t need all of it. .head(n) grabs the first N rows from the sorted result. This is the pandas equivalent of SQL’s LIMIT.

Python
# First 5 rows
df.head(5)

# Last 3 rows
df.tail(3)

# The full pattern: sort then limit
df.sort_values("salary", ascending=False).head(5)

Shortcuts: .nlargest() and .nsmallest()

Python
# These are equivalent:
df.sort_values("salary", ascending=False).head(5)
df.nlargest(5, "salary")

# And these:
df.sort_values("salary").head(3)
df.nsmallest(3, "salary")

Bottom-N with .nsmallest()

Table: techcorp_workforce
idfirst_namelast_namedepartmentsalaryphone_numberjoining_date
1SarahMitchellHR95000555-01012021-03-15
2MichaelChenHR88000555-01022022-06-01
3EmilyRodriguezHR825002021-09-20
4DavidParkHR80000555-01042023-01-10
5LisaThompsonHR650002021-04-05
6
Find the Lowest Earners

Find the 3 lowest-paid employees.

Tables: techcorp_workforce
When to Use .nlargest() vs. sort + head

.nlargest() is faster on large DataFrames (partial sort). Use sort_values().head() when you need a specific sort order or multiple columns.

Top N from Another Table

Table: orders
idcust_idorder_dateorder_detailstotal_order_cost
132019-03-04Coat100
232019-03-01Shoes80
332019-03-07Skirt30
472019-02-01Coat25
572019-03-10Shoes80
7
Top 3 Most Expensive Orders

Find the 3 most expensive orders.

Tables: orders

Combining Filter and Top N

8
Top Orders for a Customer

Find the 3 highest-value orders placed by customer 15.

Tables: orders

The Full Pattern

In practice, you almost never sort alone. The real pattern is a chain: filter to the rows you care about, sort by the column that matters, limit to the top N, and select only the columns you need. This chain is the backbone of exploratory analysis.

Python
# Filter → Sort → Limit → Select columns
(
    df[df["department"] == "Engineering"]
    .sort_values("joining_date", ascending=False)
    .head(5)
    [["first_name", "last_name", "joining_date"]]
)

The Full Pipeline

Table: techcorp_workforce
idfirst_namelast_namedepartmentsalaryphone_numberjoining_date
1SarahMitchellHR95000555-01012021-03-15
2MichaelChenHR88000555-01022022-06-01
3EmilyRodriguezHR825002021-09-20
4DavidParkHR80000555-01042023-01-10
5LisaThompsonHR650002021-04-05
9
Putting It All Together

Find the 5 most recently hired employees in the Engineering department. Show their first name, last name, and joining date.

Tables: techcorp_workforce

Key Takeaways

  • .sort_values("col") sorts ascending; ascending=False for descending.
  • Pass lists for multi-column sorting with independent directions.
  • .head(n) limits to first N rows; .tail(n) for last N.
  • .nlargest(n, col) / .nsmallest(n, col) — faster shortcuts for top/bottom N.
  • The full pattern: filter → sort → limit → select columns.

What’s Next

You can now select columns, filter rows, combine conditions, handle missing data, sort, and limit. That’s the full toolkit for working with a single DataFrame. Module 2 introduces aggregation: counting, summing, averaging, and grouping — the foundation of every analytical report.