Window Frames and Advanced Patterns
Progress Tracking
Log in to save this lesson and continue from where you left off.
Fine-Tuning Which Rows Get Included
So far, we’ve let SQL use default behavior for running calculations. But what if you need exactly the last 3 rows? Or a centered moving average? Or everything except the current row? Window frames give you that control.
This is also where interviewers test whether you really understand window functions or just memorized some patterns.
The Default Frame (This Trips People Up)
When you write:
SUM(amount) OVER (ORDER BY date)SQL actually interprets it as:
SUM(amount) OVER (
ORDER BY date
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)That means: “sum all rows from the start up to the current row.” Which is why you get a running total.
Knowing about default frame behavior shows you truly understand window functions. Many candidates don't know this — it's a way to stand out.
Window Frame Syntax
The full syntax:
function() OVER (
[PARTITION BY columns]
[ORDER BY columns]
[ROWS | RANGE] BETWEEN start AND end
)Start and end can be:
UNBOUNDED PRECEDING: the first row of the partitionn PRECEDING: n rows before currentCURRENT ROW: this rown FOLLOWING: n rows after currentUNBOUNDED FOLLOWING: the last row of the partition
ROWS vs RANGE
ROWS counts physical rows. RANGE includes all rows with the same ORDER BY value.
In practice, ROWS is clearer and more predictable. Use RANGE when you specifically need to include ties.
-- ROWS: exactly 2 rows before current
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
-- RANGE: all rows with values within 2 of current
RANGE BETWEEN 2 PRECEDING AND CURRENT ROWIf you have dates with multiple rows per date, RANGE may return more rows than you expect. ROWS is safer unless you have a specific reason.
Common Frame Patterns
Running Total (All Previous Rows)
SUM(amount) OVER (
ORDER BY date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) -- This is the default, but explicit is clearera Moving Average
SELECT
invoicedate,
description,
unitprice,
AVG(unitprice) OVER (
ORDER BY invoicedate
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS moving_avg_3
FROM online_retail
WHERE quantity > 0
ORDER BY invoicedate
FETCH FIRST 20 ROWS ONLY;
SELECT
invoicedate,
description,
unitprice,
AVG(unitprice) OVER (
ORDER BY invoicedate
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS moving_avg_3
FROM online_retail
WHERE quantity > 0
ORDER BY invoicedate
LIMIT 20;SELECT
invoicedate,
description,
unitprice,
AVG(unitprice) OVER (
ORDER BY invoicedate
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS moving_avg_3
FROM online_retail
WHERE quantity > 0
ORDER BY invoicedate
LIMIT 20;Centered Moving Average
AVG(amount) OVER (
ORDER BY date
ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING
) -- 5-row average centered on currentEntire Partition (No Frame)
AVG(salary) OVER (PARTITION BY department) -- No ORDER BY, no frame = entire partitionSELECT
first_name,
department,
salary,
FIRST_VALUE(first_name) OVER (
PARTITION BY department
ORDER BY salary DESC
) AS top_earner
FROM employee;FIRST_VALUE gets the first row in the ordered window. In this case, the highest-paid person in each department.
With the default frame, LAST_VALUE just returns the current row (because the frame ends at current row). You almost always need ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING with LAST_VALUE.
The Consecutive Days Pattern
This one comes up in interviews a lot: “Find users with 7+ consecutive days of activity.” It’s harder than it looks.
The trick: for consecutive dates, (date - row_number) gives the same value. Non-consecutive dates give different values.
WITH numbered AS (
SELECT
user_id,
login_date,
login_date - (ROW_NUMBER() OVER (
PARTITION BY user_id
ORDER BY login_date
))::int AS streak_group
FROM user_logins
)
SELECT
user_id,
MIN(login_date) AS streak_start,
MAX(login_date) AS streak_end,
COUNT() AS streak_length
FROM numbered
GROUP BY user_id, streak_group
HAVING COUNT() >= 7;If a user logged in on Jan 1, 2, 3, then skipped Jan 4, then logged in Jan 5, 6:
- Jan 1 (row 1): Jan 1 - 1 = Dec 31 → streak_group = Dec 31
- Jan 2 (row 2): Jan 2 - 2 = Dec 31 → same streak_group!
- Jan 3 (row 3): Jan 3 - 3 = Dec 31 → same streak_group!
- Jan 5 (row 4): Jan 5 - 4 = Jan 1 → different streak_group
- Jan 6 (row 5): Jan 6 - 5 = Jan 1 → same streak_group as Jan 5
The consecutive days have the same streak_group. The break creates a new one.
The consecutive streak pattern is elegant and non-obvious. Practice it until it feels natural.
Interview Pattern Cheat Sheet
- Top N per group:
RANK/ROW_NUMBER+PARTITION BY, filter in outer query - Running total:
SUM() OVER (ORDER BY …) - Period-over-period:
LAG()for previous value, calculate difference - Moving average:
AVG() OVER (ROWS BETWEEN n PRECEDING AND CURRENT ROW) - Percentile:
NTILE(100)orPERCENT_RANK() - Consecutive streaks:
ROW_NUMBER()+ date arithmetic
Advanced Patterns
Combine DENSE_RANK() with multiple CTEs to compare rankings across months.
| user_id | created_at | number_of_comments |
|---|---|---|
| 18 | 2019-12-29 | 1 |
| 25 | 2019-12-21 | 1 |
| 78 | 2020-01-04 | 1 |
| 37 | 2020-02-01 | 1 |
| 41 | 2019-12-23 | 1 |
| user_id | name | status | country |
|---|---|---|---|
| 33 | Amanda Leon | open | Australia |
| 27 | Jessica Farrell | open | Luxembourg |
| 18 | Wanda Ramirez | open | USA |
| 50 | Samuel Miller | closed | Brazil |
| 16 | Jacob York | open | Australia |
Compare the total number of comments made by users in each country during December 2019 and January 2020. For each month, rank countries by their total number of comments in descending order. Countries with the same total should share the same rank, and the next rank should increase by one (without skipping numbers). Return the names of the countries whose rank improved from December to January (that is, their rank number became smaller).
Top Items per Category
Use RANK() with PARTITION BY to find top items in each category.
| invoiceno | stockcode | description | quantity | invoicedate | unitprice | customerid | country |
|---|---|---|---|---|---|---|---|
| 544586 | 21890 | S/6 WOODEN SKITTLES IN COTTON BAG | 3 | 2011-02-21 | 2.95 | 17338 | United Kingdom |
| 541104 | 84509G | SET OF 4 FAIRY CAKE PLACEMATS | 3 | 2011-01-13 | 3.29 | United Kingdom | |
| 560772 | 22499 | WOODEN UNION JACK BUNTING | 3 | 2011-07-20 | 4.96 | United Kingdom | |
| 555150 | 22488 | NATURAL SLATE RECTANGLE CHALKBOARD | 5 | 2011-05-31 | 3.29 | United Kingdom | |
| 570521 | 21625 | VINTAGE UNION JACK APRON | 3 | 2011-10-11 | 6.95 | 12371 | Switzerland |
Find the best-selling item for each month (no need to separate months by year). The best-selling item is determined by the highest total sales amount, calculated as: `total_paid = unitprice * quantity`. A negative `quantity` indicates a return or cancellation (the invoice number begins with `'C'`. To calculate sales, ignore returns and cancellations. Output the month, description of the item, and the total amount paid.
Consecutive Streaks
Apply the row_number + date difference technique. This is the streak pattern.
| record_date | account_id | user_id |
|---|---|---|
| 2021-01-01 | A1 | U1 |
| 2021-01-01 | A1 | U2 |
| 2021-01-06 | A1 | U3 |
| 2021-01-02 | A1 | U1 |
| 2020-12-24 | A1 | U2 |
Find all the users who were active for 3 consecutive days or more.
Running Total vs Moving Average
SELECT
invoicedate,
unitprice,
SUM(unitprice) OVER (ORDER BY invoicedate) AS running_total,
AVG(unitprice) OVER (
ORDER BY invoicedate
ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
) AS moving_avg_5
FROM online_retail
WHERE quantity > 0
ORDER BY invoicedate
FETCH FIRST 25 ROWS ONLY;
SELECT
invoicedate,
unitprice,
SUM(unitprice) OVER (ORDER BY invoicedate) AS running_total,
AVG(unitprice) OVER (
ORDER BY invoicedate
ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
) AS moving_avg_5
FROM online_retail
WHERE quantity > 0
ORDER BY invoicedate
LIMIT 25;SELECT
invoicedate,
unitprice,
SUM(unitprice) OVER (ORDER BY invoicedate) AS running_total,
AVG(unitprice) OVER (
ORDER BY invoicedate
ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
) AS moving_avg_5
FROM online_retail
WHERE quantity > 0
ORDER BY invoicedate
LIMIT 25;Key Takeaways
- Window frames control exactly which rows are included
ROWScounts physical rows;RANGEincludes ties- Default frame is
RANGE BETWEENUNBOUNDED PRECEDINGANDCURRENT ROW LAST_VALUEneeds explicitUNBOUNDED FOLLOWINGto see whole partition- Consecutive streak detection uses
ROW_NUMBER+ date arithmetic - These patterns come up regularly in interviews and daily analytics work.
Your learning journey starts here
Complete lessons to track your progress through the path.
What You Can Do Now
- Filter, sort, group, and aggregate data across any table structure
- Write joins, subqueries, and CTEs to answer multi-table questions
- Manipulate dates, strings, and types across PostgreSQL, MySQL, SQL Server, and Oracle
- Build ranked leaderboards, running totals, and period-over-period comparisons
- Use window frames, LAG/LEAD, and moving averages for advanced analytics
- Detect consecutive streaks and deduplicate records with window patterns
Where to Go from Here
Practice is what turns knowledge into fluency. StrataScratch has hundreds of SQL questions from real company interviews — start with the ones tagged at your level and work up. If you haven’t already, try the Python learning path as well. Most data roles expect both, and the concepts map closely: GROUP BY is groupby, JOIN is merge, window functions are transform and rank. Knowing both makes you faster in each.
The best next step is a real project. StrataScratch Data Projects give you guided, end-to-end analyses on real datasets — pick one that interests you and put your skills to work. That’s where learning becomes craft.