LAG and LEAD
Progress Tracking
Log in to save this lesson and continue from where you left off.
Looking Backward and Forward
- “What’s the month-over-month change?”
- “How does this compare to last period?”
- “Did the customer’s spending go up or down?”
These questions require comparing a row to the row before it (or after it). Without window functions, you’d need to self-join the table, matching each row to its predecessor. It works, but it’s clunky. LAG and LEAD make it clean.
LAG in Action
SELECT
user_id,
CAST(created_at AS DATE) AS tx_date,
LAG(CAST(created_at AS DATE)) OVER (ORDER BY created_at) AS prev_date
FROM amazon_transactions
ORDER BY created_at
FETCH FIRST 15 ROWS ONLY;
SELECT
user_id,
created_at::date AS tx_date,
LAG(created_at::date) OVER (ORDER BY created_at) AS prev_date
FROM amazon_transactions
ORDER BY created_at
LIMIT 15;SELECT
user_id,
created_at::date AS tx_date,
LAG(created_at::date) OVER (ORDER BY created_at) AS prev_date
FROM amazon_transactions
ORDER BY created_at
LIMIT 15;The first row has NULL for prev_date because there’s no row before it. That’s expected, and you’ll need to handle it in your calculations.
The Full LAG Syntax
LAG(column, offset, default) OVER (ORDER BY …)
-- column: what value to grab
-- offset: how many rows back (default is 1)
-- default: what to return if there’s no previous row (default is NULL)So LAG(revenue) is shorthand for LAG(revenue, 1, NULL). Looking back 1 row, returning NULL if there’s nothing there.
LAG(revenue) – Previous row’s revenue, NULL if none
LAG(revenue, 1) – Same as above
LAG(revenue, 3) – 3 rows back
LAG(revenue, 1, 0) – Previous row, default to 0 instead of NULLLEAD: Getting the Next Row’s Value
LEAD() is the mirror image of LAG. It looks forward instead of backward.
SELECT
month,
revenue,
LEAD(revenue) OVER (ORDER BY month) AS next_month
FROM monthly_sales;monthJan,revenue100,next_month150monthFeb,revenue150,next_month200monthMar,revenue200,next_month180monthApr,revenue180,next_monthNULL
Now April has NULL because there’s no row after it.
LEAD is less common than LAG in practice. Most business questions are backward-looking: “compared to last month” rather than “compared to next month.” But it’s there when you need it.
Calculating Period-Over-Period Change
This is the bread and butter of LAG. Once you have the previous value, you can calculate change.
Absolute Change
SELECT
month,
revenue,
revenue - LAG(revenue) OVER (ORDER BY month) AS change
FROM monthly_sales;Period-Over-Period Change
WITH user_purchases AS (
SELECT
user_id,
CAST(created_at AS DATE) AS tx_date,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at) AS purchase_num
FROM amazon_transactions
)
SELECT
user_id,
tx_date,
purchase_num,
LAG(tx_date) OVER (PARTITION BY user_id ORDER BY tx_date) AS prev_date,
tx_date - LAG(tx_date) OVER (PARTITION BY user_id ORDER BY tx_date) AS days_between
FROM user_purchases
ORDER BY user_id, tx_date;
WITH user_purchases AS (
SELECT
user_id,
created_at::date AS tx_date,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at) AS purchase_num
FROM amazon_transactions
)WITH user_purchases AS (
SELECT
user_id,
created_at::date AS tx_date,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at) AS purchase_num
FROM amazon_transactions
)
SELECT
user_id,
tx_date,
purchase_num,
LAG(tx_date) OVER (PARTITION BY user_id ORDER BY tx_date) AS prev_date,
tx_date - LAG(tx_date) OVER (PARTITION BY user_id ORDER BY tx_date) AS days_between
FROM user_purchases
ORDER BY user_id, tx_date;That first NULL is going to show up in your output. Decide upfront whether to filter it out, replace it with 0, or leave it as NULL.
LAG with PARTITION BY
Add PARTITION BY when you need to compare within groups:
SELECT
region,
month,
revenue,
revenue - LAG(revenue) OVER (
PARTITION BY region
ORDER BY month
) AS mom_change
FROM regional_sales;Now each region has its own LAG sequence. East compares to East. West compares to West. You won’t accidentally compare East’s March to West’s February.
Adding Days Between Purchases
SELECT
user_id,
CAST(created_at AS DATE) AS tx_date,
LAG(CAST(created_at AS DATE)) OVER (
PARTITION BY user_id
ORDER BY created_at
) AS prev_tx_date
FROM amazon_transactions
ORDER BY user_id, created_at;
SELECT
user_id,
created_at::date AS tx_date,
LAG(created_at::date) OVER (
PARTITION BY user_id
ORDER BY created_at
) AS prev_tx_date
FROM amazon_transactions
ORDER BY user_id, created_at;Looking Further Back
The offset parameter lets you compare to any previous period:
-- Same month last year
LAG(revenue, 12) OVER (ORDER BY month) AS same_month_ly
-- Same quarter last year
LAG(revenue, 4) OVER (ORDER BY quarter) AS same_quarter_lyYear-over-year comparisons are more meaningful than month-over-month for seasonal businesses. A swimsuit company’s March-to-February comparison doesn’t tell you much, but March-to-March does.
Handling NULLs
The first row (or first N rows if offset > 1) will have NULL. Here are your options:
-- Option 1: Filter them out
WHERE LAG(revenue) OVER (ORDER BY month) IS NOT NULL
-- Option 2: Use a default value
LAG(revenue, 1, 0) OVER (ORDER BY month)
-- Option 3: COALESCE for calculated fields
COALESCE(revenue - LAG(revenue) OVER (…), 0) AS changeIf LAG returns 0 or NULL, you'll get a division error. Use NULLIF to handle divide-by-zero: revenue / NULLIF(LAG(revenue) OVER (...), 0).
Finding Purchases
Use LAG() with PARTITION BY to calculate time between purchases for each user.
| id | user_id | item | created_at | revenue |
|---|---|---|---|---|
| 1 | 109 | milk | 2020-03-03 | 123 |
| 2 | 139 | biscuit | 2020-03-18 | 421 |
| 3 | 120 | milk | 2020-03-18 | 176 |
| 4 | 108 | banana | 2020-03-18 | 862 |
| 5 | 130 | milk | 2020-03-28 | 333 |
Identify returning active users by finding users who made a repeat purchase within 7 days or less of their previous transaction, excluding same-day purchases. Output a list of these `user_id`.
Calculate month-over-month percentage change. Watch out for that first NULL row.
| id | created_at | value | purchase_id |
|---|---|---|---|
| 1 | 2019-01-01 | 172692 | 43 |
| 2 | 2019-01-05 | 177194 | 36 |
| 3 | 2019-01-09 | 109513 | 30 |
| 4 | 2019-01-13 | 164911 | 30 |
| 5 | 2019-01-17 | 198872 | 39 |
Given a table of purchases by date, calculate the month-over-month percentage change in revenue. The output should include the year-month date (YYYY-MM) and percentage change, rounded to the 2nd decimal point, and sorted from the beginning of the year to the end of the year. The percentage change column will be populated from the 2nd month forward and can be calculated as ((this month's revenue - last month's revenue) / last month's revenue)*100.
Key Takeaways
LAG()gets the previous row’s value;LEAD()gets the next row’s- The offset parameter (default 1) controls how far back/forward to look
- The default parameter (default
NULL) controls what to return at edges PARTITION BYcreates separate sequences for each group- Handle
NULLsexplicitly. Don’t let them surprise you.
What’s Next
You’ve learned ranking and offset functions. Now let’s see how regular aggregates like SUM and AVG work as window functions. Running totals and moving averages are incredibly useful, and the syntax is simpler than you might think.