Date Arithmetic and Truncation
Progress Tracking
Log in to save this lesson and continue from where you left off.
Calculating with Dates
Extracting parts is just the beginning. Real analytics requires calculating differences (“days since last purchase”), adding intervals (“30 days from now”), and grouping by periods (“weekly revenue”). These operations power cohort analysis, retention metrics, and every time-series report you’ll ever build.
Dialect Comparison: Date Arithmetic
This is where dialects diverge dramatically. Know your target:
- Add 7 days:
- PostgreSQL uses
d + INTERVAL '7 days' - MySQL uses
DATE_ADD(d, INTERVAL 7 DAY) - MSSQL uses
DATEADD(day, 7, d) - Oracle uses
d + 7
- PostgreSQL uses
- Add 1 month:
- PostgreSQL uses
d + INTERVAL '1 month' - MySQL uses
DATE_ADD(d, INTERVAL 1 MONTH) - MSSQL uses
DATEADD(month, 1, d) - Oracle uses
ADD_MONTHS(d, 1)
- PostgreSQL uses
- Subtract 30 days:
- PostgreSQL uses
d - INTERVAL '30 days' - MySQL uses
DATE_SUB(d, INTERVAL 30 DAY) - MSSQL uses
DATEADD(day, -30, d) - Oracle uses
d - 30
- PostgreSQL uses
- Add 1 year:
- PostgreSQL uses
d + INTERVAL '1 year' - MySQL uses
DATE_ADD(d, INTERVAL 1 YEAR) - MSSQL uses
DATEADD(year, 1, d) - Oracle uses
ADD_MONTHS(d, 12)
- PostgreSQL uses
Oracle treats dates as numbers where 1 = 1 day. So d + 7 adds 7 days. Elegant for days, but you need ADD_MONTHS for months since they vary in length.
Calculating Date Differences
Questions like “How many days since last purchase?” and “What’s the average time between orders?” come up constantly. PostgreSQL makes it simple:
-- PostgreSQL: subtract dates directly
SELECT
order_date,
CURRENT_DATE - order_date AS days_ago
FROM orders;
-- Days between signup and first order
SELECT
u.id,
MIN(o.order_date) - u.joining_date AS days_to_first_order
FROM users u
JOIN orders o
ON u.id = o.user_id
GROUP BY u.id, u.joining_date;Dialect Comparison: Date Differences
- Days between dates:
- PostgreSQL uses
d1 - d2 - MySQL uses
DATEDIFF(d1, d2) - MSSQL uses
DATEDIFF(day, d2, d1) - Oracle uses
d1 - d2
- PostgreSQL uses
- Months between:
- PostgreSQL uses
AGE(d1, d2) - MySQL uses
TIMESTAMPDIFF(MONTH, d2, d1) - MSSQL uses
DATEDIFF(month, d2, d1) - Oracle uses
MONTHS_BETWEEN(d1, d2)
- PostgreSQL uses
MySQL’s DATEDIFF(later, earlier) returns positive. MSSQL’s DATEDIFF(unit, earlier, later) also returns positive. Mixing these up is a classic bug — test with known dates.
Calculate how many days each employee has worked at the company. Subtract their joining_date from today.
Truncating Dates with DATE_TRUNC
DATE_TRUNC is essential for time-series analysis. It rounds a date down to the start of a period — the first day of the month, the Monday of that week, the first of the quarter.
-- Truncate to month start
SELECT
DATE_TRUNC('month', order_date) AS month_start
FROM orders;
-- Example:
-- Input: 2024-03-15
-- Output: 2024-03-01 (first of the month)Common truncation periods:
- ‘day’ — Start of day (strips the time component from timestamps)
- ‘week’ — Start of week (Monday in PostgreSQL by default)
- ‘month’ — First day of month
- ‘quarter’ — First day of quarter (Jan 1, Apr 1, Jul 1, Oct 1)
- ‘year’ — First day of year
Grouping by Time Period
DATE_TRUNC’s real power is in GROUP BY for time-series aggregation:
Group employees by the month they joined. Truncate joining_date to month start, count hires per month.
Dialect Comparison: Date Truncation
- Truncate to month:
- PostgreSQL uses
DATE_TRUNC('month', d) - MySQL uses
DATE_FORMAT(d, '%Y-%m-01') - MSSQL uses
DATETRUNC(month, d) - Oracle uses
TRUNC(d, 'MM')
- PostgreSQL uses
- Truncate to week:
- PostgreSQL uses
DATE_TRUNC('week', d) - MySQL uses
DATE(d - INTERVAL WEEKDAY(d) DAY) - MSSQL uses
DATETRUNC(week, d) - Oracle uses
TRUNC(d, 'IW')
- PostgreSQL uses
- Truncate to year:
- PostgreSQL uses
DATE_TRUNC('year', d) - MySQL uses
DATE_FORMAT(d, '%Y-01-01') - MSSQL uses
DATETRUNC(year, d) - Oracle uses
TRUNC(d, 'YYYY')
- PostgreSQL uses
MySQL doesn’t have DATE_TRUNC — you’ll need DATE_FORMAT workarounds or date arithmetic. MSSQL added DATETRUNC in SQL Server 2022; older versions need DATEADD/DATEDIFF tricks. This is one of the biggest dialect gaps.
Number of Shipments Per Month
Use DATE_TRUNC('month') or TO_CHAR(date, 'YYYY-MM') to group shipments by month, then COUNT.
| shipment_id | sub_id | weight | shipment_date |
|---|---|---|---|
| 101 | 1 | 10 | 2021-08-30 |
| 101 | 2 | 20 | 2021-09-01 |
| 101 | 3 | 10 | 2021-09-05 |
| 102 | 1 | 50 | 2021-09-02 |
| 103 | 1 | 25 | 2021-09-01 |
Write a query that will calculate the number of shipments per month. The unique key for one shipment is a combination of `shipment_id` and `sub_id`. Output the `year_month` in format `YYYY-MM` and the number of shipments in that month.
Combining Date Functions
Real analysis chains multiple operations. Here’s a pattern for calculating average session time:
-- Step 1: Calculate duration per session
-- Step 2: Average by user
WITH session_durations AS (
SELECT
user_id,
session_id,
MAX(timestamp) - MIN(timestamp) AS duration
FROM page_loads
GROUP BY user_id, session_id
)
SELECT
user_id,
AVG(duration) AS avg_session_time
FROM session_durations
GROUP BY user_id;Users By Average Session Time
Calculate the time difference between page loads per session, then average by user. A CTE makes this cleaner.
| user_id | timestamp | action |
|---|---|---|
| 0 | 2019-04-25 13:30:15 | page_load |
| 0 | 2019-04-25 13:30:18 | page_load |
| 0 | 2019-04-25 13:30:40 | scroll_down |
| 0 | 2019-04-25 13:30:45 | scroll_up |
| 0 | 2019-04-25 13:31:10 | scroll_down |
Calculate each user's average session time, where a session is defined as the time difference between a `page_load` and a `page_exit`. Assume each user has only one session per day. If there are multiple `page_load` or `page_exit` events on the same day, use only the latest `page_load` and the earliest `page_exit`. Only consider sessions where the `page_load` occurs before the `page_exit` on the same day. Output the `user_id` and their average session time.
Filtering by Date Range
Combine date arithmetic with filtering for dynamic ranges:
-- Orders from the last 30 days
SELECT *
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days';
-- This month's data (dynamic, works any day)
SELECT *
FROM events
WHERE event_date >= DATE_TRUNC('month', CURRENT_DATE);
-- Last complete month (useful for monthly reports)
SELECT *
FROM events
WHERE event_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
AND event_date < DATE_TRUNC('month', CURRENT_DATE);For reports, prefer “last complete month” over “last 30 days.” The numbers are reproducible and don’t shift daily.
Key Takeaways
- Use
INTERVALto add or subtract time from dates - Date subtraction gives you the difference in days (PostgreSQL) or use
DATEDIFF DATE_TRUNCrounds dates to period starts — essential for time-series grouping- Combine
DATE_TRUNCwithGROUP BYfor weekly/monthly/quarterly aggregations - Date arithmetic syntax varies dramatically by dialect — this is a major gap to watch
What’s Next
You’ve mastered date manipulation. Next, we’ll switch to strings: concatenation, extraction, trimming whitespace, and all the text cleanup operations you’ll need for messy real-world data.