Module 5: Date, Time & Text Functions30 min

Date Arithmetic and Truncation

Progress Tracking

Log in to save this lesson and continue from where you left off.

Log in

Calculating with Dates

Extracting parts is just the beginning. Real analytics requires calculating differences (“days since last purchase”), adding intervals (“30 days from now”), and grouping by periods (“weekly revenue”). These operations power cohort analysis, retention metrics, and every time-series report you’ll ever build.

Dialect Comparison: Date Arithmetic

This is where dialects diverge dramatically. Know your target:

  • Add 7 days:
    • PostgreSQL uses d + INTERVAL '7 days'
    • MySQL uses DATE_ADD(d, INTERVAL 7 DAY)
    • MSSQL uses DATEADD(day, 7, d)
    • Oracle uses d + 7
  • Add 1 month:
    • PostgreSQL uses d + INTERVAL '1 month'
    • MySQL uses DATE_ADD(d, INTERVAL 1 MONTH)
    • MSSQL uses DATEADD(month, 1, d)
    • Oracle uses ADD_MONTHS(d, 1)
  • Subtract 30 days:
    • PostgreSQL uses d - INTERVAL '30 days'
    • MySQL uses DATE_SUB(d, INTERVAL 30 DAY)
    • MSSQL uses DATEADD(day, -30, d)
    • Oracle uses d - 30
  • Add 1 year:
    • PostgreSQL uses d + INTERVAL '1 year'
    • MySQL uses DATE_ADD(d, INTERVAL 1 YEAR)
    • MSSQL uses DATEADD(year, 1, d)
    • Oracle uses ADD_MONTHS(d, 12)
Oracle treats dates as numbers

Oracle treats dates as numbers where 1 = 1 day. So d + 7 adds 7 days. Elegant for days, but you need ADD_MONTHS for months since they vary in length.

Calculating Date Differences

Questions like “How many days since last purchase?” and “What’s the average time between orders?” come up constantly. PostgreSQL makes it simple:

PostgreSQL
-- PostgreSQL: subtract dates directly
SELECT
  order_date,
  CURRENT_DATE - order_date AS days_ago
FROM orders;

-- Days between signup and first order
SELECT
  u.id,
  MIN(o.order_date) - u.joining_date AS days_to_first_order
FROM users u
JOIN orders o
  ON u.id = o.user_id
GROUP BY u.id, u.joining_date;

Dialect Comparison: Date Differences

  • Days between dates:
    • PostgreSQL uses d1 - d2
    • MySQL uses DATEDIFF(d1, d2)
    • MSSQL uses DATEDIFF(day, d2, d1)
    • Oracle uses d1 - d2
  • Months between:
    • PostgreSQL uses AGE(d1, d2)
    • MySQL uses TIMESTAMPDIFF(MONTH, d2, d1)
    • MSSQL uses DATEDIFF(month, d2, d1)
    • Oracle uses MONTHS_BETWEEN(d1, d2)
Watch the argument order

MySQL’s DATEDIFF(later, earlier) returns positive. MSSQL’s DATEDIFF(unit, earlier, later) also returns positive. Mixing these up is a classic bug — test with known dates.

1
Try Days Since Hire

Calculate how many days each employee has worked at the company. Subtract their joining_date from today.

Tables: worker

Truncating Dates with DATE_TRUNC

DATE_TRUNC is essential for time-series analysis. It rounds a date down to the start of a period — the first day of the month, the Monday of that week, the first of the quarter.

PostgreSQL
-- Truncate to month start
SELECT
  DATE_TRUNC('month', order_date) AS month_start
FROM orders;
-- Example:
-- Input: 2024-03-15
-- Output: 2024-03-01 (first of the month)

Common truncation periods:

  • ‘day’ — Start of day (strips the time component from timestamps)
  • ‘week’ — Start of week (Monday in PostgreSQL by default)
  • ‘month’ — First day of month
  • ‘quarter’ — First day of quarter (Jan 1, Apr 1, Jul 1, Oct 1)
  • ‘year’ — First day of year

Grouping by Time Period

DATE_TRUNC’s real power is in GROUP BY for time-series aggregation:

2
Try Monthly Hire Trends

Group employees by the month they joined. Truncate joining_date to month start, count hires per month.

Tables: worker

Dialect Comparison: Date Truncation

  • Truncate to month:
    • PostgreSQL uses DATE_TRUNC('month', d)
    • MySQL uses DATE_FORMAT(d, '%Y-%m-01')
    • MSSQL uses DATETRUNC(month, d)
    • Oracle uses TRUNC(d, 'MM')
  • Truncate to week:
    • PostgreSQL uses DATE_TRUNC('week', d)
    • MySQL uses DATE(d - INTERVAL WEEKDAY(d) DAY)
    • MSSQL uses DATETRUNC(week, d)
    • Oracle uses TRUNC(d, 'IW')
  • Truncate to year:
    • PostgreSQL uses DATE_TRUNC('year', d)
    • MySQL uses DATE_FORMAT(d, '%Y-01-01')
    • MSSQL uses DATETRUNC(year, d)
    • Oracle uses TRUNC(d, 'YYYY')
MySQL lacks DATE_TRUNC

MySQL doesn’t have DATE_TRUNC — you’ll need DATE_FORMAT workarounds or date arithmetic. MSSQL added DATETRUNC in SQL Server 2022; older versions need DATEADD/DATEDIFF tricks. This is one of the biggest dialect gaps.

Number of Shipments Per Month

Use DATE_TRUNC('month') or TO_CHAR(date, 'YYYY-MM') to group shipments by month, then COUNT.

Table: amazon_shipment
shipment_idsub_idweightshipment_date
1011102021-08-30
1012202021-09-01
1013102021-09-05
1021502021-09-02
1031252021-09-01
3
Number of Shipments Per Month
View solution

Write a query that will calculate the number of shipments per month. The unique key for one shipment is a combination of `shipment_id` and `sub_id`. Output the `year_month` in format `YYYY-MM` and the number of shipments in that month.

Tables: amazon_shipment

Combining Date Functions

Real analysis chains multiple operations. Here’s a pattern for calculating average session time:

PostgreSQL
-- Step 1: Calculate duration per session
-- Step 2: Average by user
WITH session_durations AS (
  SELECT
    user_id,
    session_id,
    MAX(timestamp) - MIN(timestamp) AS duration
  FROM page_loads
  GROUP BY user_id, session_id
)
SELECT
  user_id,
  AVG(duration) AS avg_session_time
FROM session_durations
GROUP BY user_id;

Users By Average Session Time

Calculate the time difference between page loads per session, then average by user. A CTE makes this cleaner.

Table: facebook_web_log
user_idtimestampaction
02019-04-25 13:30:15page_load
02019-04-25 13:30:18page_load
02019-04-25 13:30:40scroll_down
02019-04-25 13:30:45scroll_up
02019-04-25 13:31:10scroll_down
4
Users By Average Session Time
View solution

Calculate each user's average session time, where a session is defined as the time difference between a `page_load` and a `page_exit`. Assume each user has only one session per day. If there are multiple `page_load` or `page_exit` events on the same day, use only the latest `page_load` and the earliest `page_exit`. Only consider sessions where the `page_load` occurs before the `page_exit` on the same day. Output the `user_id` and their average session time.

Tables: facebook_web_log

Filtering by Date Range

Combine date arithmetic with filtering for dynamic ranges:

PostgreSQL
-- Orders from the last 30 days
SELECT *
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days';

-- This month's data (dynamic, works any day)
SELECT *
FROM events
WHERE event_date >= DATE_TRUNC('month', CURRENT_DATE);

-- Last complete month (useful for monthly reports)
SELECT *
FROM events
WHERE event_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
  AND event_date < DATE_TRUNC('month', CURRENT_DATE);
Use complete periods for reports

For reports, prefer “last complete month” over “last 30 days.” The numbers are reproducible and don’t shift daily.

Key Takeaways

  • Use INTERVAL to add or subtract time from dates
  • Date subtraction gives you the difference in days (PostgreSQL) or use DATEDIFF
  • DATE_TRUNC rounds dates to period starts — essential for time-series grouping
  • Combine DATE_TRUNC with GROUP BY for weekly/monthly/quarterly aggregations
  • Date arithmetic syntax varies dramatically by dialect — this is a major gap to watch

What’s Next

You’ve mastered date manipulation. Next, we’ll switch to strings: concatenation, extraction, trimming whitespace, and all the text cleanup operations you’ll need for messy real-world data.