SQL CTEs: Usage, Advantages, and Drawbacks

What is a SQL CTE


All about SQL CTEs or Common Table Expressions, their types, and usage. We will also discuss the advantages and disadvantages of using them in your SQL queries.

When working with immense data in SQL, the complexity of the queries also increases proportionally. To be able to make head or tail of it, it helps to create temporary tables within the query to which you can refer, and that’s exactly what Common Table Expressions do. In this article, we will explore what SQL Common Table Expressions (CTE) are, how they function and we will enlist their advantages and disadvantages.

What is a Common Table Expression (CTE) in SQL?

SQL CTEs or Common Table Expressions are nothing but temporary tables that you can use within a query. A CTE is defined within the context of a single query. These temporary tables have their own records and columns that get created during the execution and are cleared after execution. They can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement. Most importantly, using CTEs help create neater queries with improved performance.

Although CTEs and subqueries are alike, the former is way more useful and flexible. Before CTEs were introduced in SQL:2003, subqueries were the only option. But subqueries make the queries pretty convoluted and hard to understand or visualize. Thanks to CTEs, we can now name and reference the subqueries as and when needed in the main query.

Syntax of Common Table Expressions (CTE) in SQL

Syntax of Common Table Expressions or CTE in SQL

Without much ado, let us get right into the syntax of SQL Common Table Expression (CTE).

WITH cte_name (column_1, column_2, column_3,...)   
AS (subquery)     

SELECT * 
FROM cte_name;

Firstly, the ‘WITH’ keyword is used to define the CTE, and then the name of the CTE is declared. Column names can be mentioned along with the cte_name, but they are optional. Secondly, the ‘AS’ keyword is used to particularly mention the subquery that will go on to generate the temporary result set. Finally, the SQL CTE is ready to be referred to in the main query or any of the other subqueries.

Recursive CTE in SQL

Here’s a type of CTE that references itself, repeatedly. Here’s the syntax for a recursive CTE:

WITH expression_name (column_list)
AS
(
    -- Anchor member
    initial_query  
    UNION ALL
    -- Recursive member that references expression_name.
    recursive_query  
)
-- references expression name
SELECT *
FROM   expression_name

When working with recursive CTEs in SQL, the query is divided into two components: the base case and the recursive case. The base case represents the first query that returns the initial set of results. The recursive case, on the other hand, is the query that is run repeatedly until the desired outcome is achieved.

For the recursive CTE to function, it must have an anchor member representing the initial query, and a recursive member representing the query that is executed repeatedly. To perform the recursive operation, the recursive member needs to refer back to the CTE itself.

Recursive CTE in SQL

Recursive CTEs come in pretty handy in hierarchical situations where each of the records has a parent-child relationship with another record. To ensure that the recursion stops once the desired result set is retrieved, a termination condition can be included in the WHERE clause of the recursive member in order to filter out the records that fail the termination condition.

They can be used to perform various operations, such as generating reports, calculating totals, and traversing complex data structures. However, it is important to use them judiciously, as they can be resource-intensive and can slow down performance if used improperly.

How SQL CTEs can help you simplify your work

How SQL CTEs can help you simplify your work

One thing that you can be sure of when using a SQL CTE in your query is reduced complexity. The queries gain structure and make it easy for you to visualize the data in these virtual tables as they flow through the query in its entirety. They help you organize the code by splitting the query into bite-sized logical chunks that make it easier for you or anyone else working on or maintaining your query to understand.

Examples of Common Table Expressions (CTEs) in SQL

Here are a few examples of Common Table Expressions in SQL. Let’s examine a simple example of a CTE in a non-recursive use-case:


Table: boi_transactions

Link to the question: https://platform.stratascratch.com/coding/2143-invalid-bank-transactions

Let’s examine this query.

WITH cte AS(
    SELECT 
        *,
        TO_CHAR(time_stamp, 'day') AS day_of_week,
        TO_CHAR(time_stamp, 'DD') AS day,
        CAST(time_stamp AS time) AS time
    FROM boi_transactions
    WHERE TO_CHAR(time_stamp, 'YYYY-MM') = '2022-12')
    
SELECT
    transaction_id
FROM cte
WHERE day_of_week LIKE any(array['saturday%', 'sunday%']) OR NOT time BETWEEN '09:00:00' AND '16:00:00' OR day BETWEEN '25' AND '26'

All required columns and the first 5 rows of the solution are shown

transaction_id
1095
1129
1130
1131
1132

In this question, given the bank’s normal working hours and public holidays, we are asked to detect the invalid transactions from the month of December 2022. Here, an invalid transaction is defined as that which occurs outside normal working hours.

We are only provided with one dataset, namely, boi_transactions that contains transaction IDs and corresponding timestamps. To solve this problem, we are segmenting the timestamp into day_of_week, day, and time in order to filter only the timestamps that occur within December 2022. For this operation, we use a CTE to store the timestamp segments in these temporary columns.

From this CTE or temporary result set, we write another WHERE clause in the main query that will filter out the holidays and select transactions that occurred outside of the working hours using time, day, and day_of_week columns.

Can we use multiple CTEs in the same SQL query?

Absolutely! The more complex the query, the better the idea to use multiple SQL CTEs. Multiple CTEs can be defined by separating them using commas so that they can be referenced in the main query as required.

Let us take a look at how to define multiple CTEs in the same query and join them in the main query.

WITH CTE_1 AS
( SELECT ...
	FROM ...
	WHERE ...),

CTE_2 AS
(SELECT ...
FROM ...
WHERE ...)

SELECT ...
FROM CTE1
JOIN CTE2
ON ...
WHERE ...

It is worth noting that multiple CTEs can be used in a single line of the main query as long as you define them before referencing them.

Check out this example with multiple queries:


Tables: user_flags, flag_review

Link to the question: https://platform.stratascratch.com/coding/2105-videos-removed-on-latest-date

Let us look into the solution to this question.

WITH latest_dates AS
  (SELECT user_firstname,
          user_lastname,
          max(reviewed_date) latest_date
   FROM user_flags AS uf
   INNER JOIN flag_review AS fr ON uf.flag_id = fr.flag_id
   WHERE reviewed_by_yt
   GROUP BY 1,
            2),            
n_removed_by_date AS
  (SELECT reviewed_date,
          count(DISTINCT video_id) n_removed
   FROM user_flags AS uf
   INNER JOIN flag_review AS fr ON uf.flag_id = fr.flag_id
   WHERE LOWER(reviewed_outcome) = 'removed'
   GROUP BY reviewed_date)
   
SELECT ld.*,
       coalesce(nr.n_removed, 0) n_removed
FROM latest_dates ld
LEFT JOIN n_removed_by_date nr ON ld.latest_date = nr.reviewed_date

All required columns and the first 5 rows of the solution are shown

user_firstnameuser_lastnamelatest_daten_removed
William Kwan2022-03-141
DanielBell2022-03-161
GinaKorman2022-03-172
RichardHasson2022-03-180
EvelynJohnson2022-03-172

In this example, we have created two CTEs, named latest_dates and n_removed_by_date. Each of these SQL CTEs provides us with the required data that we will reference and use in the main query. In this case, it is the latest_date column from the latest_dates CTE and the n_removed from the n_removed_by_date CTE that is required for our final output.

Practice more such real life questions here “SQL Interview Questions” and “SQL Scenario Based Interview Questions”.

Advantages and Drawbacks of SQL CTEs:

 Advantages and Drawbacks of SQL CTEs

SQL Common Table Expressions (CTEs) have a number of advantages that make them an increasingly popular feature of modern databases.

One of the biggest benefits of SQL CTEs is their reusability. Unlike subqueries, CTEs can be used multiple times within the same query, simplifying complex queries and improving query performance. By breaking up queries into smaller, logical pieces, SQL CTEs can also enhance readability, making code easier to understand and maintain. This can reduce the risk of SQL coding errors and improve overall code quality.

Memory consumption is another benefit, as CTE results are not stored in memory – they are available only after running the CTE.

While SQL CTEs are now a standard feature in many modern databases, they are not universally supported by all database platforms. Novice developers may also find CTEs more difficult to write and understand than traditional SQL queries due to their syntax complexity.

Another drawback is that the CTE result can’t be used by another SQL statement.

Overall, CTEs can simplify complex queries, improve query performance, enhance code readability, and reduce memory consumption. However, the potential drawbacks of SQL CTEs include the result not being reusable in another statement, limited support, and syntax complexity. It is important to carefully consider the specific needs of a project before deciding whether to use CTEs in database queries.

Summary

We dove into the uses, pros, and cons of Common Table Expressions (CTEs) in SQL. While using CTEs in SQL may reduce the performance of the query, the pros far outweigh the cons. Make your queries neat, readable, and reusable with the effective use of CTEs, and watch the quality of your queries transform to the next level.

Find more technical concepts, like CTEs, here “SQL Cheat Sheet”.

What is a SQL CTE


Become a data expert. Subscribe to our newsletter.