A Detailed Comparison Between Inner Join vs. Outer Joins

Comparison Between Inner Join vs Outer Joins


In this article, we compare inner join vs. outer join. We will talk about their similarities and differences and show you how to use them in practice.

Fully exploiting SQL’s ability to manage and manipulate relational databases is possible only if you know how to join tables.

This will allow you to write complex queries across multiple datasets.

It goes without saying that knowing SQL JOINs also means knowing the differences between inner and outer joins.

So, in this article, you’ll find a detailed comparison between those two distinct types of joins in SQL.

What is a Join in SQL?

JOIN is an SQL clause that combines columns from two or more tables based on a related column between them.

Without it, you’re basically limited to querying only one table, which is completely useless in most real-life situations.

There are several join types in SQL:

  • JOIN
  • LEFT JOIN
  • RIGHT JOIN
  • FULL JOIN
  • CROSS JOIN

These are all different types of SQL JOINs you must know.

Some of those are inner and some are outer joins, which we’ll see in the following sections. But, first, let’s talk about the join syntax.

Inner Join vs. Outer Join: The Syntax

A general syntax of joins in SQL is

SELECTFROM table_1
JOIN table_2 
ON table_1.column = table_2.column;

The syntax doesn’t change whether you’re using inner or outer joins. It always has one table in the FROM clause followed by the join type keyword and the second table.

The tables are always joined using the ON clause. This clause is used for stating the joining condition, which is usually where one column from the first table equals another column from the second table.

Inner Join vs. Outer Join: Definition

What is Inner Join?

An inner join is a type of join that returns rows from both tables only when there is at least one match in the columns being joined. In other words, it outputs data whose keys are in the tables’ intersection. Therefore the name inner join.

JOIN is an inner type of join, so you can also write it as INNER JOIN in SQL.

Here’s how this type of join works.

Inner join vs Outer join

What is Outer Join?

Outer joins extend the functionality of an inner join. They do that by returning not only matching rows but also the non-matched rows. More specifically, they return all the rows from the dominant (outer) table and only the matching rows from the subordinate (inner) tables.

This means they will include the records outside the tables' intersection. That’s why they’re called outer joins.

All three SQL outer joins work on the same principle, but they still will show different outputs.

  • LEFT JOIN: Returns all rows from the left (first) table and the matched rows from the right (second) table. If there is no match, the result set will include NULLs on the side of the right table.
Left outer join vs inner join

  • RIGHT JOIN: Returns all rows from the right (second) table and the matched rows from the left (first) table. If there is no match, the result set will include NULLs on the side of the left table.
Right outer join vs inner join

  • FULL JOIN: Returns all the rows from the joined tables. It combines the results of both left and right outer joins and the unmatched rows from both tables.
Full outer join vs inner join

These all belong to the outer join family, so they can also be written in SQL as LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN.

Here’s a more detailed explanation on the LEFT JOIN vs. LEFT OUTER JOIN example.

Comparison: Inner Join vs. Outer Join

Here’s a comparison between the two types of joins based on their main features.

Comparison Between Inner Join vs Outer Joins

Inner Join vs. Outer Join: Use Cases and Examples

Inner Join

As I already mentioned, INNER JOIN is used when you need only matching rows from the joined tables.

Here’s one example.

INNER JOIN Example: Hosts' Abroad Apartments

The question by Airbnb asks you to find the number of hosts that have accommodations in countries of which they are not citizens.


Tables: airbnb_hosts, airbnb_apartments

Link to the question: https://platform.stratascratch.com/coding/10071-hosts-abroad-apartments

The dataset’s first table is airbnb_hosts. It’s a list of hosts and some details about them.

Table: airbnb_hosts
host_idnationalitygenderage
0USAM28
1USAF29
2ChinaF31
3ChinaM24
4MaliM30

The second table is airbnb_apartments. It’s a list of apartments connected with the first table via the column host_id.

Table: airbnb_apartments
host_idapartment_idapartment_typen_bedsn_bedroomscountrycity
0A1Room11USANew York
0A2Room11USANew Jersey
0A3Room11USANew Jersey
1A4Apartment21USAHouston
1A5Apartment21USALas Vegas

This problem should be solved by joining the tables with INNER JOIN. The tables are joined on two conditions. One is that the host ID is the same in both tables. The second is that the host’s nationality is different from the country where the accommodation is.

To get the number of such hosts, I use the COUNT() aggregate function and DISTINCT to count every host ID only once.

SELECT COUNT(DISTINCT h.host_id) AS number_of_hosts
FROM airbnb_hosts h
INNER JOIN airbnb_apartments a
ON h.host_id = a.host_id AND h.nationality <> a.country;

The output there are three hosts that satisfy the criteria.

All required columns and the first 5 rows of the solution are shown

count
3

INNER JOIN was necessary in solving this problem. Why not one of the outer joins? It’s because LEFT JOIN, for example, would include all the hosts, meaning even those who don’t meet the joining criteria, i.e., they don’t have accommodations in countries of which they’re not citizens.

We can check this by running the code that uses LEFT JOIN.

SELECT COUNT(DISTINCT h.host_id) AS number_of_hosts
FROM airbnb_hosts h
LEFT JOIN airbnb_apartments a
ON h.host_id = a.host_id AND h.nationality <> a.country;

This code shows there are 12 hosts with accommodations in countries in which they’re not citizens. You already know that’s not the correct answer, but feel free to click the ‘Check Solution’ button for yourself.

All required columns and the first 5 rows of the solution are shown

number_of_hosts
12

Outer Joins

The outer joins are used when you need the matching rows from both joined tables but also the non-matching rows from at least one table, if not both.

Let’s see examples for each of the outer joins in SQL.

LEFT OUTER JOIN Example: Find the Number of Employees Who Received the Bonus and Who Didn’t

The question by Dell & Microsoft asks you to solve the problem of finding the number of employees who received the bonus and who didn't. There are also some caveats here. You should use values from the bonus table. Also, one employee can receive more than one bonus.


Tables: employee, bonus

Link to the question: https://platform.stratascratch.com/coding/10081-find-the-number-of-employees-who-received-the-bonus-and-who-didnt

The first table in the dataset is a detailed list of employees named adequately, employee.

Table: employee
idfirst_namelast_nameagesexemployee_titledepartmentsalarytargetbonusemailcityaddressmanager_id
5MaxGeorge26MSalesSales1300200150Max@company.comCalifornia2638 Richards Avenue1
13KattyBond56FManagerManagement1500000300Katty@company.comArizona1
11RicherdGear57MManagerManagement2500000300Richerd@company.comAlabama1
10JenniferDion34FSalesSales1000200150Jennifer@company.comAlabama13
19GeorgeJoe50MManagerManagement1000000300George@company.comFlorida1003 Wyatt Street1

The second table is bonus.

Table: bonus
worker_ref_idbonus_amountbonus_date
150002020-02-16
230002011-06-16
340002020-02-16
145002020-02-16
235002011-06-16

Solving this question requires using LEFT JOIN. Why? Along with the number of employees who received bonuses, you also need to show the number of those who didn’t. As those employees won’t be found in the table bonus, the question virtually asks you to show all employees from the table employee.

The solution involves writing a subquery, so let me explain it first.

The subquery LEFT JOINs the table employee with the table bonus on the employee ID.

It then uses the CASE statement to make a distinction between the employees who got the bonus and those who didn’t. It does that by checking whether the bonus date is NULL. Remember, if the employee from the first table isn’t found in the second table, the value in the column will be NULL, i.e., he or she didn’t receive a bonus.

Knowing that (and reading the question’s instructions) will make CASE statement look like a nice little trick for solving the interview question.

SELECT e.id,
       CASE 
         WHEN bonus_date IS NULL THEN 0 
         ELSE 1 
       END AS has_bonus
FROM employee e
LEFT JOIN bonus b
ON e.id = b.worker_ref_id;

Here’s this soon-to-be subquery’s output snapshot.

All required columns and the first 5 rows of the solution are shown

idhas_bonus
50
130
110
100
190

Now, let’s turn this SELECT into a subquery and write the main query. It selects the column that labels data with 0 or 1, depending on the fact the bonus isn’t or is received.

The COUNT() function with DISTINCT will return the number of employees in each category.

SELECT has_bonus,
       COUNT(DISTINCT base.id) AS n_employees
FROM
    (SELECT e.id,
            CASE 
                WHEN bonus_date IS NULL THEN 0 
                ELSE 1 
            END AS has_bonus
     FROM employee e
     LEFT JOIN bonus b
     ON e.id = b.worker_ref_id) base
GROUP BY has_bonus;

The query shows there are 27 employees without a bonus and only three with.

All required columns and the first 5 rows of the solution are shown

has_bonusn_employees
027
13

If you don’t trust me that you shouldn’t use INNER JOIN here, maybe you can try it yourself. Run the code below, rewritten with INNER JOIN. You’ll see it will return only the employees who received a bonus.

SELECT has_bonus,
       COUNT(DISTINCT base.id) AS n_employees
FROM
    (SELECT e.id,
            CASE 
                WHEN bonus_date IS NULL THEN 0 
                ELSE 1 
            END AS has_bonus
     FROM employee e
     INNER JOIN bonus b
     ON e.id = b.worker_ref_id) base
GROUP BY has_bonus;

All required columns and the first 5 rows of the solution are shown

has_bonusn_employees
13

RIGHT OUTER JOIN Example: Products With No Sales

Here’s a question that wants you to use RIGHT JOIN. It’s a Salesforce and Amazon question, where you have to find products that have had no sales.


Tables: fct_customer_sales, dim_product

Link to the question: https://platform.stratascratch.com/coding/2109-products-with-no-sales


The first table in the dataset is fct_customer_sales.

Table: fct_customer_sales
cust_idprod_sku_idorder_dateorder_valueorder_id
C274P4742021-06-281500O110
C285P4722021-06-28899O118
C282P4872021-06-30500O125
C282P4762021-07-02999O146
C284P4872021-07-07500O149

The second one is dim_product.

Table: dim_product
prod_sku_idprod_sku_nameprod_brandmarket_name
P472iphone-13AppleApple IPhone 13
P473iphone-13-promaxAppleApply IPhone 13 Pro Max
P474macbook-pro-13AppleApple Macbook Pro 13''
P475macbook-air-13AppleApple Makbook Air 13''
P476ipadAppleApple IPad

The solution RIGHT JOINs the fct_customer_sales table with the dim_product table. The tables are joined on the column prod_sku_id.

This way of joining tables will get you the list of all products from the table dim_products. In case some products can’t be found in the fct_customer_sales, they will be shown as NULL.

And this is exactly why the condition in WHERE includes only NULL values, as they represent products without sales.

SELECT p.prod_sku_id,
       p.market_name
FROM fct_customer_sales s
RIGHT JOIN dim_product p 
ON s.prod_sku_id = p.prod_sku_id
WHERE s.prod_sku_id IS NULL;

The code shows four products without sales.

All required columns and the first 5 rows of the solution are shown

prod_sku_idmarket_name
P473Apply IPhone 13 Pro Max
P481Samsung Galaxy Tab A
P483Dell XPS13
P488JBL Charge 5

Now, you might ask if it is possible to get the same result using LEFT JOIN. It is, as LEFT JOIN and RIGHT JOIN are mirror images of each other.

The above code can be amended this way. I’ve only switched the order of the tables and changed the join type from RIGHT JOIN to LEFT JOIN.

SELECT p.prod_sku_id,
       p.market_name
FROM dim_product p 
LEFT JOIN fct_customer_sales s 
ON s.prod_sku_id = p.prod_sku_id
WHERE s.prod_sku_id IS NULL;

As you can see, the output is exactly the same as earlier.

All required columns and the first 5 rows of the solution are shown

prod_sku_idmarket_name
P473Apply IPhone 13 Pro Max
P481Samsung Galaxy Tab A
P483Dell XPS13
P488JBL Charge 5

FULL OUTER JOIN Example: New Products

In this question by Salesforce and Tesla, you need to count the difference between the number of products companies launched in 2020 compared to the same number from the previous year.


Table: car_launches

Link to the question: https://platform.stratascratch.com/coding/10318-new-products

The only table you get is car_launches.

Table: car_launches
yearcompany_nameproduct_name
2019ToyotaAvalon
2019ToyotaCamry
2020ToyotaCorolla
2019HondaAccord
2019HondaPassport

There are two subqueries in the solution. The first one outputs the company and product names that were launched in 2020.

SELECT company_name,
       product_name AS product_2020
FROM car_launches
WHERE year = 2020;

The second subquery does the same but for the year 2019.

SELECT company_name,
       product_name AS product_2019
FROM car_launches
WHERE year = 2019;

I have to join these two subqueries somehow so that I can use COUNT() to find the difference between the two years. The right choice is to use FULL OUTER JOIN. Why? Remember, it’s a join that outputs all the rows from both tables. I need exactly that!

Now I can use COUNT() and DISTINCT to count products launched in 2020 and 2019 separately and subtract two results.

I group the output by the company name because the difference between the products launched must be shown on a company level; as required by the question.

To make the output more readable I sort it by the company name alphabetically.

SELECT a.company_name,
       (COUNT(DISTINCT a.product_2020)-COUNT(DISTINCT b.product_2019)) AS net_products
FROM
  (SELECT company_name,
          product_name AS product_2020
   FROM car_launches
   WHERE year = 2020) a
FULL OUTER JOIN
  (SELECT company_name,
          product_name AS product_2019
   FROM car_launches
   WHERE year = 2019) b ON a.company_name = b.company_name
GROUP BY a.company_name
ORDER BY company_name;

Here’s the net difference by company.

All required columns and the first 5 rows of the solution are shown

company_namenet_products
Chevrolet2
Ford-1
Honda-3
Jeep1
Toyota-1

Inner Join vs. Outer Join: Advantages and Disadvantages

When talking about inner join vs. outer join advantages and disadvantages, there really aren’t any per se.

Which join you will choose really depends on what you need. So, you must be very clear about what you want to achieve. If you choose the wrong join type then its characteristics will, obviously, become disadvantages for you.

Advantages and Disadvantages in Inner Join vs Outer Join

Tips for Optimizing Queries With Joins

There are some ways of optimizing queries where you use joins. The most obvious and important one is to choose the join type according to your needs. In other words, don’t use an outer join when an inner join suffices. That way, you’ll avoid outputting unnecessary data and slowing down your query.

Other tips for query optimization are:

1. Use Specific Column Names: In SELECT, specify only the columns you need instead of using SELECT *. Of course, the query is faster when selecting fewer columns.

2. Filter Early: Apply the WHERE clause before joining. That way, you can reduce the size of the dataset being joined.

3. Indexing: If the columns used for joining are indexed, this can significantly reduce the query’s running time.

4. Analyze Query Plans: Use your database’s features to analyze a query’s execution plan and optimize it accordingly.

Conclusion

Inner and outer join are two of the most used join families in SQL. Knowing their differences and distinct uses separates an SQL noob from an expert. Choose the wrong join type, and you’ll end up with completely skewed results and your boss jumping down your throat.

To avoid this uncomfortable situation, make sure you’re fluent in SQL joins. For that, use StrataScratch’s coding questions for practice and blog articles for theoretical explanations.

Comparison Between Inner Join vs Outer Joins


Become a data expert. Subscribe to our newsletter.