Get data science interview questions delivered to your inbox
Please enter a valid email address.

SQL Interview Questions You Must Prepare: The Ultimate Guide

Are you wondering what SQL interview questions you will be asked? This ultimate guide will take you through the top SQL interview questions for various data positions and the tips to approach your next SQL interview.

This interview guide should be useful to anybody preparing for the SQL job interview. I will approach this guide by going through the SQL interview question types and explaining what topics these questions cover and how to approach answering them. Everything you read will also be backed up by several real-life interview questions from big companies. Finally, I will provide you with question examples, solutions, and solution explanation.

That way you can prepare both on a theoretical and practical level. But, of course, this guide is not only for those preparing for the job interview. You can use it to brush up your skills, practice some concepts you’re not too familiar with, or even make your own job interview questions if you want to employ someone.

If you’re a beginner, maybe you don’t fully grasp how new or important SQL is. And what its purpose is. So if you’re interested in working with data in any capacity, I would advise you to start learning SQL. This guide could be a good starting point for you to see where SQL is required, what its possibilities are, and who needs to use it.

Let’s see why SQL is required for interviews.

Why SQL Questions in the Interviews?

We live in a highly computerized world where we’ve become more and more dependent on technology. For a technology to work, it needs data, which is currently collected frantically on every level of human existence. To store such a humongous amount of data (with growth potential), you need to store that data somehow. That’s where the databases enter the room. Databases have entered our lives and are everywhere, even though most of the time, we don’t realize that.

Relational databases and SQL are not that new. They were both developed in the early 1970s. So they’re now 50 years old. But even now (or maybe especially now!) SQL is still the most popular language in handling databases.

Out of the ten most popular databases in May 2021, the four most popular use SQL. Those are Oracle, MySQL, Microsoft SQL Server, and PostgreSQL. It’s not only that. Among the top ten most popular databases, there are seven databases that use SQL. Only three in the top ten don’t use SQL: MongoDB, Redis, and Elasticsearch.

Data-driven. Aaaah, I hate this phrase. But, unfortunately, you probably see it more and more if you’re looking at the job ads. It’s irritating and overused, but it shows you that even “normal” companies, not to mention the high-tech ones, are realizing how important data is. How important SQL is.

OK, but how often does SQL appear in the job interview questions? Our analysis of 903 data science interview questions showed the most prominent interview question category is coding. It comprises 34% of all the interview questions.

The most prominent concept tested in the coding segment was writing SQL queries. So you could easily put the equal sign between the coding questions and the SQL questions.

The answer to the question of how often does SQL appear in the interview question would be: so often it comprises more than one-third of all the data science interview questions.


This is the answer to why you should know SQL if you want to come near any data job. But why is SQL so popular?

Why is SQL so Popular in Working With Data?

One of the reasons for sure is that SQL has been around for some time. Like I said, it’s around 50 years old. During that time, it has been tested in different scenarios, and it showed to be very reliable.

It’s not surprising when you take into account that SQL was created with the purpose of handling data and databases. Its creators obviously did a great job.

SQL is also quite easy to learn and easy to use. To help you with that, there is a vast community that shares knowledge and makes using SQL even easier.

The fact that it’s an open-source programming language is probably one of the reasons for its popularity.

How to Approach the SQL Job Interview?

Being good at SQL is the prerequisite to do well at the job interview. However, it’s not the only skill. Questions can be tricky, designed to put you off or doubt your knowledge by being seemingly too complicated or too simple.

That’s why it’s important to have a clear strategy on what to do in certain situations.

1. Make Sure you Understand What is Required

If you don’t understand what the question is and what is expected from you, you’ll probably get the wrong answer. To avoid that, make sure you understand what is asked of you. Repeat the requirements out loud and ask the interviewer to confirm you understood the question correctly. Don’t be afraid to do that. The interviewers are people too. They can also unintentionally be unclear, make a mistake, or forget to give you enough details for you to answer correctly.

2. Outlay Your Approach

Before you start answering, especially if you’re writing a SQL code, outlay your approach. That way, you’ll be able to find the solution easier or find the holes in the code you intended to write. You should do that to allow the interviewer to lead you through in case you missed the point of the question. It’s always better to be corrected before presenting the final solution.

3. Try to Visualize the Output

This is something that can help you in writing the problem-solving code. Sometimes, when you clarify how the output should look and compare it with initial data, the approach and the solution reveal themselves.

4. Write the SQL Code

At some point, you’ll need to start writing the code. As I said, you shouldn’t just jump to it. But you also can’t avoid writing it forever. After you’ve gone through all those previous steps, and you’re still not sure if you have the right solution, simply start writing the code. One of the reasons is, sometimes there’s no solution at all. Meaning the question is too complex to be solved in the time you’re being given. In such cases, the interviewer is not interested in your solution. Instead, he or she is interested in your way of thinking and how you approach the problem. There are usually multiple ways to use SQL for problem-solving, and this is what some interviewers are interested in: the process, not the goal.

One of the reasons you have to start writing the code eventually is that it’s better to provide some solution than trying to find the perfect solution and writing nothing.

5. Code in Logical Parts

When you’re writing the code, pay attention to its structure. Divide the code into logical parts. That way, you’ll make your code easier to read, which is also one of the requirements to get the job. There’s no point in writing a correct code that is a mess, and nobody can read it and understand it after you write it. Not even you. If your code is divided into logical parts, it will be easier for you to explain to the interviewer what you did.

6. Optimize Code

It’s also important to have the code’s optimization in mind. If your code is complex, of course, you’re not going to be able to optimize it as you write. But you should pay attention to some general optimization guidelines, so your code is reasonably optimized. You can also discuss with the interviewer what other steps you’ll have to take to optimize your code in the aftermath. This is also the job requirement, similarly to the previous point. There’s no point in writing the code that will get you the required result, but it takes forever to execute it and is unreadable.

7. Explain Your Answer and Assumptions

Even if you didn’t get the required answer, it doesn’t mean you failed the interview. That’s why you should always know why you did something and explain why you did it. Maybe you didn’t get the answer to the question they asked, but you did get the answer to some questions. So make sure that you state your assumption and explain why you did what you did. Again, they may be looking exactly for that: the right reasoning in line with assumptions, even though the assumptions were wrong. That also shows you know what you’re doing, even if it’s not what they asked.

Also, one of the reasons for explaining the assumptions is there may be a loophole in the question. So imagine pointing at it right there at the interview by simply explaining why you did something while you thought you were all wrong.

Here’s the video where we share some tips on how to organize your SQL interview solution:

SQL Interview Question Types

There’s no only one type of question that can be asked in the interview. There are plenty of distinguished layers the SQL can be divided into. Of course, every company has, more or less, its own approach to testing SQL knowledge. It depends on how widely the SQL is being used in the organization.

It also depends on which position you’re applying to. Different positions require different types and different levels of SQL knowledge. For instance, if you're a reporting analyst, you for sure won’t be asked the same questions as a data scientist, SQL software developer, database administrator, or a machine learning engineer.

The SQL interview questions across the different industries and positions can be categorized like this:

  • theoretical SQL interview questions
  • SQL basics interview questions
  • SQL reporting interview questions
  • problem-solving SQL interview questions
  • ETL interview questions
  • database modeling interview questions
  • logic-based SQL interview questions

Analysts and Data Scientists – What SQL Interview Questions to Expect?

If you work as an analyst, you probably already use SQL, so you know it’s almost a standard requirement. It doesn’t matter if you’re a data analyst, reporting analyst, product analyst, or even financial analyst. Your position generally requires handling the raw data and using your skills to provide the management and other stakeholders with the decision-making insight.

The five main types of questions that you should expect are:

  • theoretical SQL interview questions
  • SQL basics interview questions
  • SQL reporting interview questions
  • problem-solving SQL interview questions
  • ETL interview questions

Theoretical SQL Interview Questions

These are the questions that don’t require any code writing. They will test your knowledge of certain basic database and SQL terms. Concepts you should prepare are:

  • general information about SQL
  • general information about the (relational) databases and how they work
  • SQL sublanguages and their main keywords
  • data types and how SQL handles it (including blanks and NULLs)
  • attribute constraints
  • types of JOINs
  • aggregate functions

Questions Example

Some of the theoretical questions you should expect are:

  • What is SQL?
  • What is a relational database?
  • What is a primary key?
  • What do DDL, DCL, and DML stand for? Give examples of commands for each.
  • What are the common data types in SQL?
  • What are attribute constraints, and explain them.
  • What is the difference between inner join and left outer join?
  • What are the aggregate functions?

SQL Basics Interview Questions

The SQL basics interview questions will generally ask you to put some of the above theoretical questions into practice. That doesn’t mean these questions have to be coding questions. They can be descriptive too. But they usually cover concepts you’ll need to know if you want to write a code. Those concepts are:

  • using SUM(), COUNT(), AVG(), MIN(), MAX() and other aggregate functions
  • JOINs
  • GROUP BY
  • CASE WHEN statement
  • WHERE and HAVING
  • UNION and UNION all

Questions Example

One of the examples is the question they ask at Postmates:

“How many customers placed an order and what is the average order amount?”

Answer: To answer this question you’ll have to use the table postmates_orders.

SELECT 	 count(DISTINCT customer_id),
       	avg(amount)
FROM postmates_orders

As you can see, this is really an easy one. It tests the aggregate functions COUNT() and AVG(), and it also requires the knowledge of how to use the DISTINCT clause.

Here’s another one, this time from Credit Karma :

“Write a query that returns the user ID of all users that have created at least one ‘Refinance’ submission and at least one ‘InSchool’ submission.”

Answer: To answer this question you’ll have to use the table loans.

SELECT user_id
FROM loans
WHERE TYPE in ('Refinance',
               	'InSchool')
GROUP BY user_id
HAVING count(DISTINCT TYPE) =2

This code selects the column user_id from the table loans where the column type equals two values: “Refinance” and “InSchool”. After you’ve filtered the data, you’ll have to group it on a user level because that’s what you’re asked to do. Since your task is to get only those users with at least one of both submissions, you’ll have to use the HAVING clause with the COUNT() functions and DISTINCT clause.

SQL Reporting Interview Questions

Reporting sounds like something that the reporting analysts should use a lot. The purpose of reporting is to transform data into information. This information is presented in the form of reports, dashboards, and charts. What you report are different metrics that need to be monitored. So, basically, your two main tasks will be aggregating and filtering data and performing various calculations on that data.

While strictly speaking, the data analysts are not reporting but analyzing the data, those two main tasks basically remain the same for the data analysts too.

Because of that, expect these reporting questions to be most common if you’re a reporting or data analyst. But what are the concepts that fit into this category?

If you’re a reporting analyst, make sure you go through these concepts before the interview:

  • subqueries
  • joins and self-joins
  • window functions
  • CTEs
  • GROUP BY extensions (ROLLUP, CUBE, GROUPING SETS)

Questions example

One of the moderately complex reporting questions is this one from Zillow:

“Write a query that identifies cities with higher than average home prices when compared to the national average. Output the city names.”

Answer: To answer this question you’ll have to use the table zillow_transactions.

SELECT city
FROM zillow_transactions a
GROUP BY city
HAVING avg(a.mkt_price) >
  (SELECT avg(mkt_price)
   FROM zillow_transactions)
ORDER BY city ASC

In this code, you’ll have to know how to write a subquery in the HAVING clause. The code selects only the column city from the table zillow_transactions and groups the data on the city level. It then filters the result using the HAVING clause. In the HAVING clause, you’ll first have to calculate the average market price for every city, which you’ll do by using the aggregate function AVG(). Then you have to compare it with the national average. You’ll get the national average using the AVG() functions in the subquery. Finally, the data is ordered by the city in ascending order.

A little more complex question is the one from Lyft:

“Find the top 10 users that have traveled the greatest distance. Output their names and total distance traveled.”

Answer: To answer this question you’ll have to use the tables lyft_rides_log, lyft_users.

SELECT name, traveled_distance
FROM
    (SELECT lu.name,
           SUM(lr.distance) AS traveled_distance,
           rank () over (order by SUM(lr.distance) desc) as rank
    FROM lyft_users AS lu
    INNER JOIN lyft_rides_log AS lr ON lu.id = lr.user_id
    GROUP BY lu.name
    ORDER BY traveled_distance DESC
    ) sq
WHERE rank <= 10

Of those SQL reporting concepts, this question tests subqueries, window functions, and joins. First of all, you should write the subquery in the FROM clause. In this subquery, you’re finding the sum of the distance traveled by every user. Then you’re using the RANK() window function to rank this total distance per user in descending order. To get the data, you have to join tables. And you should do that using the INNER JOIN keyword because you only want the matching rows from both tables. Afterward, the data is grouped by the user name and ordered in a descending order. Finally, you’ll have to filter the result using the WHERE clause to get only the top ten users.

In this video, we show you some top data science interview questions, how to solve them and avoid common mistakes.

Problem-Solving SQL Interview Questions

The problem-solving interview questions might seem the same as the reporting questions at first sight. I admit they have one thing in common. That is the requirement to write an SQL query that will return some metric(s) as a result, the same as with the reporting SQL questions.

However, there’s one big difference. When asked the reporting SQL question, you’ll be given a very specific calculation that your output needs to show. Like in the above examples, the average home prices or the top ten distances traveled. In the problem-solving SQL questions, you’ll not get that. You’ll be asked to find the insight. But it’ll be up to you to understand your data and what calculation answers what you’re being asked. For example, you’ll have to find out if some product launch campaign succeeded, or new calling procedure saves the costs, or if new vehicles improved the users’ satisfaction. It will be up to you to think of metrics that will be equal to “success”, “saving”, or “satisfaction improvement”.

Compared to the reporting SQL questions, these questions have this extra dimension designed to test your thinking in solving the problem. That’s why they’re called problem-solving questions.

Regarding the coding part of the problem-solving questions, they test all the concepts you’ll use in the basic level and the reporting SQL questions. Depending on the question's difficulty, of course.


Questions Example

Have a look at this question asked by Facebook:

“Facebook has developed a search algorithm that will parse through user comments and present the results of the search to a user. To evaluate the performance of the algorithm, we are given a table that consists of the search result the user clicked on ('notes' column), the user's search query, and the resulting search position that was returned for the specific comment.
The higher the position, the better, since these comments were exactly what the user was searching for. Write a query that evaluates the performance of the search algorithm against each user query.”

Answer: To answer this question you’ll have to use the table fb_search_results.

SELECT t.result_id,
      t.query,
      CASE
          WHEN t.check = FALSE THEN 1
          WHEN t.check = TRUE
              AND t.position >= 11 THEN 2
          WHEN t.check = TRUE
              AND (t.position BETWEEN 6 AND 10) THEN 3
          WHEN t.check = TRUE
              AND (t.position BETWEEN 4 AND 5) THEN 4
          WHEN t.check = TRUE
              AND t.position <=3 THEN 5
      END AS rating
FROM
  (
SELECT query,
              result_id,
              position,
              notes,
              (regexp_replace(notes, '[^\w]+',' ','g') ilike concat
              ('% ', query,' %')) AS check
   FROM fb_search_results
  ) t

Of the important coding concepts, this one includes the CASE WHEN statement and writing the subquery in the FROM clause. In this subquery, you’re selecting columns from the table fb_search_results. There’s one additional column called check. It uses the REGEXP_REPLACE, ILIKE, and CONCAT function to check if the term the user searched for (column query) is the same as in the comment (column query). The main query uses this check column in the CASE WHEN statements. If the result is false, then this row will get a rating of 1. If it’s true and the position is equal or above 11, give a rating of 2. And so on, with the ratings being: between 6 and 10 = 3, between 4 and 5 = 4, between 1 and 3 = 5.

This is a problem-solving question because you weren’t given the metrics which would differentiate excellent performance from not-so-great one. Therefore, you had to decide on your scale that will show how the performance was. Your scale could be different from the one in the solution above. However, it’s important that you explain your assumptions and why you decided to go with a certain evaluation scale.

ETL Interview Questions

ETL is short for Extract, Transform, Load. What does it mean in the context of data and SQL? It is a process used to collect data from various sources (extract), changing it according to the business rules (transform), and then loading such extracted and transformed data into a database.

When the data is extracted, it is done so from various data sources that, more often than not, store data in completely different formats.

By transformation, the data takes the format appropriate for reporting and analysis. The data is transformed via data aggregation, filtering, sorting, joining, a calculation based on the rules set for business needs, etc.

Such data is loaded into another database or table that the analysts or any other users might use.

The ETL is heavily used in data warehouses, which serves as the central source of the integrated data, with data flowing into it from one or more separate sources. Since data warehousing is a core component of business intelligence, it’s no wonder the ETL SQL questions analysts would use.

If you want to perform well at the SQL job interview, these are the ETL concepts you need to know:

  • Data Definition Language (DDL) keywords
  • Data Manipulation Language (DML) keywords
  • Data Control Language (DCL) keywords
  • Transaction Control Language (TCL) keywords
  • SQL constraints
  • JOINs
  • indexes
  • transactions
  • views
  • user-defined functions
  • stored procedures
  • triggers
  • variables
  • query optimization


Question Examples

One of the common (and the easiest) question is this one from Southwest Airlines:

“What is the difference between DELETE and TRUNCATE?”

Answer: DELETE is a DML statement. TRUNCATE is a DDL statement. The DELETE statement can be used to delete all rows or only some rows. To delete some rows, you’ll have to use the WHERE clause. While doing this, every row removed will be logged as an activity by the database. On the other hand, TRUNCATE is used only for deleting the whole table, which will be logged as only one action. That’s why TRUNCATE is faster than DELETE, which shows when deleting a table with a huge amount of data. Also, you can’t use TRUNCATE if there’s a foreign key in the table.

Another question could be:

“How do you change a column name by writing a query in SQL? ”

Answer: I’m assuming you’re using PostgreSQL. I’m also assuming there’s a table named product. One of the columns is year, but I want to rename it to description. The query that will do that is:

ALTER TABLE product
RENAME year TO description;

Another example of ETL SQL questions can be:

“How do you create a stored procedure?”

Answer: Let’s assume you’re now using the Microsoft SQL Server. For example, you’re using the table employee. Your procedure should help you get the employees that work in a certain department. Here’s the code:

CREATE PROCEDURE employee_deparment @deparment nvarchar(50)
AS
SELECT * FROM employees WHERE department = @department
GO;

Once the procedure is created, I can run it this way:

EXEC employee_deparment @department = 'Accounting';

Data Engineers – What SQL Interview Questions to Expect?

Data engineers work closely with data scientists and data analysts. Their main task is building and maintaining the data architecture and creating algorithms to allow data scientists and analysts easier access to data. By doing their job, they’re helping data scientists to do their job too. What you can’t avoid as a data engineer is knowing SQL, often on an advanced level compared to data analysts and scientists. To be a good data engineer, you need to be an SQL master. That’s why some questions you’ll be asked are the same as with data analysts and scientists. However, there’s one additional question type that tests one of the main data engineers’ requirements: database modeling. Here are three main SQL question types you should be ready for:

  • SQL reporting interview questions
  • ETL interview questions
  • database modeling interview questions

The data engineers need to know how to optimize data retrieval and maybe, depending on the organization and job description, be responsible for reporting and creating dashboards. That’s why they need to be ready to answer the reporting SQL questions.

Also, knowing how to answer the ETL SQL questions has to do with building the data architecture and allowing data access. That’s when they’ll be using a lot of extracting, transforming, and loading of data. But you already know which concepts these question types generally cover.

However, we haven’t talked about the database design SQL interview questions so far.

Database Modeling Interview Questions

These questions are designed to test how good you are at database design or database modeling. What is meant by that? You need to show the ability to design and build the database from scratch according to the business processes and business needs. This requires a high level of both technical and business knowledge. You’ll be working with both technical and non-technical colleagues. So you need to understand both the business side of their requirement and how to, in the soundest way, technically cater to their business needs regarding the data. Generally, this is a process that goes through these steps (at least in the ideal world):

  1. defining the database purpose
  2. collecting and defining users’ requirements
  3. creating conceptual model
  4. creating logical model
  5. creating physical model

Question Examples

One of the typical questions that occur in the interviews is this one by Audible:

“Can you walk us through how you would build a recommendation system?”

Answer: Since there is such a variety of approaches to answer this question, we will leave you to come up with your own way of building one.

The database design question can also include SQL coding, such as this one from Facebook:

“Write a SQL query to compute a frequency table of a certain attribute involving two joins. What if you want to GROUP or ORDER BY some attribute? What changes would you need to make? How would you account for NULLs?”

Answer: Due to the nature of the question, we’ll let you answer this one on your own.

Machine Learning Engineers – What SQL Interview Questions to Expect?

Machine learning engineers are some kind of hybrid experts who are bridging the gap between data scientists and software engineers. As they serve as a bridge between those two positions, they need to have a certain set of skills from both worlds. They’ll use those skills to design, build, and maintain the machine learning systems. To achieve that, they usually use several sets of skills:

  • statistics
  • mathematics
  • data mining
  • data analysis
  • predictive analytics

Machine learning engineers usually need to know Python and/or R. However, since machine learning engineers and data scientists have some skills in common (data mining, data analysis), it’s quite often required from the machine learning engineers to know SQL as well. That way, they are able to make their own analyses and use the data according to their needs. They don’t need some intermediary who’ll pull out the data and analyze it for them.

The SQL interview questions for machine learning are usually one of the two types:

  • SQL reporting interview questions
  • ETL interview questions

Software Engineers – What SQL Interview Questions to Expect?

The software engineers’ skills are also usually at an intersection of various skillsets, such as computer science, engineering, and mathematics. They use those different disciplines to design, write, test, and maintain the software. Like the machine learning engineers, they’ll also need to work with various departments and clients. That’s why they too need a high level of business and technical skills. Even though their primary task is not data analysis, they need to have data analysis skills. Why is that? When they build the interface, they have to lean on the database(s) that run in the background. They’ll need to use those databases and analyze data during the implementation of new software.

If you apply for a software engineer job, you’ll most likely be asked questions from one of those (or all) categories:

  • SQL basics interview questions
  • ETL interview questions
  • database modeling interview questions
  • logic-based SQL interview questions

Now, you’re familiar with the first three question types. But what about the logic-based SQL questions? How do they differ from other questions?

Logic-based SQL Interview Questions

Those are the questions that require SQL knowledge, but usually, what is being asked is not too useful in practice. They are used at the interview because, as a software engineer, you won’t be writing SQL codes every day. SQL simply is not what you need to be the best at. However, you still need to have very strong SQL skills but more on logic than a coding side. When you talk to other colleagues who use SQL daily, you need to follow what they’re talking about and implement their needs and SQL logic into your software.

That’s why the question such as the following one could occur at your job interview:

Imagine you’re working with two tables. The one is the product table, which has the following data:

  • id
  • product_name
  • manufacturer_id

The second table is manufacturer with the following data:

  • id
  • manufacturer

There are 8 records in the first table and 4 in the second one.

How many rows will the following SQL code return:

SELECT *
FROM product, manufacturer

Answer: The query will return 32 rows. Whenever the WHERE clause is omitted, the default result is CROSS JOIN or a Cartesian product. This means the query will return every combination of rows from the first table with every combination of rows from the second table.

The SQL Interview Questions Asked by the FAANG Companies

FAANG is an acronym for the five most famous tech companies: Facebook, Amazon, Apple, Netflix, and Google.

Why would you specially prepare for the questions asked by those FAANG companies? Except being in awe of the possibility of working for them? They might seem or even be attractive, but that’s not the main reason why you would pay special attention if you want to work at those companies.

The main reason is their SQL interview questions are a little bit different. As tech companies, their business heavily relies on data. And where is data, there is SQL which the FAANG companies often use. Their employees knowing SQL is especially important for them, so they always ask the SQL interview questions with a little twist. The twist being their questions are more practical and concerning a case study with real problems and data a certain company is facing in their everyday business. Despite that, those questions usually fall into the following categories:

  • SQL basics interview questions
  • SQL reporting interview questions
  • problem-solving SQL interview questions

Have a look at this example from Google:

“Find the email activity rank for each user. Email activity rank is defined by the total number of emails sent. The user with the highest number of emails sent will have a rank of 1, and so on. Output the user, total emails, and their activity rank. Order records by the total emails in descending order. Sort users with the same rank score in alphabetical order.
In your rankings, return a unique value (i.e., a unique percentile) even if multiple users have the same number of emails.”

Answer: To answer this question, you’ll need to use the google_gmail_emails table.

SELECT  from_user, 
        count(*) as total_emails, 
        row_number() OVER ( order by count(*) desc)
    FROM google_gmail_emails 
    GROUP BY from_user
    order by 3, 1

You see, this question is testing your aggregate functions and window functions knowledge, along with the GROUP BY and ORDER BY clauses. But they also do that on real-life problems you’ll probably have to work on if you get a job.

Here’s another example of such question, this time from Netflix:

“Find the nominee who has won the most Oscars.
Output the nominee's name alongside the result.
Order the result based on the number of wins in descending order.”

Answer: To answer this question, you’ll need to use the oscar_nominees table.

SELECT
    nominee,
    count(winner) AS n_times_won
FROM oscar_nominees
WHERE 
    winner = true
GROUP BY 
    nominee
ORDER BY
    n_times_won DESC

Again, this question tests some usual concepts. But not on some metaphysical examples that have nothing to do with what Netflix does. If you work at Netflix on an SQL job, you’ll for sure analyze some data that contains some Oscar nominations and winners.

Overview of SQL Concepts you Should Know in Theory and Practice

You’ve probably noticed that the theoretical SQL interview questions overlap with other SQL questions. It’s because one doesn’t work without the other. There’s no point in knowing the theory without being able to put it into practice, i.e., the SQL code. There’s similarly no point in being able to write a code without understanding and being able to describe what you’re doing. Actually, I think this is virtually impossible. While the SQL concepts you should know depend on your position, years of experience, and the company you want to work at, some concepts are unavoidable. Here’s what they are, with the short descriptions intended for a quick review.

SQL & Database Generalities

SQL definition

SQL stands for “Structured Query Language”. It’s a programming language used for creating database structure, retrieving and manipulating data in it.

Types of the SQL commands

  • Data Definition Language (DDL)
    • CREATE
    • ALTER
    • DROP
    • RENAME
    • TRUNCATE
    • COMMENT
  • Data Query Language (DQL)
    • SELECT
  • Data Manipulation Language (DML)
    • INSERT
    • UPDATE
    • DELETE
    • MERGE
    • CALL
    • EXPLAIN PLAN
    • LOCK TABLE
  • Data Control Language (DCL)
    • GRANT
    • REVOKE

Relational database

A relational database is one based on the relational data model. This means the database is a collection of relations. Those relations are shown as tables, which consist of columns, rows, and values. The relational database aims to minimize or completely avoid data redundancy, leading to data integrity and speeding up its retrieval.

Relationships in the database

The relationship defines the type of connection between the tables in the database. There are three main types of relationships:

  • one-to-one relationship (1:1)
  • one-to-many relationship (1:N) or many-to-one relationship (N:1)
  • many-to-many relationship (M:N)

Database normalization

Database normalization is a process of organizing data in the database to achieve its purpose: data integrity, its non-redundancy, and speed of retrieval.

Constraints

The constraints are the rules that define what type of data can and can’t be entered as a value in the database. The most common attributes are:

  • NOT NULL
  • CHECK
  • UNIQUE
  • PRIMARY KEY
  • FOREIGN KEY

Indexes

The indexes are structures in the databases created to tune the database performance. They are used to speed up data retrieval. The types of indexes are:

  • clustered index
  • non-clustered index
  • unique index
  • filtered index
  • columnstore index
  • hash index

View

A view is a virtual table containing data from one or more tables resulting from a SQL statement.

Stored procedure

A stored procedure is an SQL code consisting of one or several SQL statements that are saved and can be called and executed whenever required.

Trigger

A trigger is a special type of a stored procedure. It is automatically executed (triggered) whenever some special event occurs in the database.

Joining Tables & Queries

Inner join

It’s a type of join that returns all the rows where the data from one table matches the data from the second table.

Left outer join

The left join is a table join that will retrieve all the rows from the left table and only the matching rows from the right table.

Right outer join

This join is the one that returns all the rows from the right table and only the matching rows from the left table.

Full outer join

The full outer join will join the data so that the result will include all the rows from one table and all the rows from the second table.

Cross join

It’s a type of join that results in a Cartesian product. This means it will return all the combinations of rows from one table with all the combinations of rows from the other table.

Union

This is an SQL command that will combine the result of one query with the result of another query. Therefore, it will show only unique records.

Union all

This one also combines the results from two or more queries. The difference is it’ll also include duplicates.

Aggregating and Grouping Data

Aggregate functions

The aggregate functions perform a calculation on a data set and return a single value as a result. Examples of the aggregate functions are:

  • COUNT()
  • SUM()
  • MIN()
  • MAX()
  • AVG()
  • STDEV()
  • VAR()

GROUP BY clause

The GROUP BY clause allows you to group data according to the defined (one or more) criteria.

Filtering & Ordering Data

DISTINCT clause

The DISTINCT clause is a clause that will return only distinct or unique values, i.e., there will be no duplicate values in the result.

WHERE clause

The WHERE clause is used to filter data according to the specified criteria.

HAVING clause

The HAVING clause also filters data according to the specified criteria. The difference compared to the WHERE clause is that the HAVING clause works with the aggregate functions. Therefore, if used, it always follows the GROUP BY clause and precedes the ORDER BY clause.

ORDER BY clause

The ORDER BY clause is used to order the query result according to a certain data column.

CASE statement

The CASE statement returns a defined value based on certain criteria. It is the SQL statement that allows you to apply the IF-THEN logic. Instead of IF, you use WHEN. And for THEN, you use THEN.

Subqueries, Common Table Expressions (CTEs) & Window Functions

Subquery

A subquery is a query found within the query. It can occur in a SELECT clause, FROM clause, or WHERE clause.

CTE

A CTE or a Common Table Expression is a temporary result set returned by a query and used by another query. In that way, it’s similar to subquery. But the main difference is CTE can be named and can reference itself.

Window functions

The window functions are SQL functions performing calculations over the defined set of rows (a window). Compared to the aggregate functions, which return a single value as a result, the window functions allow you to add the aggregated value to each row in a separate column. This means the row are not grouped and all the rows are kept as a query result. The window functions are:

  • row_number()
  • rank()
  • dense_rank()
  • percent_rank()
  • cume_dist()
  • lead()
  • lag()
  • ntile()
  • first_value()
  • last_value()
  • nth_value()
  • avg()
  • count()
  • min()
  • max()
  • sum()

Some Additional SQL Interview Tips

Here are some additional tips that might help you to be a success at the upcoming SQL interview.

Get to Know the Your Potential Employer

This is important in general, not only for the SQL part of the interview. It’s important to be informed about your future employer, their products, and their industry. It is especially important when the SQL questions are regarded. Why is that? As I mentioned, the FAANG companies will usually ask you very practical SQL coding questions that will have you use the same data and solve the same problems as you would have to when you get employed. The FAANG companies are not the only ones who do that. So when you prepare for the interview, try to think which data is important to this company, how their database could look like, etc. When you practice the SQL questions, try to find the real questions from the companies you’re interested in or at least from their competitors. If the companies are in the same industry, it’s quite likely the data they use will be more or less the same.

Be Prepared for a Whiteboard

It’s quite usual to be asked to write SQL code on a whiteboard. It can be shocking to some people, which is understandable. You’re probably used to writing a code in a real RDBMS, on real data, which allows you to regularly check if the query works. Not even the greatest masters of SQL can write a code without running it to see if it works at all or if it returns the desired result. However, in the SQL interview, the criteria are a little higher. While it can be scary, it’s also understandable. Writing the code on a whiteboard shows that you know how to write your code. Reading your (or someone else’s code) is also important. This is the skill that is also tested on a whiteboard. If you can read a code and say if it’ll give you the desired result without relying on the database to tell you that, then working with a real database and SQL environment will be easier for you.

Write a Clean Code

I’m not talking about your handwriting. There’s not much you can do if your handwriting is messy. But that doesn’t mean your code has to be unreadable. When you write a code, try to format it so that it’s easier for you and the interviewers to read it and check your solution.

Regularly use spacing and line breaks to make your code easier to read. If you need to (re)name tables and columns, be consistent with the naming convention you choose. Add comments whenever needed. Try to use aliases whenever possible, but try to make them sound logical and not some random decision when you do.

Here are also some useful tips on how to organize lengthy SQL codes.

Write in a Company’s SQL Dialect Only if You’re Comfortable With it

If you have experience with multiple SQL databases (Oracle, PostgreSQL, Microsoft SQL Server, MySQL), try to adapt and write in a dialect of the database that is used at your future employer. That would be nice and could show your versatility, especially if you know what dialect they prefer. However, if you’re familiar only with one dialect, don’t think that’s the end of the interview. For example, if you were using only PostgreSQL and the company is using Microsoft SQL Server, there may be different keywords for the same command in those two databases. It’s also possible that PostgreSQL has some functions that aren’t allowed in the Microsoft SQL Server and vice versa. Ask the interviewer if it’s possible to write a code in, say, PostgreSQL instead of Microsoft SQL Server since you’re more familiar with it. Yes, it’s always better if you know several dialects. But it’s also better if you write in a familiar dialect, even though “the wrong one”, than mess up the code just because you were too afraid to ask if you can write in a dialect you’re comfortable with. The differences between the dialects are not that huge. So if you know SQL, you’ll easily and quickly adapt to a new database.

Optimize Your Code

Try to write a code that will be executed as efficiently as possible on a real database. It’s not only important to write a code that will return the desired result – one day. It’s important that it’s done as quickly as possible and without locking out the database for other users. So even before you write a code, try to think about making it as efficient as possible. Even if you’re already presenting your code as a solution, and at a go, you see how it can be improved, ask the interviewer if you can re-write in a more optimized way. Or at least mention how the code could be optimized while you’re explaining it. It’s very important because you can be rejected even if you wrote the correct code if it’s terribly inefficient.

Practice SQL. A lot!

If you don’t have much real-life experience with SQL, it’s very important that you practice writing SQL codes. Do it a lot, and regularly. Continuity is very important. Try to answer as many as possible SQL questions, be it hypothetical or, even better, the real ones from the company you want to work at. Only by writing a lot of code, you’ll gain experience, grasp some typical problems that need to be solved by SQL, and the syntax will become like second nature.

Even if you are vastly experienced in SQL and used it in a business environment, it’s always good to prepare for an interview and brush up your skills. Nobody knows everything about SQL. Generally, people know what they need, what they regularly use every day. So it’s possible that after several years at your job, you became a master of a certain aspect of SQL. Don’t let it make you think there isn’t anything you don’t know. It could be that your new job will be asking of you to know some different SQL functions and possibilities, the ones you’re not exactly versed at. That’s why you must make sure you prepare and practice SQL before the interview.

How to Practice SQL and be Prepared for the Interview?

The SQL interview questions on StrataScratch can be found under the Coding Questions section. They can be filtered by the difficulty (Easy, Medium, Hard) and by the companies. You can also search questions by the categories such as Top Interview Questions, FAANG Companies, Latest Active Interview Questions, The Newest Questions. Or you can choose questions according to the topics, such as Window Functions, Manipulating Text, Manipulating Datetime.

Here are some suggested steps in using StrataScratch questions and its interactive SQL editor:

  1. Carefully read the question and write a code without running the SQL editor. Then, go through your code several times and decide for yourself if the code is correct if it’s optimized. That way, you’ll simulate writing a code on the whiteboard.
  2. Run a code to check if it’s correct.
  3. If the code is not correct, try to debug it by yourself. Maybe it’s just a typo.
  4. If you can’t debug it or, after thinking hard, you still don’t know how to approach the solution, check out the Approach Hints. There are some useful hints that could give you an idea of how to write a solution. You can also use the Expected Output tab to get a better idea of what is expected from you.
  5. Change your code and run it to check if it now returns a correct result.
  6. If it does or if it doesn’t after really trying to solve it, check a Solution Discussion. There you’ll find the official solution and the user’s and StrataScratch Team’s discussion about it. That way, you can learn why a certain approach is used and find out if there are even several different solutions that will give you the same result.
  7. Some questions also have a video that explains the solution.
  8. You can also use the Solutions from Users tab to see other users’ solutions and learn from them.

Listen to some of the tips on how to prepare for the interview: