SQL is a prerequisite skill for all those who are aiming for positions in the Data analytics domain. And like any other core topic, the SQL knowledge of a candidate is tested via written examinations, coding tests as well as interviews. Tackling them requires mastery over the subject, but SQL interview questions pose a different kind of threat than conventional SQL exercises or problems.
Interviews, in general, are more stringent than conventional exams because of the nature of the process. Interviews are one on one interactions between the candidate and the examiner, and hence are harder to crack. More often than not, the person may be facing an interviewing board where you never know what's coming next. The level of stress in these situations is much higher and harder to handle. This is why many fail in cracking interviews.
SQL Interview Questions
SQL interviews come under the technical interview category and hence are purely based on the job and the expectations. Markers like personality and spirit take a back seat, and the interviewers judge the technical ability of the candidate first and foremost. Therefore, SQL interview questions are designed to test your knowledge on the subject, and also your ability to correlate the principles with field work, checking your practical application skills.
Types Of Questions
Since it is impossible to find what the interviewers might ask, it is easier to try and figure out what they look for in an employee. The apparent answers are knowledge of the subject and programming skills. So, SQL interview questions are set to measure those markers.
The “Theory” Questions
These are questions that can be considered as bookish knowledge, as these questions are generally about theory, definitions, classifications, and so on. Knowing this might not have a lot of practical value, but it shows that you have sufficiently broad knowledge about the subject as a whole.
Questions such as definitions of standard terms, terminology, the concepts can be considered as this type.
The “Problem” Questions
These are a little more complicated, as they are not as straightforward as the theory questions. These questions test the true mettle of a programmer and are based on realistic situations. Answering these types of SQL interview questions correctly is crucial, as these will set you apart from the rest. They not only reveal your knowledge, but also showcase your awareness about using it in a professional capacity.
Preparing for SQL Interview questions
SQL is just like any other subject, so preparing for SQL interview questions has the same necessary steps as any other subject. One must have a firm grasp of the basics of the subject and the expected questions. Awareness about the latest developments and current trends are also necessary to nail the interview. Each question is designed to test you in different aspects. Therefore every answer counts.
Here are some of the common questions asked in SQL interviews.
More questions can be found in online resources.
SQL interviews can seem daunting at first, and prove to be impossible to conquer if unprepared. The subject has grown to be so important that it is now deemed as a must have for all Data analysts and scientists. Therefore, mastering it is of paramount importance for all aspirants. SQL interviews can easily be nailed with ample preparation and confidence.
All the best!
To take your data analytics career to the next level, visit https://www.stratascratch.com/.
Structured Query Language or SQL is the backbone of data analytics and data science. SQL assumes data in the form of tables similar to spreadsheets. The language is based on relational algebra, which allows it to sort, filter, and recall data. The language has also undergone many modifications to add new functions and capabilities. This has caused SQL to evolve and branch into many different versions, each of them distinct but all based on the same relational mathematics that forms the skeleton of SQL.
SQL in Data Analytics
SQL has assumed a position of industry standard within the field of Data Analytics. This is because the data structure considered within the program is a spreadsheet format, which has the most amount of applications in businesses. This is also because of the salient features of SQL, which make it easier to use for all.
This has also resulted in SQL being an unavoidable skill for data analysts and developers. Basic SQL knowledge is now tested during the recruitment process of these jobs. SQL and Python problem solving, SQL Interview questions and SQL Exercises are given to aspirants in this sector to measure their proficiency in the language.
SQL Proficiency Testing
As SQL has become an essential subject in the industry, the SQL proficiency of possible recruits is also being put to the test. All companies concerned with data science hire people who are well versed in SQL, among other things.
As it is with coding, tests are conducted by companies to measure the proficiency of possible recruits, and only those who do well are selected for the jobs. SQL exercises, interview questions and SQL problem-solving are the main methods used by these companies to choose the cream of the crop.
Practice results in perfection is a tried and tested principle. Since their early schooling of elementary mathematics, students have been encouraged to practice using the theoretical knowledge that they gain, not only to excel in the subject but also to improve problem-solving and analytical skills.
SQL exercises fulfil the same purpose. The activities are the same as any other exercises for any other subjects. They are available on various online platforms and come in a variety of difficulty levels. There are many benefits to practising SQL Exercises online and solving them.
SQL exercises are designed with two main goals in mind.
Correlating With Theory
SQL exercise questions and answers have to be related to all the main theory parts of the programming language. A good SQL exercise set will have a variety of problems touching upon every part of the language and in various difficulty level. Practising with them as you study helps the student to master the practical applications of the language and hence stay up to date.
Practical Application Of SQL Programming
SQL tutorials will also have problems that mimic the real-life usage of SQL in various sectors. As the course in itself aims at enabling the learner to master SQL at a professional capacity, these problems are necessary. SQL problems and answers will be similar to what a Data scientist will have to encounter in the line of their work.
SQL Exercises Online: Difficulty Levels
As it is with all exercises, this subject also comes with a variety of questions in different tiers of difficulty. Depending on the knowledge and skill of the student, they can choose from varying levels of difficulty of the items, that check the test taker's knowledge on any one specific topic or in the broad sense. This range of difficulty allows the user to not only measure their progress but also work to attain the next tier of their skills.
SQL Exercises Online: The Benefits
It makes sense for SQL problem sets to be available online, as SQL is first and foremost a programming language. Even if a person learns all the theory in the world about SQL, it is still useless without solving SQL problem sets. With that being said, these exercises help the student to improve their skill set in a way that can't be satisfied with conventional schooling.
The obvious reason for this is that SQL is a computing language. The fundamental process is coding, the raw material is digitally structured data, and the application lies in Data Analytics. Therefore, it is only logical for SQL exercises to be done on a digital platform.
Another reason why one should opt for SQL exercises online is due to the nature of the subject. As an essential tool in the digital world, SQL is fluid and ever-evolving. The language itself is changing, and the quality of tasks are becoming more complex. The volume of data handled is increasing, and the queries are becoming more and more complex. The ecosystem is in a constant state of flux, and any professional worth their salt has to keep up with these changes. Therefore, using online resources that provide SQL problem sets make more sense.
SQL is growing in significance and is now an irreplaceable tool for every aspirant in the field. The scope of this subject keeps expanding, and one must be able to keep up with it to reach the top. SQL exercises are available online to enable the technicians to be better at the trade. Practice makes perfect.
To take your data analytics career to the next level, visit https://www.stratascratch.com/.
This blog post follows our “SQL Interview Questions From Real Companies” video which can be found at https://www.youtube.com/watch?v=n6gM265zG68.
In this post we’ll go through 4 SQL questions you’re bound to encounter during a technical interview. While these problems are on the easy side, it’s still important that you bring along the interviewer. You want to show your interviewer your thought process. It’s okay if you don’t have enough time to solve the problem. Interviewers care more about how you solve problems in general than whether you can solve this specific problem. So during an interview, remember to take your time and describe each step to your interviewer.
We’ll use a three-step approach to problem-solving that you can use during your technical interviews. First, remember to build up a query step by step and explain each step to the interviewer. Your interviewer wants to see that you know what you’re doing and why you’re doing it. Second, you should be looking for edge cases throughout an interview. By asking your interviewer questions about edge cases you’ll show the interview your attention to detail. Finally, you should be able to explain to the interviewer the effect of every clause and expression.
We’ll be using Strata Scratch for our SQL exercises. Strata Scratch is a platform that helps you prepare for technical interviews. Every problem in this post is available to you on Strata Scratch.
Question 1: Find the drafts that contain the word optimism.
For our first interview question, we’re given a table called google_file_store and asked to find all the draft files containing the word optimism.
We’ll start every problem by looking at the table. I can pull keywords out of the question and use them to understand the table. Looking for ‘draft’, I see a few file names start with ‘draft’. All draft files must follow the format of the word ‘draft’ followed by a number. I also see that some of the contents contain the word ‘optimism’. Now is a good chance to ask the interviewer some questions. It’s always important that you ask the interviewer questions because helps you solve the problem and it helps the interviewer understand your thought process.
A good question would be, where in the content is the word ‘optimism’ located? Is it in the beginning, middle, or end of the string? It might seem obvious from looking at the table that the position of optimism doesn’t matter for this question, but asking questions can still benefit you. Currently, we’re making assumptions about the problem, and by asking questions we guarantee we’re solving the problem correctly and showing the interviewer our attention to detail. For this problem, the position and the case of ‘optimism’ do not matter.
Let’s start writing our query. Every interview problem starts with writing out the basic query. For these problems, we’re using the SELECT * statement. The SELECT * statement is used whenever you want to return all columns of information from a table. The next clause in the basic query is the FROM clause. This clause is used to choose which table we’re getting information from. We’ll add ‘FROM sql_interviews.google_file_store’. As expected, when you run this query, it returns all the information from the table.
To solve this problem, we need a way to filter the results. We do that using a WHERE clause. Understanding the WHERE clause is critical when going into a technical interview. Almost every question will involve understanding the context of the problem, and describing that context as a WHERE clause. The WHERE clause works by taking an expression, which is something that returns true or false. Each row is evaluated using the expression; if it’s true the row is returned, and it’s false it gets filtered out. We’ll need to write an expression that matches the two conditions of our problem so that we can filter out what we need. Each file has to be a draft and the contents must contain the word optimism.
Let’s deal with the first condition first. We need an expression that can do simple pattern matching. In this case, the ILIKE expression is perfect. This expression takes the name of a column and a pattern string, and only returns rows which match the pattern. Pattern strings have two special characters. The % character represents 0 or more of any character, and the _ character which represents exactly 1 of any character. Using those 2 characters we can use pattern matching to match many strings. We need to write a pattern that can match any string starting with ‘draft’. The pattern is ‘draft’ followed by 0 or more of any character so we’ll use ‘draft%’. This pattern string will match any string starting with 'draft'.
Now that we have a pattern string we can write our expression. After our WHERE clause, we’ll as a tab, and write “filename ILIKE ‘draft%’”. Now when we run the query it only returns drafts.
Now we can deal with the second condition. The second condition is that the contents contain the word ‘optimism’. We’ll add an AND expression. AND expressions allow us to have two conditions. They will check the expression before and after, and only return true if both expressions are true. Now we can add an expression for the second condition. This condition requires more pattern matching so we can use ILIKE again. The only difference is that optimism is located in the middle of the contents so we need a different pattern string. There can be zero or more characters before and after optimism so we use ‘%optimism%’ as our pattern string. We write ‘contents ILIKE ‘%optimism%’ after the AND expression.
Now that we’re filtering based on all conditions, we’ve solved the problem. I can run the query and I get the expected results.
Side note on the ILIKE expression. ILIKE has a sister expression called the LIKE expression. Both expressions work the same with one exception, the LIKE expression is case sensitive and the ILIKE expression is not. For this problem, the case of ‘draft’ and ‘optimism’ do not matter so we used ILIKE.
Question 2: Print all workers who are also managers
For our second question, we’re given two tables, worker and title, and asked to write a query that lists all of the managers.
As always, we’ll look at the table first. You’ll see that the worker table has all the information about each worker, but doesn’t list their job title. The title table lists the job title of each worker, but it only has a reference to the worker. For this problem, we need information from both tables so we will need a JOIN.
We’ll write our basic query first. SELECT * FROM sql_interviews.worker table, because we don’t know which columns we want to return yet.
Now I need to combine this table with the title table. To combine tables we use the JOIN clause. The JOIN clause is another clause that you need to know. Your interviewer will want to see that you have a solid understanding of how JOIN clauses work. JOIN clauses work by creating a table containing every possible pair from both tables and filtering that table with an ON clause. An ON clause is like a WHERE clause for JOINs. So to combine these tables we’ll add ‘JOIN sql_interviews.title’.
Now that we’re working with two tables we want to name each table with an AS clause so we can directly reference them. It’s common when writing queries to name a table after the first character in its original name. I’ll be naming the worker title ‘w’ and the title table ‘t’ so I can directly reference later in the query.
Now when we run the query we have a table that has the information about the worker and their title. We have to filter this table with an ON clause. If we could run this query without an ON clause, we would get a table that had every possible pair of worker and job title combined. Obviously, this isn’t what we want; we want every worker to be paired with their job title. We can get this table by adding our ON clause. ON clauses work the same as WHERE clauses so all we need is the correct expression. In this case, the worker_id of the worker should equal the worker_ref_id of the job title. By adding ‘ON w.worker_id = t.worker_ref_id’ we get the table we want. Now we have a table where each row has the information of a worker and their job title.
Finally, we need to filter the table such that it only contains the managers. We have two choices for filtering out the managers. We can expand our ON clause. The ON clause works the same as a WHERE clause so we can add an AND expression followed by our condition. Then we can add our second condition, “t.worker_title = ‘Manager’” and the resulting table will only have managers. That works, but we can also add a WHERE clause. The new table created by a JOIN clause works just like the original table. That means we can filter it with a WHERE clause. Just add the WHERE clause with our expression. I prefer adding a WHERE clause because it makes your ON clause simple and easy to understand.
To finish the problem I’ll choose which columns to SELECT. For this problem, I only want each manager’s first name and job title. Running this query returns all of the managers, and solves the problem.
Question 3: List employees with the same salary
For our third interview question, we’re given the worker table and we’re asked to write a query that lists all the workers with the same salaries.
To solve this problem we need to use a self-join.
As always, we first look at the table. Looking at the table we see that each row has all the information we need for the problem. It lists their worker id, name, and salary. We need to find some way to select all the pairs of workers who have the same salary.
I’ll start by writing a basic query. We’re going to SELECT * FROM sql_interviews.worker because I don’t know which columns I need. Now we’ll compare this table to itself.
Comparing a table to itself doesn’t require a new clause. If we look back to problem 2, this isn’t any different from comparing the worker table to the title table. We want every pair of workers that have the same salary. To get that we can JOIN this table with itself. To start the join we’ll add ‘JOIN sql_interviews.worker’ to our query. Now that we have two tables we need to name them. I choose to name my tables w1 for worker 1 and w2 for worker 2. Next, we’ll add our ON clause. The conditions of this problem are that the salary of the workers should be equal and we can describe that using an expression. I’ll add ‘ON w1.salary = w2.salary’ to the query. Running this query will give us a table containing every pair of workers with the same salary.
There is one issue. If we run this query, it returns more rows than expected. If you’re paying careful attention you’ll see the problem. Obviously, every worker shares the same salary as themselves. That means there is an additional row in the table for every time a worker is compared with themselves. We need to expand our ON clause so each pair must have different workers. I’ll add an AND expression followed by the new condition ‘w1.worker_id != w2.worker_id’. Now when we run this query we get a table that only contains pairs of different workers with the same salary.
Finally, we can finish our SELECT statement. For this problem, we’re interested in who has the same salary. We should return the columns for each worker’s name and the column for their salary. Running this query should return Amitah and Vivek, and Vivek and Amitah, which is correct. We’ve solved the problem.
Question 4: Find the first 5 entries of joined contacts and searches
For our fourth interview question, we’re given the tables airbnb_contacts and airbnb_searches and told to merge the tables on an appropriate key and display the first 5 results.
This problem can be more challenging because we aren’t told what to match on. It’ll require quickly forming an understanding of the tables and making a judgment based on that.
It should be noted that we’ve changed schemas from sql_interviews to datasets for this problem.
Let’s look at the tables. Immediately we see a problem, while these tables do have shared columns, the ds_checkin column, and ds_checkout column, those columns are not uniquely identifying. If a column isn’t uniquely identifying, then we can’t join on it alone. We need to understand the table better to solve this problem.
In a situation like this, it’s best to ask the interviewer questions so we don’t make incorrect assumptions. You’ll never be penalized for asking questions about the problem, so don’t worry. First, we can ask what each row represents in airbnb_contacts. The interviewer will say, each row in this table represents a contact between a guest and a host about a listing. It’s important to note that this table has the unique user id of the guest and the host. Then, we can ask what each row represents in airbnb_searches. They’ll say, each row represents a search performed by a user trying to find a listing to stay in. Given that information, we can start to form a solution. In this case, we know the users that perform the searches found in airbnb_searches are also the guests in airbnb_contacts. We can ask the interviewer if id_user column represents the same user as the id_guest column. They do, and that’s something we can JOIN on.
We’re going to write our basic query. Like before we’ll SELECT * FROM datasets.searches. Because we know we’re going to need two tables we should name our table now. I choose to name it ‘s’ for search.
This problem requires information from two tables like in previous. We’ll be using the JOIN clause again. This join is just like the JOINs we used in question two and three. Based on the questions we asked the interviewer, we know that id_user from the searches table represents the same user as id_guest in the contacts table. That’s what we’ll be joining on. We’ll JOIN with the datasets.airbnb_contacts table, naming it with an AS clause, and add ‘ON id_user = id_guest’. When we run this query it returns a table where each row represents a search that leads to contact. This is technically a solution, but we can do better.
To improve our query we need to ask our interviewer what questions they’re trying to answer with this data.
One way to improve our query is to choose the correct type of JOIN. There are two types of JOINs in SQL. We’ve been using an INNER JOIN which only returns matching pairs. We can also use an OUTER JOIN if we want information on rows that have no match. If the interviewer is only interested in matching searches where users contacted their host, then an INNER JOIN is appropriate for this query. If instead, the interviewer wanted to know the ratio of searches that resulted in contact, then an OUTER JOIN would be more appropriate. We would want to return all searches that don’t result in contact so we would use a LEFT JOIN in this case. That is the case so we’ll replace our JOIN clause with a LEFT JOIN clause.
Another way to improve our query is to make our ON clause more specific. The ds_checkin column and ds_checkout column are common across the two tables. It makes sense that adding them to the ON clause will create a more accurate representation of a user’s search intention. Some users will have multiple searches and contacts, and if we only want to return searches that lead to a contact, then each search should be for the same day as the contact. If we don’t check for this condition, then one contact in the table could result from multiple searches. We’ll improve the ON clause by adding the conditions that ds_checkin and ds_checkout are equal across the tables. Improving your ON clause in this way will show the interviewer that you have good table comprehension.
During an interview, it should be your goal to continuously improve your solutions. By asking the interviewer questions you can solve a problem exactly how they want. Even if your initial assumptions are correct and you don’t change your query, you’re still showing your interviewer that you’re through.
We only want 5 results for this problem, so we add a LIMIT clause. LIMIT clauses work by taking a number, and only returning that number of rows. For this problem, we want all the information so SELECT * works. Running this query gives us our valid solution.
Watch Our Youtube Video On These Four Questions
The last three of these problems show how important understanding JOIN clauses are. It can be challenging understanding what’s going on. If you want to practice writing SQL, I recommend joining Strata Scratch. You’ll have access to over 450 questions taken directly from real companies and you can use them to prepare yourself for an interview.
If you’ve made it this far, and you still haven’t seen the related video, then I highly recommend watching it now. The visuals will make understanding this material easier.
The twenty-first century is the age of information. The internet is now an essential part of human life, and some countries even see Right to Internet Access as inalienable. Knowledge is power, and information is the lifeblood of today’s world.
This level of connectivity has drastically changed the lifestyle of our generation. People are now more accustomed to using services online. Work, Shopping, Banking, and even social interactions are now ruled by the internet. In this day and age, an online presence is part and parcel of a healthy and interactive lifestyle.
The Growing Demand For Data Scientists
Data Scientists are in huge market today in all sectors. As computers have disrupted every primary industry in the world, experts on the subject are sought after in all areas. The same goes for Data Analytics, including but not restricted to Big Data.
All industries work based on and generate some amount of information regarding their products or their customers — for example, the healthcare industry. Terabytes of data related to innovations, medication, and patients are generated every year in this industry from research as well as day to day operations of establishments. The same can be extrapolated to any trades. Hence, Data Analysts and Scientists are needed in all fields in some capacity.
The Hottest Job Profile
It is no surprise that Data Scientist is a profession that is in demand in all fields. Every sector has its own share of digitization, and Data Analysts and Scientists are needed to look after their online presence and also make the most out of the virtual resources they have at their disposal. This is why Data Science has blown up as a hot topic in all sectors.
As far as the statistics go, all jobs that fit this profile demand basic computer programming skills as a prerequisite. It used to be that knowing SQL served as an added bonus point that put you above the competition. But the times have changed, and SQL has gone from an additional skill to a pre-requisite. It is widely accepted as the industry standard in domain-specific coding and is an unavoidable tool in the arsenal of every analyst.
SQL in Data Sciences
With Data science disrupting every industry, the role of a Data Scientist is no longer just restricted to Computer Science. There is more demand for analysts, and their work is more oriented towards a practical purpose than research or programming. Their role is to work with the data generated by their respective industry, and this is where SQL programming proves to be useful.
SQL belongs to a class of programming languages called Declarative Programming, a language that uses declarations for coding and commands. The writing itself is easy to learn and understand, and there are not many commands to learn. It is a language that is more practical than others and can be mastered by those from non-technical backgrounds as well.
“Structured” Query Language
Basic SQL proficiency is required for Data Analysts due to the nature of the job. Most businesses find it easy to perceive data in a spreadsheet or table format, which calls for a language that works based on that structure. SQL fits this description and hence is widely accepted as the industry standard.
The spreadsheet format in which SQL structures data is similar to MS Excel or other spreadsheet programs, that are popular in business as well as management circles for storing and analyzing data manually. This popularity is exploited in SQL.
Data Analytics In Other Industries
As explored earlier, the rising demand for Data Analysts in other industry is because of the disruption caused by the IT Industry in these sectors. Computers here are mainly used for storing and processing data that are harder to document manually. It also performs tasks like monitoring, tracking, performing simulations, designing, billing and so on. Almost all processes that are done using computers involve some form of Data generation, which has to be stored for later study.
In any case, most businesses end up generating and storing data in a tabular format. This is how SQL penetrated these industries, as it was created for this very purpose, of analyzing and handling such data.
SQL Proficiency: Staying A Step Ahead
SQL is a tool that has been relevant in the industry for decades and is not about to go obsolete anytime soon. Therefore, mastering it is in the best interest of all hopefuls that aspire to be a Data Scientist. SQL can be learned like any other subject, and the resources are all available online. SQL courses, SQL Problem sets, and SQL exercises are available online for studying and practice purposes.
SQL is like any other tool, and hence it serves best when it is at maximum sharpness. Therefore, regular practice is necessary to become proficient in the subject. Also, it is essential that every aspirant learns about the latest developments, and keeps expanding their knowledge on the subject.
To take your data analytics career to the next level, visit https://www.stratascratch.com/.