Assignment Solutions: Basic SQL 1
Take a look at the solutions and explanations to the Assignment: Basic SQL 1.
Hope you had fun solving the Basic SQL 1 questions and that they were not too difficult. Now we’ll go through every concept tested in the assignment, show you the solutions, and explain them.
Filtering Data Using One Condition in WHERE
The WHERE clause in SQL is used for filtering data before the aggregation. The query will return all the records that satisfy the condition stated in the WHERE clause.
These are the operators you can use in WHERE:
|<=||Less than or equal|
|>=||Greater than or equal|
|<>||Does not equal|
|BETWEEN||Between the specified range|
|LIKE||Looking for a pattern|
|IN||Equals the values listed in the parentheses|
Note that the WHERE clause in PostgreSQL is case-sensitive.
Find all inspections which are part of an inactive program
“Find all inspections which are part of an inactive program.”
The data you have here is in the table los_angeles_restaurant_health_inspections.
Here’s part of the data form the table, so you are familiar with it.
To write the solution, you need to select all the columns from the table and filter the output by the column program_status.
SELECT * FROM los_angeles_restaurant_health_inspections WHERE program_status = 'INACTIVE';
Use the WHERE clause to show only data with the inactive program status. Since the WHERE clause is case-sensitive, the status name has to be written in all capitalized. That’s how it’s recorded in the source table.
You should get this output by running the code.
Finding the Unique Values and Filtering Data Using More Than One Condition in WHERE
The DISTINCT statement returns only the unique values from the table. In other words, the duplicate rows will appear only once.
When there’s a need to filter data on more than one criteria, the keyword that needs to be used is AND.
State the first criteria in the WHERE clause, then write AND to write the second criteria, and so on. You’ll get a chain of criteria separated by the AND keyword. Note that you write WHERE only once.
Also, apart from using the AND operator for filtering on two or more criteria, you can use OR and NOT operators.
London Olympic Swimmers
“Find the athletes who competed in swimming events at the London Olympics.”
Link to the question: https://platform.stratascratch.com/coding/9934-london-olympic-swimmers
The question gives you the table olympics_athletes_events to work with.
This is a preview of data in the table.
The athletes can compete more than once in the Olympics, so it’s wise to assume the athletes are showing more than once in the table. Use DISTINCT to show the athletes only once.
The output has to be filtered to show only swimming events, and the city should be London.
SELECT DISTINCT name FROM olympics_athletes_events WHERE sport = 'Swimming' AND city = 'London';
Run the code to get the output.
Filtering Out the NULL Values and Sorting Data
The output will not show the NULL values when the column specified in the WHERE clause is followed by ‘IS NOT NULL’.
If you want to show only the NULL values, then you have to write ‘IS NULL’.
The code output is sorted using the ORDER BY clause, where you explicitly state the column to be used for ordering data. After the column name, use the ASC keyword to sort data in ascending order. For descending order, use the keyword DESC.
It’s also possible to sort data by more than one column. To do that, simply separate the columns in the ORDER BY clause with a comma.
Find all searches for San Francisco with a flexible cancellation policy and a review score rating
“Find all searches for San Francisco with a flexible cancellation policy and a review score rating. Sort the results by the review score in the descending order.”
There’s again one table: airbnb_search_details.
This is what data really looks like.
You need to show all the columns in the SELECT statement.
There will be three filtering criteria in the WHERE clause: the city is San Francisco, the cancellation policy is flexible, and there has to be a review score rating.
When looking for data that is not NULL. The command that will return non-NULL values is ‘IS NOT NULL’.
Finally, the output has to be in descending order according to the review scores rating.
SELECT * FROM airbnb_search_details WHERE city='SF' AND cancellation_policy='flexible' AND review_scores_rating IS NOT NULL ORDER BY review_scores_rating DESC;
The result is only one row.
Using LIMIT to Filter Data
One way of filtering data in the output is using the LIMIT keyword. Unlike the WHERE clause, LIMIT doesn’t filter data based on its value.
The purpose of LIMIT is to limit the output to a certain number of rows.
Hour Of Highest Gas Expense
“Find the hour with the highest gasoline cost. Assume there's only 1 hour with the highest gas cost.”
Link to the question: https://platform.stratascratch.com/coding/10005-hour-of-highest-gas-expense
The table is lyft_rides with five columns.
This is a sample of data from the table.
The query will select only the column hour from the table. If data is ordered by the gasoline cost in descending order, the output will show hours sorted from the highest to the lowest gas price.
To show only the highest price, you need to show one row and there it is.
SELECT hour FROM lyft_rides ORDER BY gasoline_cost DESC LIMIT 1;
The answer to the question is the 10th hour.
LIKE & ILIKE in the WHERE Clause
LIKE is a standard SQL operator used in the WHERE clause to find the patterns in values. What does that mean? All the previous operators used in WHERE were looking for the exact values. When you use LIKE, you’re looking for something that is, well, like something. For example, when looking for only a specific word or a particular letter appearing in the whole text.
The ILIKE operator is PostgreSQL-specific. It works the same way as LIKE; only it’s not case-sensitive.
The wildcard characters are also used with those operators since they help look for a part of the value. Most often, it’s '%' wildcard characters used in the following way.
|'%word'||Finds values starting with 'word'|
|'word%'||Finds values ending with 'word'|
|'%word%'||Finds values containing 'word' in any position|
Find drafts which contains the word 'optimism'
“Find drafts which contain the word 'optimism'.”
The data is stored in the table google_file_store.
It’s a simple table with only three rows.
Select all the columns from the table. The file name has to contain the word 'draft'. We know if the file is a draft, it will begin with the word 'draft'. Therefore, we need to put the wildcard character only at the end.
Also, the contents have to contain the word 'optimism'. It can be in the beginning, at the end, or in the middle of the content. Because of that, the wildcard is needed on both sides of the filtering criteria.
SELECT * FROM google_file_store WHERE filename ILIKE 'draft%' AND contents ILIKE '%optimism%';
The output will be only one draft.
Using WHERE With IN
The IN operator is shorthand for multiple OR operators. The list of criteria is listed in the parentheses, and the where clause looks if the value matches one of those in the list.
Find songs that are ranked between 8-10
“Find songs that are ranked between 8-10.
Output the track name along with the corresponding position ordered ascendingly.”
Link to the question: https://platform.stratascratch.com/coding/9999-find-songs-that-are-ranked-between-8-10
The table you’ll work with is spotify_worldwide_daily_song_ranking.
Here’s the data from the table.
First, select the track name and the position. We’re looking for songs in the 8th, 9th, or 10th position. These criteria are written in the parentheses after the IN operator.
SELECT trackname, position FROM spotify_worldwide_daily_song_ranking WHERE POSITION IN (8, 9, 10) ORDER BY position;
Here are some of the songs that comprise the output.
Aggregate Functions & GROUP BY
The aggregate functions are the types of functions in SQL that perform calculations on a set of rows and return only one value.
The most commonly used aggregate functions are
The aggregate functions are commonly used with the GROUP BY clause. Using it is necessary whenever there’s any other column in the SELECT statement apart from the aggregate function. In other words, all these columns (except the aggregate function) have to be listed in the GROUP BY clause. Otherwise, it will throw an error.
Note that the columns that appear in the GROUP BY don’t have to appear in the SELECT statement. This means you can group by the columns, but it doesn’t mean you must show these columns in the output.
Finding Updated Records
“We have a table with employees and their salaries, however, some of the records are old and contain outdated salary information. Find the current salary of each employee assuming that salaries increase each year. Output their id, first name, last name, department ID, and current salary. Order your list by employee ID in ascending order.”
Link to the question: https://platform.stratascratch.com/coding/10299-finding-updated-records
Here, you’ll work with the table ms_employee_salary.
Here are the first few rows from the table.
In the query, select all the columns except the salary. To find the highest salary, use the MAX() function. Data needs to be grouped by all the columns in the SELECT statement, except for the aggregate function.
Finally, data is ordered by the employee’s ID from lowest to highest.
SELECT id, first_name, last_name, department_id, max(salary) AS max_salary FROM ms_employee_salary GROUP BY id, first_name, last_name, department_id ORDER BY id ASC;
The query will return the following data.
Filtering Data Using HAVING
The HAVING clause has the same purpose as WHERE; it filters data. What you do in the where clause, you can do the same in the HAVING clause.
However, they are not the same. The HAVING clause is used for filtering data after the aggregation, whereas the WHERE clause filters data before the aggregation.This is also reflected in the code: WHERE is written before GROUP BY, while HAVING comes after GROUP BY.
Also, the aggregate functions are allowed in the HAVING clause. In WHERE they are not.
3 Bed Minimum
“Find the average number of beds in each neighborhood that has at least 3 beds in total.
Output results along with the neighborhood name and sort the results based on the number of average beds in descending order.”
Link to the question: https://platform.stratascratch.com/coding/9627-3-bed-minimum
This question, too, uses only one table: airbnb_search_details.
Here’s the sample data.
In the solution, you should select the column neighborhood and then use the AVG() function to calculate the average number of beds by neighborhood.
The output has to show only the neighborhood with at least 3 beds, so this criteria has to appear in the HAVING clause. To get the number of beds, use the SUM() function.
Finally, order data from the highest to the lowest average number of beds.
SELECT neighbourhood, AVG(beds) AS n_beds_avg FROM airbnb_search_details GROUP BY neighbourhood HAVING SUM(beds) >= 3 ORDER BY n_beds_avg DESC;
Here’s the solution output.