Assignment Solutions: Basic SQL 1

Basic SQL Assignment Solutions


Take a look at the solutions and explanations to the Assignment: Basic SQL 1.

Hope you had fun solving the Basic SQL 1 questions and that they were not too difficult. Now we’ll go through every concept tested in the assignment, show you the solutions, and explain them.

Filtering Data Using One Condition in WHERE

The WHERE clause in SQL is used for filtering data before the aggregation. The query will return all the records that satisfy the condition stated in the WHERE clause.

These are the operators you can use in WHERE:

OperatorMeaning
=Equals
<Less than
>Greater than
<=Less than or equal
>= Greater than or equal
<>Does not equal
BETWEEN Between the specified range
LIKELooking for a pattern
INEquals the values listed in the parentheses

Note that the WHERE clause in PostgreSQL is case-sensitive.

Question 1

Find all inspections which are part of an inactive program

“Find all inspections which are part of an inactive program.”

Link to the question: https://platform.stratascratch.com/coding/10277-find-all-inspections-which-are-part-of-an-inactive-program

Data

The data you have here is in the table los_angeles_restaurant_health_inspections.

serial_numbervarchar
activity_datedatetime
facility_namevarchar
scoreint
gradevarchar
service_codeint
service_descriptionvarchar
employee_idvarchar
facility_addressvarchar
facility_cityvarchar
facility_idvarchar
facility_statevarchar
facility_zipvarchar
owner_idvarchar
owner_namevarchar
pe_descriptionvarchar
program_element_peint
program_namevarchar
program_statusvarchar
record_idvarchar

Here’s part of the data form the table, so you are familiar with it.

Find all inspections which are part of an inactive program

Solution

To write the solution, you need to select all the columns from the table and filter the output by the column program_status.

SELECT *
FROM los_angeles_restaurant_health_inspections
WHERE program_status = 'INACTIVE';

Use the WHERE clause to show only data with the inactive program status. Since the WHERE clause is case-sensitive, the status name has to be written in all capitalized. That’s how it’s recorded in the source table.

You should get this output by running the code.

Find all inspections which are part of an inactive program

Finding the Unique Values and Filtering Data Using More Than One Condition in WHERE

The DISTINCT statement returns only the unique values from the table. In other words, the duplicate rows will appear only once.

When there’s a need to filter data on more than one criteria, the keyword that needs to be used is AND.

State the first criteria in the WHERE clause, then write AND to write the second criteria, and so on. You’ll get a chain of criteria separated by the AND keyword. Note that you write WHERE only once.

Also, apart from using the AND operator for filtering on two or more criteria, you can use OR and NOT operators.

Question 2

London Olympic Swimmers

“Find the athletes who competed in swimming events at the London Olympics.”

Link to the question: https://platform.stratascratch.com/coding/9934-london-olympic-swimmers

Data

The question gives you the table olympics_athletes_events to work with.

idint
namevarchar
sexvarchar
agefloat
heightfloat
weightdatetime
teamvarchar
nocvarchar
gamesvarchar
yearint
seasonvarchar
cityvarchar
sportvarchar
eventvarchar
medalvarchar

This is a preview of data in the table.

London Olympic Swimmers

Solution

The athletes can compete more than once in the Olympics, so it’s wise to assume the athletes are showing more than once in the table. Use DISTINCT to show the athletes only once.

The output has to be filtered to show only swimming events, and the city should be London.

SELECT DISTINCT name
FROM olympics_athletes_events
WHERE sport = 'Swimming'
  AND city = 'London';

Run the code to get the output.

London Olympic Swimmers

Filtering Out the NULL Values and Sorting Data

Filtering Out the NULL Values and Sorting Data

The output will not show the NULL values when the column specified in the WHERE clause is followed by ‘IS NOT NULL’.

If you want to show only the NULL values, then you have to write ‘IS NULL’.

The code output is sorted using the ORDER BY clause, where you explicitly state the column to be used for ordering data. After the column name, use the ASC keyword to sort data in ascending order. For descending order, use the keyword DESC.

It’s also possible to sort data by more than one column. To do that, simply separate the columns in the ORDER BY clause with a comma.

Question 3

Find all searches for San Francisco with a flexible cancellation policy and a review score rating

“Find all searches for San Francisco with a flexible cancellation policy and a review score rating. Sort the results by the review score in the descending order.”

Link to the question: https://platform.stratascratch.com/coding/9621-find-all-searches-for-san-francisco-with-a-flexible-cancellation-policy-and-a-review-score-rating

Data

There’s again one table: airbnb_search_details.

idint
pricefloat
property_typevarchar
room_typevarchar
amenitiesvarchar
accommodatesint
bathroomsint
bed_typevarchar
cancellation_policyvarchar
cleaning_feebool
cityvarchar
host_identity_verifiedvarchar
host_response_ratevarchar
host_sincedatetime
neighbourhoodvarchar
number_of_reviewsint
review_scores_ratingfloat
zipcodeint
bedroomsint
bedsint

This is what data really looks like.

Finding all searches for San Francisco with a flexible cancellation policy

Solution

You need to show all the columns in the SELECT statement.

There will be three filtering criteria in the WHERE clause: the city is San Francisco, the cancellation policy is flexible, and there has to be a review score rating.

When looking for data that is not NULL. The command that will return non-NULL values is ‘IS NOT NULL’.

Finally, the output has to be in descending order according to the review scores rating.

SELECT *
FROM airbnb_search_details
WHERE city='SF'
  AND cancellation_policy='flexible'
  AND review_scores_rating IS NOT NULL
ORDER BY review_scores_rating DESC;

The result is only one row.

Finding all searches for San Francisco with a flexible cancellation policy

Using LIMIT to Filter Data

One way of filtering data in the output is using the LIMIT keyword. Unlike the WHERE clause, LIMIT doesn’t filter data based on its value.

The purpose of LIMIT is to limit the output to a certain number of rows.

Question 4

Hour Of Highest Gas Expense

“Find the hour with the highest gasoline cost. Assume there's only 1 hour with the highest gas cost.”

Link to the question: https://platform.stratascratch.com/coding/10005-hour-of-highest-gas-expense

Data

The table is lyft_rides with five columns.

indexint
weathervarchar
hourint
travel_distancefloat
gasoline_costfloat

This is a sample of data from the table.

Hour Of Highest Gas Expense

Solution

The query will select only the column hour from the table. If data is ordered by the gasoline cost in descending order, the output will show hours sorted from the highest to the lowest gas price.

To show only the highest price, you need to show one row and there it is.

SELECT hour
FROM lyft_rides
ORDER BY gasoline_cost DESC
LIMIT 1;

The answer to the question is the 10th hour.

Hour Of Highest Gas Expense

LIKE & ILIKE in the WHERE Clause

LIKE is a standard SQL operator used in the WHERE clause to find the patterns in values. What does that mean? All the previous operators used in WHERE were looking for the exact values. When you use LIKE, you’re looking for something that is, well, like something. For example, when looking for only a specific word or a particular letter appearing in the whole text.

The ILIKE operator is PostgreSQL-specific. It works the same way as LIKE; only it’s not case-sensitive.

The wildcard characters are also used with those operators since they help look for a part of the value. Most often, it’s '%' wildcard characters used in the following way.

ExampleDescription
'%word'Finds values starting with 'word'
'word%'Finds values ending with 'word'
'%word%'Finds values containing 'word' in any position

Question 5

Find drafts which contains the word 'optimism'

“Find drafts which contain the word 'optimism'.”

Link to the question: https://platform.stratascratch.com/coding/9805-find-drafts-which-contains-the-word-optimism

Data

The data is stored in the table google_file_store.

filenamevarchar
contentsvarchar

It’s a simple table with only three rows.

Find drafts which contains the word 'optimism'

Solution

Select all the columns from the table. The file name has to contain the word 'draft'. We know if the file is a draft, it will begin with the word 'draft'. Therefore, we need to put the wildcard character only at the end.

Also, the contents have to contain the word 'optimism'. It can be in the beginning, at the end, or in the middle of the content. Because of that, the wildcard is needed on both sides of the filtering criteria.

SELECT *
FROM google_file_store
WHERE filename ILIKE 'draft%'
  AND contents ILIKE '%optimism%';

The output will be only one draft.

Find drafts which contains the word 'optimism'

Using WHERE With IN

Using WHERE With IN

The IN operator is shorthand for multiple OR operators. The list of criteria is listed in the parentheses, and the where clause looks if the value matches one of those in the list.

Question 6

Find songs that are ranked between 8-10

“Find songs that are ranked between 8-10.

Output the track name along with the corresponding position ordered ascendingly.”

Link to the question: https://platform.stratascratch.com/coding/9999-find-songs-that-are-ranked-between-8-10

Data

The table you’ll work with is spotify_worldwide_daily_song_ranking.

idint
positionint
tracknamevarchar
artistvarchar
streamsint
urlvarchar
datedatetime
regionvarchar

Here’s the data from the table.

Find songs that are ranked between 8-10

Solution

First, select the track name and the position. We’re looking for songs in the 8th, 9th, or 10th position. These criteria are written in the parentheses after the IN operator.

SELECT trackname,
       position
FROM spotify_worldwide_daily_song_ranking
WHERE POSITION IN (8,
                   9,
                   10)
ORDER BY position;

Here are some of the songs that comprise the output.

Find songs that are ranked between 8-10

Aggregate Functions & GROUP BY

The aggregate functions are the types of functions in SQL that perform calculations on a set of rows and return only one value.

The most commonly used aggregate functions are

  • COUNT()
  • SUM()
  • AVG()
  • MIN()
  • MAX()

The aggregate functions are commonly used with the GROUP BY clause. Using it is necessary whenever there’s any other column in the SELECT statement apart from the aggregate function. In other words, all these columns (except the aggregate function) have to be listed in the GROUP BY clause. Otherwise, it will throw an error.

Note that the columns that appear in the GROUP BY don’t have to appear in the SELECT statement. This means you can group by the columns, but it doesn’t mean you must show these columns in the output.

Question 7

Finding Updated Records

“We have a table with employees and their salaries, however, some of the records are old and contain outdated salary information. Find the current salary of each employee assuming that salaries increase each year. Output their id, first name, last name, department ID, and current salary. Order your list by employee ID in ascending order.”

Link to the question: https://platform.stratascratch.com/coding/10299-finding-updated-records

Data

Here, you’ll work with the table ms_employee_salary.

idint
first_namevarchar
last_namevarchar
salaryint
department_idint

Here are the first few rows from the table.

Finding Updated Records

Solution

In the query, select all the columns except the salary. To find the highest salary, use the MAX() function. Data needs to be grouped by all the columns in the SELECT statement, except for the aggregate function.

Finally, data is ordered by the employee’s ID from lowest to highest.

SELECT id,
       first_name,
       last_name,
       department_id,
       max(salary) AS max_salary
FROM ms_employee_salary
GROUP BY id,
         first_name,
         last_name,
         department_id
ORDER BY id ASC;

The query will return the following data.

Finding Updated Records

Filtering Data Using HAVING

The HAVING clause has the same purpose as WHERE; it filters data. What you do in the where clause, you can do the same in the HAVING clause.

However, they are not the same. The HAVING clause is used for filtering data after the aggregation, whereas the WHERE clause filters data before the aggregation.This is also reflected in the code: WHERE is written before GROUP BY, while HAVING comes after GROUP BY.

Also, the aggregate functions are allowed in the HAVING clause. In WHERE they are not.

Question 8

3 Bed Minimum

“Find the average number of beds in each neighborhood that has at least 3 beds in total.

Output results along with the neighborhood name and sort the results based on the number of average beds in descending order.”

Link to the question: https://platform.stratascratch.com/coding/9627-3-bed-minimum

Data

This question, too, uses only one table: airbnb_search_details.

idint
pricefloat
property_typevarchar
room_typevarchar
amenitiesvarchar
accommodatesint
bathroomsint
bed_typevarchar
cancellation_policyvarchar
cleaning_feebool
cityvarchar
host_identity_verifiedvarchar
host_response_ratevarchar
host_sincedatetime
neighbourhoodvarchar
number_of_reviewsint
review_scores_ratingfloat
zipcodeint
bedroomsint
bedsint

Here’s the sample data.

Find the average number of beds in each neighborhood

Solution

In the solution, you should select the column neighborhood and then use the AVG() function to calculate the average number of beds by neighborhood.

The output has to show only the neighborhood with at least 3 beds, so this criteria has to appear in the HAVING clause. To get the number of beds, use the SUM() function.

Finally, order data from the highest to the lowest average number of beds.

SELECT neighbourhood,
       AVG(beds) AS n_beds_avg
FROM airbnb_search_details
GROUP BY neighbourhood
HAVING SUM(beds) >= 3
ORDER BY n_beds_avg DESC;

Here’s the solution output.

Find the average number of beds in each neighborhood







Basic SQL Assignment Solutions


Become a data expert. Subscribe to our newsletter.