Airbnb Data Scientist Interview Guide
Preparation guide for Data Scientist interviews at Airbnb
Airbnb has been one of the fastest-growing tech companies of the past decade. Their 2020 initial public offering (IPO) was one of the most anticipated events of the year. Today, the company is a global technology giant valued at around 90 billion USD. But how much of that success can be attributed to data science, and how much emphasis does Airbnb put on this field?
In a 2018 interview, one of Airbnb's Vice Presidents said: “The importance of data as a philosophy was baked into the DNA of Airbnb from the beginning. Our seventh hire was actually a data scientist and we invested in building a modern data architecture and infrastructure early on.” Taking that into consideration, it is no wonder that landing a data scientist job there has become a prestigious achievement.
With that much success and growth over the previous years, and considering the importance of data science at this company, we have decided to do some research and find out how to get a job in the data science field at Airbnb. We have compiled a database of real-life interview questions being asked at Airbnb data scientist interviews and we will share our findings in this article. With this expanded knowledge of the interview process at Airbnb, we hope to guide you on your journey to secure a data science role at this company.
Description and Methodology of the Analysis
The goal of this article is to identify all types of questions being asked in an Airbnb data scientist interview. Furthermore, this article will analyze the questions being asked from a technical concept standpoint, in order to aid the reader in understanding the background of the specific questions and how to better prepare for them.
For the purpose of this article, we have gathered questions from various websites, job search boards and company review platforms such as Glassdoor, Indeed, Reddit and Blind App. A total of 25 real-life Airbnb interview questions have been gathered from sources mentioned above, for the period that covers the past 4 years. The most important gathered data points that we will use for this article are company name, question type(s) and description of the questions themselves.
The question type data in our research has been produced by sorting questions into pre-determined categories. These categories have been produced by an expert analysis of the interview experience descriptions taken from our sources. Although our classification system divides questions into 9 different categories, the interview questions gathered from Airbnb only cover 5 of them: coding, technical, business case, product and modeling. We will go into more detail explaining these categories further in this article.
Data Scientist Interviews at Airbnb
In this section, we will talk about our categorization method and how the questions were structured for the purpose of the analysis. We will also analyze all the questions and see which categories are dominant in the interview process. Furthermore, we will talk about all categories in more depth, with emphasis on their importance in data science interviews. For each category, we will go through real-world examples of questions asked in Airbnb data science interviews, in order to help you prepare for questions of the same or similar structure.
Airbnb Question Type Breakdown
Let’s first analyze how the Airbnb interview questions gathered for this article are categorized.
As we can see, coding questions make up the majority (two thirds) of all questions asked in Airbnb data scientist interviews. For this reason, we will devote more attention to coding in this article, with the goal of helping you prepare for the questions that dominate the data scientist interview process at Airbnb. Another prominent category is business case questions, which make up 17% of all questions. All other categories (product, modeling and technical) together make up the remaining 16%. Now, let’s dig into the categories themselves and analyze the question types through specific examples.
Using our categorization method, we have identified coding questions to be all types of questions which involve data manipulation and analysis through code. This manipulation through code is usually done (or asked to be done) using some of the most popular languages in data science: SQL, Python and R.
An example of a coding question specific to Airbnb would be:
- “Find the average accommodates-to-beds ratio for shared rooms in each city. Sort your results by listing cities with the highest ratios first.”
Let us also discuss a possible solution to this question in Postgres, in order to get you more familiar with how you can approach a similar problem in the interview:
- First, in order to calculate the accommodates-to-bed ratio, use the `accommodates` and `beds` columns in the table. The ratio is defined as accommodates/beds;
- Then, use the AVG() function in the pattern of AVG(accommodates/beds) to find the requested average ratio;
- After that, use the WHERE clause for the room type 'Shared room' to narrow down the required records;
- Use the GROUP BY clause to group the result by the city;
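- Use the ORDER BY clause in descending order so that cities with the highest ratios come first;
- Finally, make sure you cast `accommodates` and `beds` to a decimal or float data type before dividing. If both columns are integers, Postgres performs integer division and truncates the ratio, so casting only the final output would be too late.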
Here is the entire solution in Postgres, from our platform:
SELECT city,
       AVG(CAST(accommodates AS float) / CAST(beds AS float)) AS avg_crowdness_ratio
FROM airbnb_search_details
WHERE room_type = 'Shared room'
GROUP BY city
ORDER BY avg_crowdness_ratio DESC;
Below are a few more examples of real-life Airbnb coding questions that could be asked in an interview:
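To try this query locally, here is a minimal Python sketch that runs it against an in-memory SQLite database, which stands in for Postgres here. Only the table and column names come from the query above; the sample rows are invented for illustration. The second query drops the casts to show the integer-division pitfall.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE airbnb_search_details "
    "(city TEXT, room_type TEXT, accommodates INTEGER, beds INTEGER)"
)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO airbnb_search_details VALUES (?, ?, ?, ?)",
    [
        ("NYC", "Shared room", 3, 2),
        ("NYC", "Shared room", 5, 2),
        ("LA", "Shared room", 2, 2),
        ("LA", "Entire home/apt", 6, 3),  # filtered out by the WHERE clause
    ],
)

# Operands are cast before dividing, so the ratio keeps its fraction.
with_cast = conn.execute(
    """
    SELECT city,
           AVG(CAST(accommodates AS REAL) / CAST(beds AS REAL)) AS avg_crowdness_ratio
    FROM airbnb_search_details
    WHERE room_type = 'Shared room'
    GROUP BY city
    ORDER BY avg_crowdness_ratio DESC
    """
).fetchall()

# Without casts, integer / integer truncates before AVG ever sees it.
without_cast = conn.execute(
    """
    SELECT city, AVG(accommodates / beds) AS avg_crowdness_ratio
    FROM airbnb_search_details
    WHERE room_type = 'Shared room'
    GROUP BY city
    ORDER BY avg_crowdness_ratio DESC
    """
).fetchall()

print(with_cast)
print(without_cast)
```

On this toy data the two result sets differ for NYC, which is why the casts belong on the operands rather than on the final average.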
- “You're given a dataset of searches for properties on Airbnb. For simplicity, let's say that each search result (i.e., each row) represents a unique host. Find the city with the most amenities across all their host's properties. Output the name of the city.”
A possible solution can be found here.
- “To better understand the effect of the review count on the price, categorize the number of reviews into the following groups along with the price:
- 0 reviews: NO
- 1 to 5 reviews: FEW
- 6 to 15 reviews: SOME
- 16 to 40 reviews: MANY
- more than 40 reviews: A LOT”
A possible solution can be found here.
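As an illustration of how the review-count question might be approached, the following sketch buckets listings with a SQL CASE expression, run through Python's built-in `sqlite3` module. The table name `listings`, the column names `number_of_reviews` and `price`, and the sample rows are all assumptions made for this example, not the actual interview schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and made-up rows, for illustration only.
conn.execute("CREATE TABLE listings (number_of_reviews INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO listings VALUES (?, ?)",
    [(0, 100.0), (3, 80.0), (10, 120.0), (40, 95.0), (41, 150.0)],
)

# WHEN branches are evaluated top to bottom, so each one only needs
# the upper bound of its bucket.
rows = conn.execute(
    """
    SELECT CASE
             WHEN number_of_reviews = 0   THEN 'NO'
             WHEN number_of_reviews <= 5  THEN 'FEW'
             WHEN number_of_reviews <= 15 THEN 'SOME'
             WHEN number_of_reviews <= 40 THEN 'MANY'
             ELSE 'A LOT'
           END AS reviews_qualification,
           price
    FROM listings
    ORDER BY number_of_reviews
    """
).fetchall()

for label, price in rows:
    print(label, price)
```

One thing worth raising in an interview: a NULL review count would fall through to the ELSE branch here, so real data may need an explicit NULL check.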
These types of questions are asked in Airbnb data scientist interviews in order to test the candidate’s problem-solving skills, attention to detail and knowledge of programming languages that may be used in their day-to-day roles. As we could see from the graph above, the importance of knowing how to answer coding questions can hardly be overstated, since they make up the majority of all questions being asked.
In our analysis, we have identified business case questions as all questions involving case studies, as well as all questions that require applying data science skills to business operations. An example of a business case question specific to Airbnb would be:
- “How would you measure the effectiveness of our operations team?”
These types of questions are particularly important on Airbnb data science interviews; as we can see above, they are the second most common question category in our analysis.
Due to the nature of this question type, we could not identify a single technical concept that is most prevalent in this category, since business cases can be approached in numerous ways and they depend on the creativity of the interviewee. However, we have identified a common pattern among our business case questions where Airbnb gives take-home business case tasks as part of the interview process. According to our research, these business cases are somewhat complex and they usually take 2-3 days to complete. For these business case assignments, the interviewee is usually given a dataset, asked to analyze the data and recommend 2-3 product changes. Furthermore, the interviewee is expected to rank their recommendations based on estimated impact on the business and present the main findings of their analysis in a short presentation. This type of assignment is designed to test the majority of the day-to-day responsibilities a person in a data science role would encounter.
Product, Modeling and Technical
Due to a lower representation of interview questions categorized under product, modeling and technical, we will cover all these categories under one section. Nevertheless, these categories are still quite important in the interview process for data science roles. However, as our Airbnb research and database of interview questions have shown the prevalence of coding and business case questions, we decided to focus more on those two categories in this article, with the goal of aiding the reader in navigating the interview process better.
Product interview questions are categorized as all data science questions related to the specific product the company is offering. An example of an Airbnb product interview question would be: “An important metric goes down, how would you dig into the causes?” The ability to answer product questions is very important in data science interviews as it demonstrates not only your data science skills, but your knowledge of the company and its offerings as well.
Modeling interview questions have been categorized as all questions related to machine learning and regressions. For example, an Airbnb modeling interview question could be: “What would happen if you were to include another X (variable) in your regression model?” Depending on your role in the data science team, these questions can be of crucial importance as a lot of your day-to-day responsibilities could be related to building and adjusting statistical models and regressions.
Interview questions categorized under technical are generally questions which test the theoretical knowledge of various data science concepts that can appear in your day-to-day roles. For example, an Airbnb technical interview question can be:
- “How would you impute missing information?”
Data scientists need to know how to answer technical questions in interviews, as that shows that their theoretical background is solid and that they are able to implement and explain certain concepts off the top of their heads.
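To make the imputation question concrete, here is a minimal sketch of two of the simplest strategies, mean and median imputation, using only Python's standard library. The source does not say which approach Airbnb interviewers expect; the function name `impute` and the sample prices are hypothetical, and a fuller answer would also discuss why the data is missing and alternatives such as model-based imputation.

```python
from statistics import mean, median


def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Made-up nightly prices with two missing entries and one outlier.
prices = [100, None, 120, 80, None, 500]
mean_filled = impute(prices, "mean")
median_filled = impute(prices, "median")

print(mean_filled)
print(median_filled)
```

The outlier pulls the mean well above the typical price, while the median stays close to it, which is a common reason to prefer the median for skewed columns.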
Technical Concepts Tested in Airbnb Data Scientist Interviews
As we have seen in this article so far, there is a heavy emphasis on the data science coding interview questions in Airbnb interviews, with a significant proportion of business case questions as well. In this section, we will be covering some of the most prevalent technical concepts tested at Airbnb.
The most prominent technical concept tested was writing SQL queries and using the GROUP BY clause to aggregate results by one or more columns. An example of this type of question asked at an Airbnb data science interview is:
- “Find the total number of searches for each room type (apartments, private, shared) by city.”
To answer this question, you would need to count how many searches there are for each of the specified room types (apartments, private and shared). To do that, you would use the GROUP BY clause to group the rows by city and room type and count the rows in each group. A detailed answer to this question (along with the code) can be found here.
Another important technical concept mentioned in Airbnb data science interviews is the usage of CASE expressions with an ELSE branch (CASE/ELSE). For example, you could be asked this question:
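A minimal sketch of that grouping, run with Python's built-in `sqlite3` module, could look like the following. It reuses the `airbnb_search_details` table name from the earlier query; the room type labels other than 'Shared room' and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airbnb_search_details (city TEXT, room_type TEXT)")
# Made-up rows; 'Private room' and 'Entire home/apt' are assumed labels.
conn.executemany(
    "INSERT INTO airbnb_search_details VALUES (?, ?)",
    [
        ("NYC", "Shared room"),
        ("NYC", "Shared room"),
        ("NYC", "Private room"),
        ("LA", "Entire home/apt"),
    ],
)

# One output row per (city, room_type) pair, with its search count.
counts = conn.execute(
    """
    SELECT city, room_type, COUNT(*) AS n_searches
    FROM airbnb_search_details
    GROUP BY city, room_type
    ORDER BY city, room_type
    """
).fetchall()

for city, room_type, n in counts:
    print(city, room_type, n)
```

Grouping on both columns returns the counts in long form; if the interviewer wants one column per room type instead, the same counts can be pivoted with conditional aggregation.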
- “Display the number of times a user performed a search which led to a successful booking and the number of times a user performed a search but did not lead to a booking. The output should have a column named action with values 'does not book' and 'books' as well as a 2nd column named average_searches with the average number of searches per action. Consider that the booking did not happen if the booking date is null.”
The answer to this question (using CASE/ELSE statements) can be found here.
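One way to read the question is sketched below, again using Python's `sqlite3` module. The table name `searches` and the columns `n_searches` and `ts_booking_at` are hypothetical names chosen for this example, and the rows are made up. The sketch follows the requested output columns: a label per row derived from whether the booking date is null, then the average number of searches per label.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per user session, with the number of
# searches and a booking timestamp that is NULL when nothing was booked.
conn.execute(
    "CREATE TABLE searches (id_user INTEGER, n_searches INTEGER, ts_booking_at TEXT)"
)
conn.executemany(
    "INSERT INTO searches VALUES (?, ?, ?)",
    [
        (1, 4, "2020-01-05"),
        (2, 6, "2020-02-01"),
        (3, 10, None),
        (4, 20, None),
    ],
)

# CASE labels each row; GROUP BY then averages searches per label.
result = conn.execute(
    """
    SELECT CASE
             WHEN ts_booking_at IS NULL THEN 'does not book'
             ELSE 'books'
           END AS action,
           AVG(n_searches) AS average_searches
    FROM searches
    GROUP BY action
    ORDER BY action
    """
).fetchall()

print(result)
```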
Finally, the last prevalent technical concept tested in Airbnb data science interviews is the usage of joins. You can be asked to perform data manipulation using inner or outer joins (or both). Here is an example question where you could apply inner joins to solve the problem:
- “Write a query to find which gender gives a higher average review score when writing reviews as guests. Use the `from_type` column to identify guest reviews. Output the gender and their average review score.”
The answer to this question can be found here.
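As a rough sketch of the join, the example below assumes two hypothetical tables: `reviews` (with `from_user`, `from_type` and `review_score`) and `guests` (with `guest_id` and `gender`). Only the `from_type` column is named in the question; the other names, the 'guest' label and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and made-up rows, for illustration only.
conn.execute(
    "CREATE TABLE reviews (from_user INTEGER, from_type TEXT, review_score INTEGER)"
)
conn.execute("CREATE TABLE guests (guest_id INTEGER, gender TEXT)")
conn.executemany("INSERT INTO guests VALUES (?, ?)", [(1, "F"), (2, "M")])
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?)",
    [
        (1, "guest", 8),
        (1, "guest", 6),
        (2, "guest", 5),
        (9, "host", 10),  # written by a host, excluded by the WHERE clause
    ],
)

# The inner join keeps only reviews whose author appears in the guests table.
top_gender = conn.execute(
    """
    SELECT g.gender, AVG(r.review_score) AS avg_score
    FROM reviews AS r
    INNER JOIN guests AS g ON g.guest_id = r.from_user
    WHERE r.from_type = 'guest'
    GROUP BY g.gender
    ORDER BY avg_score DESC
    LIMIT 1
    """
).fetchall()

print(top_gender)
```

Note that `LIMIT 1` returns a single row even when two genders tie on the average, so ties would need separate handling.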
In conclusion, the most prevalent technical concepts that are being tested at Airbnb are grouping, CASE/ELSE statements, and joins. You can find more examples of questions with these technical concepts here, should you wish to better prepare for your upcoming interview.
Our analysis is based on real-life Airbnb data scientist interview questions from the past 4 years. We have categorized all questions into 5 different sections and examined the prevalence, importance and real-life examples for each of them, specific to Airbnb. As part of our analysis, we have talked about some of the most common technical concepts that can be encountered under the categories, and we have given considerable emphasis to coding questions as they make up the majority of all questions at Airbnb data scientist interviews.
As Airbnb grows to become one of the leading technology giants of the world, it is clear why the company attracts more applications for data science roles every year. We have conducted our research and written this article in hopes of guiding you towards securing a data science role at Airbnb, should you choose to pursue working at one of the largest tech companies out there.