How to Solve Data Science Business Case Interview Questions
The Ultimate Guide to Preparing Business Case Interview Questions as a Data Scientist
Approaching Data Science Business Case Interview Questions
Similar to product sense interview questions, business case interview questions in data science are asked to understand the thought process behind your solution. Even if your solution gives an accurate answer, failing to explain the steps that led to it will reflect poorly on you.
Business case interview questions are asked to assess:
- Ability to diagnose and solve real business problems
- Familiarity with the economy/business surrounding the company’s products
- Feasibility of the interviewee’s solution
- Effective communication of the solution in a structured manner
(All of these qualities are part of the data science job, so keep them in mind as you answer.)
To answer a business case interview question effectively, it is important to understand which category the question falls under:
- Applied Data (most common)
- Sizing
- Theory Testing
For each category, the text that follows gives a brief explanation, examples, and what companies look for in your answer.
Applied Data questions ask you to solve a specific business problem by leveraging company data or data from external sources.
Examples of these questions:
- DemystData asked “How would a financial institution determine if an applicant makes more or less than $50k/year?”
- Facebook asked “How many high schools that people have listed on their Facebook profiles are real? How do we find out, and deploy at scale, a way of finding invalid schools?”
Companies look for:
- How well the interviewee can identify and define relevant data
- How well the interviewee understands the product in question
- How well the interviewee understands the business/economy surrounding the product
Sizing questions ask you to estimate how many units of a product were sold or exist. They often seem random and unrelated to the company.
Examples of these questions:
- Google asked “How many cans of blue paint were sold in the United States last year?”
- eBay asked “What is the total length of all the roads in San Francisco?”
Companies look for:
- How well the interviewee understands the product in question
- How well the interviewee can identify the target market
Theory Testing questions ask you to prove or disprove a theory, usually one that involves a change to a product or feature within the company.
Example of this question:
- If a PM says that they want to double the number of ads in the News Feed, how would you figure out if this is a good idea or not?
Companies look for:
- How well the interviewee understands the impact of the changed product/feature
- How well the interviewee understands the relationship between the product and the target audience
- How well the interviewee chooses relevant metrics to track whether the change is a success or not
After classifying the business case interview question posed, you should start to structure your solution. Do not answer the question immediately. Understand that even if you provide a good answer, interviewers want to understand how you arrived at your answer.
The interviewer looks for:
- A solution with a systematic approach
- Why a certain solution was chosen over another
- Coverage of the key areas of the question
- A feasible solution
While each data science business case interview question calls for a different approach, part of every solution follows the same methodology. Treat this part as the background research needed to build an exceptional solution.
The first thing you should do is understand the question. While this may seem straightforward, the assumptions you make about the question may not match the assumptions the interviewer has in mind. Think of it as a data scientist gathering background information for a prospective client: you need to understand their needs, and that is hard to capture through a single question. Take some time to think about why the company is asking this question.
Business case questions are often asked with the aim of increasing one of three things: market share, user engagement, or revenue. Find out which one the company is trying to increase, because your solution may change depending on the answer. Even if you don’t think it will change your solution, ask anyway to show it is on your mind, since it is an important consideration when developing an actual feature or product.
An important part of understanding the question is understanding its key terms. To make sure you cover them all, go through the key terms in order from start to finish, state your assumption about each one, and ask whether it is correct.
Example: “How many Big Macs does McDonald's sell each year in the US?”
- “Big Macs” → Assume that Big Macs include both those sold on their own and those sold as part of a combo meal. Have any other variations of the Big Mac been sold, such as promotional or limited-time versions?
- “Year” → Does year mean the calendar year from January 1st to December 31st, or a fiscal year (for example, October 1st to September 30th)?
- “US” → Does US mean just the 50 states and D.C., or does it include the minor outlying islands and territories of the US?
After understanding the question thoroughly, you need to provide a solution based on the type of question asked (Applied Data, Sizing, Theory Testing).
Applied data questions are extremely diverse, so it is hard to give a specific framework. This type of question truly tests how well you understand the company’s product and how to use relevant data.
When constructing your solution, think about what type of data the company has access to, starting with the internal data it collects. For applied data questions, internal data is a great source, since the company has usually already collected relevant data. For example, Uber asked “How do you estimate the impact that Uber has on driving conditions and congestion?” Uber has data on which of its cars are currently carrying customers, and it would also have data on traffic at any given time in the cities it supports. This internal data can be used to construct your solution. Do not forget about external sources: data can also be collected from trusted external sources, but this is a riskier answer if you choose poor data sources.
If the data you need to collect are metrics, remember to mention both success metrics and guardrail metrics. Success metrics quantify the success of the specific product. Guardrail metrics are metrics that should not be negatively affected in pursuit of improving the success metric. For example, Uber asked “What metrics would you use to track whether Uber's strategy of using paid advertising to acquire customers works?” A possible success metric could be inorganic growth of users, and a guardrail metric could be the number of rides taken. The purpose of paid advertising is probably to increase the number of users (this is the type of assumption you would want to confirm with the interviewer!), but Uber does not want the number of rides people take to fall.
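The success/guardrail distinction can be sketched as a simple check. This is only an illustration, not Uber's method: the metric names, the before/after numbers, and the 5% tolerance are all hypothetical.

```python
# Hypothetical before/after values for a paid advertising campaign.
# None of these numbers come from Uber.
before = {"paid_signups": 10_000, "rides_taken": 40_000}
after = {"paid_signups": 12_500, "rides_taken": 39_000}

SUCCESS_METRIC = "paid_signups"   # what the campaign is meant to raise
GUARDRAIL_METRIC = "rides_taken"  # should not drop noticeably
GUARDRAIL_TOLERANCE = 0.05        # assumed: tolerate up to a 5% drop


def relative_change(old: float, new: float) -> float:
    """Return the fractional change from old to new (0.25 means +25%)."""
    return (new - old) / old


success_lift = relative_change(before[SUCCESS_METRIC], after[SUCCESS_METRIC])
guardrail_change = relative_change(before[GUARDRAIL_METRIC], after[GUARDRAIL_METRIC])

# The strategy counts as working only if the success metric rises
# and the guardrail metric stays within the tolerated drop.
strategy_works = success_lift > 0 and guardrail_change >= -GUARDRAIL_TOLERANCE

print(f"success lift: {success_lift:+.1%}")
print(f"guardrail change: {guardrail_change:+.1%}")
print("strategy works:", strategy_works)
```

In an interview, the point is less the arithmetic and more that you name both metrics and say in advance how much guardrail movement you would tolerate.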
Once you know which data you require, explain how each piece of data is relevant to your solution. This shows you understand where the chosen data will be applied and how it supports your solution.
Remember that applied data questions typically involve users and the user experience. Try to think like both a typical user and an edge-case user when constructing your solution. For example, Facebook asked “We at Facebook would like to develop a way to estimate the month and day of people's birthdays, regardless of whether people give us that information directly. What methods would you propose, and data would you use, to help with that task?” Imagine you are a Facebook user. You would receive birthday wishes on Facebook through direct messages or posts you are tagged in. A possible solution would be to count the posts that tag you and mention key terms such as ‘birthday’ or ‘bday’. These posts tend to cluster around your actual birthday, so Facebook could estimate your birthday from them.
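The keyword-counting idea above could look roughly like this. It is a minimal sketch under stated assumptions: the posts are made-up sample data, and `estimate_birthday` is a hypothetical helper, not anything Facebook is known to use.

```python
from collections import Counter
from datetime import date

# Hypothetical posts that tag one user: (date posted, text). Not real data.
tagged_posts = [
    (date(2021, 3, 14), "Happy birthday!! Hope it's a great one"),
    (date(2021, 3, 14), "hbd, have a good bday"),
    (date(2021, 3, 15), "Belated happy birthday!"),
    (date(2021, 7, 2), "Great seeing you at the game"),
    (date(2022, 3, 14), "Happy bday!"),
]

KEY_TERMS = ("birthday", "bday")


def estimate_birthday(posts):
    """Return the (month, day) with the most birthday-related posts, or None."""
    counts = Counter(
        (posted.month, posted.day)
        for posted, text in posts
        if any(term in text.lower() for term in KEY_TERMS)
    )
    if not counts:
        return None
    (month, day), _ = counts.most_common(1)[0]
    return month, day


print(estimate_birthday(tagged_posts))
```

Grouping by month and day (ignoring the year) lets posts from several years reinforce each other. Belated wishes, like the March 15 post here, are the kind of edge case worth mentioning to the interviewer.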
Sizing questions may seem extremely random and diverse, but it is easier to break down how to approach the solution. Sizing questions can also be called guesstimation, since you are trying to make an estimated guess. Sizing questions are asked to see how well you can identify the target audience for a specific product. For example, Google asked “How many cans of blue paint were sold in the United States last year?” Google wants to see how many target audiences you can identify and furthermore see if you can approximate how many blue paint cans each of these consumers will purchase.
A key point to remember is that you do not have to give a hyper-accurate number for how many blue paint cans a certain target audience will buy. The interviewer probably does not know the exact answer either. They just want to hear the thought process behind your answer.
Since sizing questions are about identifying the target audience, it helps to split potential users by common factors (filters) such as age.
Another factor to consider is the consumption type of the product/service. There are 3 consumption types: Individual, Household, Structural.
- Individual refers to the personal consumption of products/services, such as toothbrushes, water bottles, or T-shirts.
- Household refers to consumption by the entire household, such as cars, TVs, and refrigerators.
- Structural refers to the consumption by multiple people from different households. Examples include airplanes and restaurants.
Start by stating a couple of possible target audiences (using the filters to help), then say which ones you will focus on because they represent the majority of the market.
Generally, it is a good idea to use at least 3 filters to describe who would use the product/service. Always explain the relevance of each filter to the target audience. For example, TikTok would be better suited to advertising to a younger audience between the ages of 13 and 18 than to elderly people who are 65+.
As mentioned before, interviewers do not look for precise numbers, so try to use round numbers as much as possible. Let’s build on the TikTok example and assume 20% of the US population is between the ages of 13 and 18. The population of the US is around 333,548,370. 20% of 333,548,370 is hard to calculate on the spot during an interview, especially if you are nervous. If you round the population to 300 million, it is much easier to calculate 20% of 300 million, which is 60 million.
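To see how little is lost by rounding, here is a quick comparison of the exact and rounded calculations. The 20% share is the illustrative assumption from the TikTok example, not a census figure.

```python
exact_population = 333_548_370    # figure quoted in the text
rounded_population = 300_000_000  # easier to work with in your head
share_aged_13_to_18 = 0.20        # assumption from the example, not real data

exact_estimate = exact_population * share_aged_13_to_18
rounded_estimate = rounded_population * share_aged_13_to_18

# Relative error introduced by rounding the population down to 300 million.
rounding_error = (exact_estimate - rounded_estimate) / exact_estimate

print(f"exact:   {exact_estimate:,.0f}")
print(f"rounded: {rounded_estimate:,.0f}")
print(f"rounding error: {rounding_error:.1%}")
```

The rounded answer is off by roughly 10%, which is small next to the uncertainty in the 20% assumption itself.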
There is more than one method for estimating size, but one method stands out for data science business case interviews: lay out the types of consumers who will use the product, then fill in an approximate number of people for each type.
To explain this method better, let’s use a Facebook interview question “How many Big Macs does McDonald's sell each year in the US?”.
Let’s first take a rounded number for the US population, which we shall assume around 300 million.
The main target audience for Big Macs is typically older than the target audience for kids’ meals but younger than the elderly. Within that range, the main target audiences would be college students, families, and people trying to buy a cheap meal.
With respect to college students, let’s assume 10% of the population are college students. 10% of 300M = 30M. Let’s assume that an average college student buys from McDonald's once a week. Not everyone will buy a Big Mac, but since it is a popular item on the menu, let’s assume that a third of people who go to McDonald's buy one. ⅓ of 30 million is 10 million. That means under these assumptions college students buy 10 million Big Macs per week. This is equivalent to 520 million Big Macs per year.
With respect to families, let’s assume that two-thirds of the population is in a family that lives in the same household. Two-thirds of 300M is 200M. When a family buys food, they tend to buy for the entire household. Let’s assume the average family buys food from a restaurant once a week and from McDonald's once a month. Let’s assume that in a typical 4-person family, 1 person chooses to eat a Big Mac. 200M/4 = 50M. Under the family target audience, 50 million people buy a Big Mac once a month. This is equivalent to 600M Big Macs per year.
With respect to people who are trying to buy a cheap meal, let’s assume 10% of the population buy a quick, cheap meal once a week. 300M * 10% = 30M. McDonald's is a popular fast food chain with branches nearly everywhere in the US. Let’s assume that of the 30 million people, 50% choose McDonald's. 30M * 50% = 15M. This results in 15 million people buying from McDonald's once a week. Let’s assume that a third of these people buy Big Macs, resulting in 5M Big Macs sold per week to the target audience who want a quick and cheap meal. This is equivalent to 260M Big Macs sold a year.
Adding these values together, 520M + 600M + 260M = 1,380M, or roughly 1.4 billion Big Macs sold a year in the US.
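The segment-by-segment arithmetic above can be written out as a short script. This is just a sketch of the walkthrough: every share below is one of its assumptions rather than measured data, and `big_macs_per_year` is a hypothetical helper name.

```python
from fractions import Fraction

US_POPULATION = 300_000_000  # rounded population used in the example


def big_macs_per_year(segment_share, big_mac_share, purchases_per_year):
    """Yearly Big Macs for one segment.

    segment_share: fraction of the US population in the segment
    big_mac_share: fraction of the segment ordering a Big Mac per occasion
    purchases_per_year: purchase occasions per year (52 weekly, 12 monthly)
    """
    buyers = US_POPULATION * segment_share * big_mac_share
    return int(buyers * purchases_per_year)


# Exact fractions avoid floating-point noise in the thirds.
segments = {
    "college students": big_macs_per_year(Fraction(1, 10), Fraction(1, 3), 52),
    "families": big_macs_per_year(Fraction(2, 3), Fraction(1, 4), 12),
    # 10% want a cheap weekly meal, and half of them choose McDonald's.
    "cheap-meal buyers": big_macs_per_year(
        Fraction(1, 10) * Fraction(1, 2), Fraction(1, 3), 52
    ),
}
total = sum(segments.values())

for name, count in segments.items():
    print(f"{name}: {count / 1e6:.0f}M")
print(f"total: {total / 1e6:.0f}M")
```

Keeping each assumption as a named input makes it easy to change one share when the interviewer pushes back and see how the total moves.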
At the end, after stating your final answer, you could also mention other relevant information that could affect the value. For example, McDonald's has a 2 for $5 or 2 for $6 sale where customers can purchase 2 Big Macs for $5 or $6. This would increase sales of Big Macs. Mentioning it shows you have knowledge about the product.
Sometimes during interviews, you are given an online board where you can draw and note down your thoughts. Sketching the layout of consumer types on the board can help you understand and explain your thoughts better!
Things to remember
- Remember you do not have to mention all the target audiences. Mention the important ones which take up most of the market and if you want, mention edge cases.
- When you get a final answer, do a sanity check. Check whether your final answer seems too high or too low and adjust your assumptions accordingly.
- Write things down, especially when using the layout of consumer types, so you have a reference for what is connected to what.
- Remember you can use any filter that sounds sensible, but do remember to mention how this relates to the actual solution.
- A key point to remember is to not use personal bias. Remember your social circle does not represent the entire population. Do not assume everyone thinks the same as you. Put yourself in other people’s shoes and see how they would view this problem.
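One way to run the sanity check mentioned in this list is to convert the total into a per-person figure and ask whether it feels plausible. The sketch below uses the 1,380M total and the 300M rounded population from the Big Mac example. Judging plausibility is still up to you.

```python
total_big_macs_per_year = 1_380_000_000  # estimate from the worked example
us_population = 300_000_000              # same rounded population

per_person_per_year = total_big_macs_per_year / us_population
weeks_between_big_macs = 52 / per_person_per_year

print(f"{per_person_per_year:.1f} Big Macs per person per year")
print(f"about one every {weeks_between_big_macs:.0f} weeks")
```

If the per-person figure looks too high or too low to you, revisit the segment assumptions rather than adjusting the total directly.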
Remember that theory testing mainly tests how well you understand the product and the target audience. There are 3 steps to follow to help answer these questions:
- Identify users affected
- Pros vs Cons of the change
- Data to prove/disprove the theory, including the metrics that will change
To help explain these steps, we’ll use a question asked by Facebook: “A PM wants to double the number of ads in Newsfeed, how would you figure out if this is a good idea or not?”
Identify users affected
Imagine you are testing a hypothesis: first you need to do some background research. Understand what types of users will be affected by this change and list the user groups affected. Some product changes affect all users, while others mainly affect only a portion of the users. Are the users affected the target audience or other users?
With the Facebook example, general users who scroll through their News Feed will be affected. Influencers will also be affected, since more ads will be shown in place of their posts.
Pros vs Cons
Every product change has positives and negatives, and it can affect either the user base or the company’s resources. Theory testing usually involves testing a theory about changing a product/service, so list 2 pros and 2 cons of implementing the change. A pros and cons list helps you understand the change better and decide what data/metrics to collect. The reason for identifying the affected users first is so that you can incorporate how each user group will be affected into the list, since in the end the users are the people who will use the product.
Using the Facebook example:
- Pro: Increase in revenue for Facebook
- Pro: Businesses may choose to use Facebook as a solution for their marketing techniques due to an increase in ads
- Con: Users might decrease time on app due to increase in ads
- Con: Influencers might not post as much since their posts do not receive the recognition they used to get
Data to support theory
Sometimes a similar change has been implemented before, either at the same company or at another one. If you know that a similar change has been made, mention how it affected the product and which metrics changed.
If there is no prior data, mention what metrics you think would change if the change is implemented. Remember to mention how you predict the metrics would change and the logic behind why you think this.
Using the Facebook example, we could look at how users reacted when Facebook first introduced ads into the News Feed. This would give direct information on how ads affected Facebook users. Metrics that could be tracked include Daily Active Users (DAU) and the number of influencer posts. DAU will help show whether users stop using the app after the change. If the number of influencer posts decreases, their followers may use the app less.
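As a rough illustration of tracking DAU around a change, the sketch below counts distinct active users per day and compares the averages before and after. The activity log and the change date are hypothetical, and a plain before/after comparison like this is only a first signal, not proof that the change caused the difference.

```python
from collections import defaultdict
from datetime import date

# Hypothetical activity log: (user_id, day the user was active). Not real data.
activity_log = [
    ("u1", date(2024, 1, 1)), ("u2", date(2024, 1, 1)), ("u3", date(2024, 1, 1)),
    ("u1", date(2024, 1, 2)), ("u2", date(2024, 1, 2)), ("u3", date(2024, 1, 2)),
    ("u1", date(2024, 1, 3)), ("u2", date(2024, 1, 3)),
    ("u1", date(2024, 1, 4)), ("u1", date(2024, 1, 4)),  # repeat event, same user
]
CHANGE_DATE = date(2024, 1, 3)  # assumed day the number of ads was doubled


def daily_active_users(log):
    """Map each day to the number of distinct users active on that day."""
    users_by_day = defaultdict(set)
    for user_id, day in log:
        users_by_day[day].add(user_id)
    return {day: len(users) for day, users in users_by_day.items()}


def average(values):
    """Arithmetic mean of a non-empty iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


dau = daily_active_users(activity_log)
dau_before = average(n for day, n in dau.items() if day < CHANGE_DATE)
dau_after = average(n for day, n in dau.items() if day >= CHANGE_DATE)

print(f"average DAU before: {dau_before:.1f}, after: {dau_after:.1f}")
```

Using a set per day means a user who triggers several events on the same day is still counted once, which is what "daily active users" is meant to capture.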
At this point, you would have explained the initial part of the general framework and the specific question framework. Now you have to explain the edge cases and summarize your solution.
No solution to a business case interview question will cover every edge case, often because of time constraints during the interview. It is good to identify the edge cases that your solution has not covered. Even if you cannot explain how to solve an edge case, it is better to identify it than to not mention it at all. A lot of interviewees will cover the main part of the solution, but identifying edge cases is what helps separate you from others. Mention the edge cases after your question-specific solution: this keeps your answer structured so the interviewers do not get confused.
An example of an edge case can be seen with a Google interview question: “How many cans of blue paint were sold in the United States last year?” Suppose 3 potential main target audiences were identified: residential buildings, corporate buildings, and car manufacturers. An edge case could be preschools and elementary schools, which would buy paint for students. You as the interviewee could give a rough estimate of how much blue paint they might have bought and explain how you would calculate it.
After providing your detailed solution, you should summarize your solution.
Remember to include:
- Assumptions made about the question
- Data/metrics collected and relevance to the solution
- Overview of specific approach to the solution using the data/metrics collected
The summary of your solution is an important part of your interview. It reiterates the key points of your solution, and it also shows how you would communicate with investors and clients about solving these business problems.
Overview of steps to follow
- Clarify the goal of the question: why are they asking it, how is the product related to the company’s goals, and is the company trying to increase revenue, market share, or user engagement?
- Understand the question: break down the key terms
- Specific Framework
- Edge Cases
- Summarize answer
Communicating your Solution
Remember that your final answer is not the deciding factor in whether an interview is successful. You must be able to communicate your thought process in detail. Interviewers want to see how you think and whether you covered all the important points when explaining your solution, so how well you communicate your solution is an important part of the interview.
You and the interviewer should share a clear understanding of the question and of your thought process. The interviewer’s input may be the most important thing to incorporate into your solution. If the interviewer gives instructions or asks questions that deviate from the framework, follow them. Never follow the framework rigidly. Refine your solution and the steps to your answer in response to the interviewer’s comments. Remember that the interviewer influences the hiring decision, so the better you incorporate their comments, the better your chances are.
Take time before responding
- As the interviewee, you are given some time to think about how to approach your solution. If you respond immediately, you may realize your solution has a major flaw, thus causing you to backtrack to correct the problem. To prevent this and miscommunication between you and the interviewer, take some time before responding. Interviewees usually take up to 30 seconds before responding.
- Don’t worry if it takes longer than 30 seconds to come up with a solution! At this point you could state your assumptions and explain how you are thinking about starting your solution. This avoids awkward silence and keeps you from appearing to take too long.
Agree on goals/assumptions
- A key part of communicating with the interviewer is reaching a shared understanding of how you will approach your solution. Business case questions are phrased ambiguously on purpose, so that the interviewee can identify the phrases that need to be clarified. As in an actual discussion with a client, you need to agree on the goals/assumptions before crafting a solution.
Mention technical terms
- Since you are applying to a technical role, you should be using technical terms to separate yourself from others. It shows you have an understanding of where certain technical concepts should be applied.
- Remember you should never force yourself to use a technical term. There are certain times you should avoid using one:
- If you do not understand the technical concept entirely
- If the technical concept is overkill or there is a simpler solution to the question. Just because the term machine learning sells everywhere does not mean you should use it everywhere.
- If there is no relevance between the technical concept and your solution
- Business case interview questions do not always require complex technical concepts, so do not worry if your solution seems straightforward. As long as your solution covers the important parts of the question, you are good.
- Whenever you introduce a new metric or collect data, mention what the metric/data is.
- Mention how to derive/collect the metric/data.
- Mention how the new metric/data helps craft your solution.
For example, Symantec asked “Suppose you have a coffee store, what do you do to increase the number of customers?” Suppose you are trying to increase the number of customers by advertising a holiday sale on one of your most popular products. Mention that you would collect data on how many units of each product are sold. This could be collected from transaction records showing which products were sold. Once you have the data, you can find which items sell the most. Depending on the profit margins of the top-selling products, you could temporarily reduce their prices to attract more customers!
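The coffee store reasoning can be sketched in a few lines. The transactions, the margins, and the 50% margin threshold are all hypothetical values chosen for illustration.

```python
from collections import Counter

# Hypothetical transactions: each one lists the products sold. Not real data.
transactions = [
    ["latte", "croissant"],
    ["latte"],
    ["espresso", "latte"],
    ["cappuccino", "croissant"],
    ["latte", "muffin"],
]

# Hypothetical profit margins per product (fraction of the sale price).
margins = {
    "latte": 0.60,
    "croissant": 0.30,
    "espresso": 0.70,
    "cappuccino": 0.55,
    "muffin": 0.25,
}
MIN_MARGIN = 0.50  # assumed: only discount products with room to cut price

# Count units sold per product across all transactions.
units_sold = Counter(product for basket in transactions for product in basket)
top_sellers = units_sold.most_common(2)

# Keep only the top sellers whose margin can absorb a temporary discount.
promo_candidates = [
    product for product, _ in top_sellers if margins[product] >= MIN_MARGIN
]

print("top sellers:", top_sellers)
print("promotion candidates:", promo_candidates)
```

This mirrors the three points in the list: what the data is (units sold per product), how it is collected (from transactions), and how it feeds the solution (choosing what to discount).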
- While providing your solution, if you ever get stuck in the middle of your solution, tell your interviewer you are stuck! Tell them how you are thinking about your solution and what you are trying to achieve in the next step. Tell the interviewer certain steps you thought about taking in your solution, but did not due to certain reasons (explain these reasons). Try to see if the interviewer gives any clues.
Mistake in your solution
- If you think your solution is not plausible or will definitely lead to an error, explain why it will not work. Take some more time to see whether you can change your solution or whether you have to backtrack to the point where you feel most confident. Do not worry if you encounter a problem that your solution cannot handle. As in the work environment, it is better to identify a problem than to keep building on a broken base. When presenting your new solution, remember to explicitly state how it avoids the issue.
Always communicate your solution out loud! Talk your interviewer through your solution so they can also understand your thought process. Remember to connect each step with the previous step and the overall goal. After a couple of steps, check in to make sure the interviewer understands your solution.
How to prepare for Business Case questions before the interview
Business case interview questions are another challenging part of the data science interview. They are difficult to predict because they are so diverse and can seem random.
Each of the 3 categories of business case questions (Applied Data, Sizing, and Theory Testing) calls for a different kind of preparation.
When preparing for Applied Data questions, first you must understand the type of business the company does. Is the company involved with a lot of B2B or B2C? Sometimes, with large corporations, such as Google, the company will mention what division you are applying for. Research what types of business this division does. Next you should understand the type of product the company produces, or even better the product the division you have applied for produces. Try to understand the business model around the product and what types of data they would use to help improve the product. If there is no public information on this, try to make your own assumptions! Look at the product from different angles with respect to the target audience and what data you would collect to understand how to improve the product.
Sizing questions are unfortunately more random, and sometimes the company will ask a question that seems unrelated to its business model. Example: Google asked “How many cans of blue paint were sold in the United States last year?” These types of questions require constant practice with other sizing questions.
Preparing for theory testing is the next step after the applied data question: once you have worked out how to apply data to improve the product, how would you test whether the improvement would actually work? Think about what may occur if the solution is applied: to what extent users will be affected, and which metrics will increase, decrease, or remain constant. Think about what types of problems your potential employer faces! These problems could be solved using a data science approach, so think about how you would solve them!
Try to research the product's competitors and find out whether they have changed their products in a way similar to what the company you are applying to is attempting. If they have, mention how it worked out for the competitor! Similar to sizing questions, theory testing requires practice with multiple questions!
Helpful Links for Preparation
- Facebook released a data science interview prep video, which gives the answer to one of the questions in this article!
https://vimeo.com/385283671/ec3432147b → 2:25 - 12:15
- Another helpful video to answer a business case question asked by Spotify
- Mock Product Manager Interview (LinkedIn PM): Improve Spotify's Social Features
- Uber Product Manager Mock Interview: Estimate Drivers in SF
- Google Product Manager Estimation Interview: Paint Market
- Mock Product Manager Interview (Google PM): Estimate Pixel Phone Storage