314 Resources For Data Science Fundamentals

A list of the best resources for data science fundamentals.

This is a fundamentals guide to data science interviews. It covers common topics from the coding and non-coding questions asked during data science interviews.

The 7 categories covered are:

  1. Python
  2. SQL
  3. Probability
  4. Statistics
  5. Modeling
  6. Business Case
  7. Product

Common topics are grouped under their relevant subject. For example, Regression is classified under the Statistics category. Each topic links to YouTube videos or articles that explain it. Most topics also include interview questions from companies such as Google and Uber for practice.

While this guide lists the common topics asked during data science interviews, it is meant as a base page. You should research the topics mentioned here further.

  • For coding questions, it is all about practice: writing your own code and understanding other people’s. Some people may have a solution that is more optimized than yours, which is why building a community is important. StrataScratch shows how other people have solved a question you attempted, so you can learn how to improve your own solution!
  • For non-coding questions, you need to understand each step of applying a concept. Understanding the mathematical derivation behind a concept is important for truly answering any question about it. Once you understand the math, you need to learn how to implement the concept effectively in code, which comes back to practice and learning from others. While only a handful of interview questions will ask about the math behind certain topics, such as logistic regression, this knowledge is good to have when implementing complex models in your future job as a data scientist!

Some topics are only asked by a handful of companies. To get a better idea of the topics a company may ask about, go through the job requirements. However, some companies do not write their own Data Science job descriptions when hiring. They may copy descriptions from Glassdoor or ask for someone who knows every single programming language and statistical model out there. When you notice job descriptions like this, going through the topics mentioned in this guide should be enough.

Note that not all topics include a practice question. Even if a topic does not include one, you should understand the concept, since it may come up during your interview or in your future data science job!

If you do not understand a topic, check out the StrataScratch blogs! There are multiple guides written to help interviewees prepare, on topics from SQL Window Functions to Collinearity. If a topic is not explained in the StrataScratch blogs, visit other websites to learn more about it! Take notes on these topics and go through more practice questions! It is recommended to keep a document of topics you don’t fully understand or easily forget.

For example, if you easily forget how the RANK() function works in SQL, take screenshots and notes that explain the concept.
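
As an illustration of the kind of note worth keeping, here is a minimal sketch that runs RANK() next to DENSE_RANK() through Python's built-in `sqlite3` module. The `scores` table and its values are made up for this example, and it assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

# Hypothetical table and values, made up purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (player TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Ana", 90), ("Ben", 85), ("Cy", 85), ("Dee", 70)],
)

query = """
SELECT player,
       points,
       RANK()       OVER (ORDER BY points DESC) AS rnk,
       DENSE_RANK() OVER (ORDER BY points DESC) AS dense_rnk
FROM scores
ORDER BY points DESC, player
"""
rows = conn.execute(query).fetchall()
conn.close()

for row in rows:
    print(row)
# ('Ana', 90, 1, 1)
# ('Ben', 85, 2, 2)
# ('Cy', 85, 2, 2)
# ('Dee', 70, 4, 3)
```

The tied rows share a rank in both columns. RANK() then skips ahead to 4, while DENSE_RANK() continues with 3.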

Data Science Resources for Interviews

Do keep in mind that medium/hard questions on the StrataScratch database will contain multiple concepts, since during actual data science interviews you will need to know a variety of functions/topics.

When a question asks you to explain a concept to a specific target audience, answer from that audience's point of view. Avoid technical jargon that would confuse the audience, especially a non-technical one. For example, this question asks how to explain regression to an 8-year-old. Obviously an 8-year-old would not understand MSE values or even a simple y=mx+b equation, so explain using everyday language.

Coding Questions

Resources For Data Science Coding Questions


Python

Everyone who knows how to code nowadays has heard of Python. There is a reason Python is extremely popular among Data Analysts and Data Scientists: it has a wide variety of libraries and can process data quickly. NumPy and Pandas are some of the most commonly used libraries for data analysis.

  1. Understanding NumPy / Pandas
    1. Numpy
      1. NumPy: the absolute basics for beginners
      2. NumPy Reference & Cheat Sheet
    2. Pandas
      1. User Guide
  2. Comparison Operators / Logical Operators / Mathematical Functions
    1. Operators and Expressions in Python
  3. Flow Control Functions
    1. More Control Flow Tools
    2. Errors and Exceptions
  4. Conditional Expressions
    1. Filtering by columns/rows
      1. 10 Ways to Filter Pandas Dataframes
    2. Unique values - nunique() / drop duplicates
      1. How to Use Pandas Unique to Get Unique Values
      2. Python | Pandas dataframe.drop_duplicates()
      3. Get Pandas Unique Values in Column and Sort Them
    3. Null values
      1. Working with missing data
    4. Casting data types
      1. Change the data type of a column or a Pandas Series
      2. Type Conversion in Python
    5. Rounding values
      1. How to Round Numbers in Python
    6. Sorting
      1. Pandas Sort: Your Guide to Sorting Data in Python

        Practice Questions
        1. Accommodates-To-Bed Ratio
        2. Find the number of unique properties
        3. 3rd Most Reported Health Issues
  5. DataFrame Formatting
    1. Grouping
      1. Learning
        1. Pandas GroupBy: Your Guide to Grouping Data in Python
        2. Largest Olympics
      2. Practice
        1. Reviews of Hotel Arena
        2. Year Over Year Churn
    2. Merges
      1. Learning
        1. All the Pandas merge() you should know for combining datasets
      2. Practice
        1. Gender With Generous Reviews
        2. Distances Traveled
        3. Premium vs Freemium
  6. Ranking Methods
    1. Learning
      1. pandas.DataFrame.rank
    2. Practice
      1. Most Profitable Companies
      2. Rank Variance Per Country
      3. Consecutive Days
  7. Dict Methods
    1. Learning
      1. Python Dictionary(Dict): Update, Cmp, Len, Sort, Copy, Items, str Example
    2. Practice
      1. Player with Longest Streak
  8. Array Operations
    1. Learning
      1. Python Arrays
    2. Practice
      1. Ranking Most Active Guests
  9. String Functions
    1. Learning
      1. pandas.Series.str.contains
      2. Python String Methods
    2. Practice
      1. 'BAKERY' Owned Facilities
      2. Find the number of unique properties
      3. Find all wineries which produce wines by possessing aromas of plum, cherry, rose, or hazelnut
  10. List Methods
    1. Learning
      1. Python List Functions & Methods
      2. Data Structures
  11. Lambda function
    1. Learning
      1. How to Use Python Lambda Functions
    2. Practice
      1. Find the date with the highest opening stock price
      2. Reviews of Categories
      3. Find the genre of the person with the most number of oscar winnings
  12. Class methods/Set Methods
    1. Python's Instance, Class, and Static Methods Demystified


SQL

Even though Python and Pandas can process databases, Pandas can take much longer and sometimes cannot handle larger ones. In these cases, Structured Query Language (SQL) is highly preferred. SQL code is often simpler, and it can filter and restructure databases through queries.

The following sections contain multiple links for learning and practicing specific concepts. A free one-stop SQL tutorial is https://mode.com/sql-tutorial. However, it is recommended to go through YouTube SQL tutorials as well, especially if it is your first time learning, so you can pick up tips and tricks for coding.


  1. Where / Sorting / Having
    1. Learning
      1. SQL WHERE Clause
      2. SQL AND, OR and NOT Operators
      3. SQL ORDER BY Keyword
      4. SQL HAVING Clause
    2. Practice
      1. Employees With Bonuses
      2. Acceptance Rate By Date
      3. Accommodates-To-Bed Ratio
      4. Highest Cost Orders
  2. Limit / Offset
    1. Learning
      1. SQL: SELECT LIMIT Statement
    2. Practice
      1. Find the most common grade earned by bakeries
  3. Date Time Functions
    1. Learning
      1. SQL - Date Functions
      2. A Walkthrough of Data Science SQL Interview Question from Noom (Date Manipulations)
      3. Common Date Manipulations on Data Science SQL Interviews
    2. Practice
      1. Customer Revenue In March
      2. Growth of Airbnb
  4. Distinct Clause
    1. Learning
      1. SQL SELECT DISTINCT Statement
    2. Practice
      1. Number Of Unique Facilities And Inspections Per Municipality
      2. Customer Orders and Details
  5. Aggregate Functions (Group By, Case)
    1. Learning
      1. SQL GROUP BY Statement
      2. SQL Case Statements For Data Science Interviews in 2021
    2. Practice
      1. Find the postal code which has the highest average inspection score
      2. Host Popularity Rental Prices
  6. Combining (Joins / Unions)
    1. Learning
      1. SQL Joins
      2. SQL INNER JOIN Keyword
      3. SQL LEFT JOIN Keyword
      4. SQL RIGHT JOIN Keyword
      5. SQL FULL OUTER JOIN Keyword
      6. SQL Self Join
      7. SQL CROSS JOIN with examples
      8. SQL UNION Operator
      9. SQL Joins Tutorial for Beginners
    2. Practice
      1. Total Cost Of Orders
      2. Highest Energy Consumption
      3. Highest Cost Orders
  7. Subquery Expressions
    1. Learning
      1. SQL Server Subquery
    2. Practice
      1. Highest Priced Wine In The US
      2. Inspection Scores For Businesses
  8. Window Functions (Partition by, Rank, Ntile, Lag/Lead, Common Table Expression)
    1. Learning
      1. SQL Window Functions on Data Science Interviews in 2021 | Asked By Airbnb, Netflix, Twitter, Uber
      2. SQL Coding Interview Question Using A Window Function (PARTITION BY) | Data Science Interviews
      3. Multiple Solutions to Data Scientist Interview Question From Amazon [Rolling Average]
      4. What a Moving Average Is and How to Compute it in SQL
      5. Common Table Expressions – The Ultimate Guide
    2. Practice
      1. Highest Total Miles
      2. Top Percentile Fraud
      3. Marketing Campaign Success [Advanced]
  9. Pattern Matching / Text Searching
    1. Learning
      1. LIKE and ILIKE for Pattern Matching in PostgreSQL
    2. Practice
      1. Classify Business Type
      2. Counting Instances in Text
  10. Array Functions
    1. Learning
      1. Working with arrays
    2. Practice
      1. Reviews of Categories
      2. City With Most Amenities
      3. Views Per Keyword
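
Several clauses from the list above (WHERE, GROUP BY, HAVING, ORDER BY) are often combined in a single interview question. The sketch below shows where each one acts, again using Python's built-in `sqlite3` module so it can be run locally. The `orders` table and its rows are hypothetical and only serve the example.

```python
import sqlite3

# Hypothetical table and values, made up purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, city TEXT, cost REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Ana", "Austin", 40.0),
        ("Ana", "Austin", 70.0),
        ("Ben", "Boston", 20.0),
        ("Cy", "Austin", 15.0),
        ("Cy", "Austin", 10.0),
        ("Eli", "Austin", 5.0),
    ],
)

query = """
SELECT customer, SUM(cost) AS total_cost
FROM orders
WHERE city = 'Austin'      -- row-level filter, applied before grouping
GROUP BY customer          -- one output row per customer
HAVING SUM(cost) > 20      -- group-level filter, applied after grouping
ORDER BY total_cost DESC   -- sort the remaining groups
"""
top = conn.execute(query).fetchall()
conn.close()

print(top)  # [('Ana', 110.0), ('Cy', 25.0)]
```

Ben is removed by WHERE because his order is not in Austin, while Eli is removed by HAVING because his Austin total does not exceed 20.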

Non-Coding Questions

Resources For Data Science non coding questions


Probability

Understanding mathematical concepts is an important part of being a successful data scientist. Probability provides an important foundation for concepts such as Bayes' Theorem and distributions.

  1. Axioms of Probability
    1. Learning
      1. Axioms of Probability
  2. Permutations/Combinations
    1. Learning
      1. How to Use Permutations and Combinations
      2. Permutations, Combinations & Probability (14 Word Problems)
    2. Practice
      1. Pair by Drawing 2 Cards
      2. HHT Probability
  3. Multiplication Rule
    1. Learning
      1. Multiplication & Addition Rule - Probability - Mutually Exclusive & Independent Events
  4. Conditional Probability
    1. Learning
      1. Conditional Probability With Venn Diagrams & Contingency Tables
    2. Practice
      1. 3 Heads Probability
  5. Independent Events
    1. Learning
      1. Independent Events (Basics of Probability: Independence of Two Events)
    2. Practice
      1. Even Heads
  6. Bayes Theorem
    1. Learning
      1. Bayes' Theorem and Cancer Screening
    2. Practice
      1. Two Boys Odds
  7. Different distributions
    1. Probability Density Function and Cumulative Distribution Function
      1. Probability Density Functions - PDF
      2. Cumulative Distribution Functions - CDF
      3. Finding Percentiles
      4. Special Expectations

    2. Normal Distribution (aka Gaussian Distribution)
      1. Learning
        1. Normal Distribution
        2. Normal Distribution & Probability Problems
      2. Practice
        1. Non-Gaussian Distribution
        2. Expectation Of A Gaussian
    3. Uniform Distribution
      1. Learning
        1. Gallery of Distributions
        2. Continuous Probability Uniform Distribution Problems
      2. Practice
        1. Larger Expected Value
    4. t Distribution (aka Student’s t distribution)
      1. Learning
        1. t Distribution
        2. Student's T Distribution - Confidence Intervals & Margin of Error
    5. F Distribution
      1. Learning
        1. F Distribution
        2. Lesson 1 - What is the F-Distribution in Statistics?
    6. Chi-Squared Distribution
      1. Learning
        1. Chi-Square Distribution
        2. Chi Square Test
    7. Exponential Distribution
      1. Learning
        1. Exponential Distribution
        2. Probability Exponential Distribution Problems
    8. Lognormal Distribution
      1. Learning
        1. Lognormal Distribution

    9. Binomial Distribution
      1. Binomial Distribution
      2. Binomial distributions | Probabilities of probabilities, part 1
      3. Finding The Probability of a Binomial Distribution Plus Mean & Standard Deviation
    10. Poisson Distribution
      1. Poisson Distribution
      2. Poisson Distribution EXPLAINED!
  8. Series: Geometric - Hypergeometric - Arithmetic - Summation to Infinity
    1. Learning
      1. Geometric Distribution - Probability, Mean, Variance, & Standard Deviation
      2. The Hypergeometric Distribution: An Introduction (fast version)
    2. Practice
      1. Matching Pairs Attempts
  9. Expected Value
    1. Learning
      1. Expected Value and Variance of Discrete Random Variables
    2. Practice
      1. Roulette Expectations
      2. Expectation Of Sum Of Dices
  10. Binomial Distribution - Negative Binomial
    1. Learning
      1. An Introduction to the Binomial Distribution
    2. Practice
      1. Throwing Dice
      2. What is More Likely
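
To make Bayes' Theorem from the list above concrete, here is a small worked sketch in the spirit of a screening-test question. The prevalence, sensitivity, and false positive rate are invented numbers chosen only to keep the arithmetic easy to follow, and `posterior` is a hypothetical helper name.

```python
from fractions import Fraction


def posterior(prior, sensitivity, false_positive_rate):
    """Return P(condition | positive test) using Bayes' Theorem."""
    # P(positive and condition)
    true_pos = sensitivity * prior
    # P(positive and no condition)
    false_pos = false_positive_rate * (1 - prior)
    # P(condition | positive) = P(positive and condition) / P(positive)
    return true_pos / (true_pos + false_pos)


# Assumed inputs: 1% prevalence, 90% sensitivity, 9% false positive rate.
p = posterior(Fraction(1, 100), Fraction(9, 10), Fraction(9, 100))
print(p, float(p))  # 10/109, roughly 0.092
```

Even with a fairly accurate test, the posterior stays below 10% here because the condition is rare, which is the usual point of this type of interview question.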


Statistics

Data Science can be summed up as Computational Statistics. From predicting which shows are recommended to you on Netflix (Collaborative filtering) to predicting the demand for iPhones next year (Regression), statistics is the basis of data science.

  1. Intro to Stats (Mean, Median, Mode, Range, Standard Deviation, Graphs)
    1. Learning
      1. Mode, Median, Mean, Range, and Standard Deviation (1.3)
      2. Bar Charts, Pie Charts, Histograms, Stemplots, Timeplots (1.2)
    2. Practice
      1. Mean, Median Age in Mexico
      2. Anomaly in Distribution
      3. Box Plot and Histogram
  2. Boxplot - IQR
    1. Learning
      1. How To Make Box and Whisker Plots
    2. Practice
      1. New Observation is Outlier
  3. Variance → ANOVA
    1. Learning
      1. Variance - How To Calculate Variance
      2. ANOVA - How To Calculate and Understand Analysis of Variance (ANOVA) F Test.
    2. Practice
      1. Expectation of Variance
      2. Variance in Unsupervised Model
  4. Z-test --- T-test
    1. Learning
      1. Z-statistics vs. T-statistics | Inferential statistics | Probability and Statistics | Khan Academy
      2. https://www.ztable.net/wp-content/uploads/2018/11/negativeztable.png
      3. http://www.ttable.org/uploads/2/1/7/9/21795380/published/9754276.png
      4. Hypothesis Testing Problems Z Test & T Statistics One & Two Tailed Tests 2
    2. Practice
      1. Sample Size
  5. Central Limit Theorem
    1. Learning
      1. Introduction to the Central Limit Theorem
  6. Confidence Interval
    1. Learning
      1. Confidence intervals and margin of error | AP Statistics | Khan Academy
      2. How To Find The Z Score, Confidence Interval, and Margin of Error for a Population Mean
      3. Confidence Intervals
    2. Practice
      1. Confidence Interval
      2. Margin of Error
  7. Hypothesis testing -- P-Value
    1. Learning
      1. Hypothesis testing and p-values | Inferential statistics | Probability and Statistics | Khan Academy
      2. Hypothesis Testing
    2. Practice
      1. P-value
      2. Interpret P-value
  8. Confusion matrix (Sensitivity and specificity)
    1. Learning
      1. Machine Learning Fundamentals: The Confusion Matrix
      2. Machine Learning Fundamentals: Sensitivity and Specificity
      3. Overview of confusion matrix contingency table Differences in nomenclature for machine - jpg
    2. Practice
      1. Precision and Recall
      2. False Positives or False Negatives
  9. A/B testing
    1. Learning
      1. What is A/B Testing? | Data Science in Minutes
      2. Cracking A/B Testing Problems in Data Science Interviews | Product Sense | Case Interview
      3. A/B Testing Guide
    2. Practice
      1. Certain Factor Predicts Certain Outcome
      2. Random Bucketing
  10. Polar Coordinates
    1. Learning
      1. Polar Coordinates Basic Introduction, Conversion to Rectangular, How to Plot Points, Negative R Valu
      2. An Introduction to Polar Coordinates
    2. Practice
      1. Circle in Polar Coordinates
  11. Correlation coefficient (aka Pearson's correlation coefficient)
    1. Learning
      1. Correlation Coefficient
      2. Correlation and regression
    2. Practice
      1. Pearson's Correlation Coefficient
  12. Bias-Variance Tradeoff
    1. Learning
      1. Machine Learning Fundamentals: Bias and Variance
      2. Understanding Bias-Variance Tradeoff
      3. https://cdn.analyticsvidhya.com/wp-content/uploads/2020/08/Copy-of-Add-a-subheading5.png
    2. Practice
      1. Bias-Variance Tradeoff
  13. Error Predictions (MSE, RMSE, MAE, R^2)
    1. Learning
      1. Root Mean Square Error (RMSE) Tutorial + MAE + MSE + MAPE+ MPE | By Dr. Ry @Stemplicity
      2. Regression Model Accuracy (MAE, MSE, RMSE, R-squared) Check in R
      3. Adjusted R squared vs. R Squared For Beginners | By Dr. Ry @Stemplicity
        1. When is R squared negative?
    2. Practice
      1. StrataScratch Data Science Questions - R^2 Value
      2. StrataScratch Data Science Questions - Negative R Squared
  14. Regression
    1. Learning
      1. OLS - Introduction to residuals and least squares regression
      2. Ridge - Regularization Part 1: Ridge (L2) Regression
      3. Lasso - Regularization Part 2: Lasso (L1) Regression
      4. Elastic-Net - Regularization Part 3: Elastic Net Regression
      5. Logistic Regression - StatQuest: Logistic Regression
      6. Regularization: Ridge, Lasso and Elastic Net
      7. Logistic vs Bayesian Logistic
    2. Practice
      1. OLS Assumptions
  15. F-statistic
    1. Learning
      1. The F statistic - an introduction
      2. F test - example 1
      3. F Statistic / F Value: Simple Definition and Interpretation
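
The error metrics listed under "Error Predictions" above (MSE, RMSE, MAE, R^2) can be computed by hand in a few lines. The following is a minimal sketch in plain Python with made-up data. `regression_metrics` is a hypothetical helper name, not a library function.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE and R^2 for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1 - ss_res / ss_tot,
    }


# Made-up observations and two sets of predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
metrics = regression_metrics(y_true, [2.0, 5.0, 8.0, 9.0])
bad = regression_metrics(y_true, [9.0, 7.0, 5.0, 3.0])

print(metrics)    # mse 0.5, rmse about 0.707, mae 0.5, r2 0.9
print(bad["r2"])  # -3.0
```

The second call shows how R^2 becomes negative: the predictions fit worse than simply predicting the mean of `y_true`, so the residual sum of squares exceeds the total sum of squares.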


Modeling

Modeling is the application of statistical concepts and frameworks to everyday scenarios. Before attempting these questions, make sure you have a thorough understanding of the concepts in the Statistics section.

Structuring Data

  1. Overfitting/Underfitting
    1. Learning
      1. Understanding the Bias-Variance Tradeoff
    2. Practice
      1. Regularization Reduces Overfitting
      2. Overfitting While Training Model
  2. Cleaning up data
    1. One-hot/Label/Ordinal encoding
      1. Learning: Categorical encoding using Label-Encoding and One-Hot-Encoder
      2. Practice: Encode Categorical Features
    2. Missing values
      1. Learning - 7 Ways to Handle Missing Values in Machine Learning
      2. Practice - Scikit-learn Models Missing Values
  3. Diagnostic Tests (Outliers, Collinearity, Normality, Autocorrelation, Linearity, Homoscedasticity/Heteroscedasticity, Stochastic)
    1. Outliers
      1. Box-plot - How To Make Box and Whisker Plots
      2. Grubbs Test - Grubbs Test (example)
    2. Randomness
      1. Chi-Squared - Chi Square Test
    3. Collinearity
      1. Features Correlation
      2. Colinearity in Data Analysis
      3. Multicollinearity in Regression
      4. Collinearity vs Multicollinearity
    4. Normality
      1. Shapiro-Wilk - Shapiro-Wilk test
      2. Kolmogorov-Smirnov test - 10: Kolmogorov-Smirnov test
      3. What is the difference between the Shapiro-Wilk test of normality and the Kolmogorov-Smirnov test of normality?
    5. Autocorrelation
      1. Durbin-Watson test - Serial correlation - The Durbin-Watson test
    6. Homoscedasticity / Heteroskedasticity
      1. Homoscedasticity - Bartlett's test
      2. Heteroskedasticity - Breusch–Pagan test - The Breusch Pagan test for heteroscedasticity

        Note: While these are common tests that are usually used in Data Science interviews, there are a lot more tests and more types of diagnostic tests than mentioned above.
  4. Transformations (bptest, box-cox transformation, SIFT)
    1. SIFT
      1. Learning - Introduction to SIFT( Scale Invariant Feature Transform)
      2. Practice - SIFT
    2. Box-Cox transformation
      1. Learning
        1. Box-Cox Transformation, Explained
        2. Box-Cox Transformation + R Demo
    3. Log transformation
      1. Learning - Log Transformation: Purpose and Interpretation
  5. Testing and Training Data Split + Cross-validation
    1. Learning
      1. Training and Test Sets: Splitting Data
      2. k-fold cross-validation explained in plain English
    2. Practice
      1. Model Evaluation Procedures
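
For the train/test split and cross-validation topic above, the following sketch shows how k-fold cross-validation partitions row indices. It is a simplified illustration in plain Python: `k_fold_indices` is a hypothetical helper, and it assumes the data has already been shuffled, which real workflows usually do before splitting.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Each row lands in the test set exactly once, so every observation is used for both training and evaluation across the k rounds.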

Modeling Data

  1. Supervised vs Unsupervised Learning
    1. Learning
      1. Supervised vs Unsupervised Learning
    2. Practice
      1. Supervised and Unsupervised Machine Learning
      2. Preprocessing Step for Supervised Learning
  2. Feature Extraction
    1. PCA
      1. Learning
        1. An Introduction to Principal Component Analysis (PCA) with 2018 World Soccer Players Data
        2. A Step-by-Step Explanation of Principal Component Analysis (PCA)
        3. Dimensional Reduction | Principal Component Analysis
      2. Practice
        1. PCA
    2. Factor Analysis
      1. Learning
        1. Factor Analysis - an introduction
        2. Factor Analysis - model representation
        3. Factor Analysis - model representation - part 2
      2. Practice
        1. PCA and FA
    3. Linear/Quadratic Discriminant Analysis
      1. Learning
        1. StatQuest: Linear Discriminant Analysis (LDA) clearly explained.
        2. Linear and Quadratic Discriminant Analysis
      2. Practice
        1. PCA and LDA/QDA
  3. Feature Selection
    1. AIC
      1. Learning
        1. An introduction to the Akaike information criterion
        2. How to Calculate AIC of Regression Models in Python
    2. BIC
      1. Learning
        1. How to Calculate BIC in Python
        2. Is there any reason to prefer the AIC or BIC over the other?
    3. Mallows’ Cp
      1. Learning - What is Mallows’ Cp? (Definition & Example)
    4. Adjusted R^2
      1. Learning - Multiple Regression Analysis: Use Adjusted R-Squared and Predicted R-Squared to Include the Correct Number of Variables
    5. Stepwise Regression/Forward Selection/Backward Elimination
      1. Learning - Feature Selection Techniques in Regression Model
  4. Regression -- [Explained in statistics]
    1. Practice
      1. Regression Definition
      2. GBM or Logistic Regression
      3. Logistic Regression and Linear Regression
      4. Assumption of Linear Regression
      5. Logistic Regression and Odds Ratio
      6. Changing the Scale of Distance
  5. Forecasting
    1. Learning - An overview of time series forecasting models
    2. Practice - Time Series Forecasting Techniques
  6. Ensembles
    1. Boosting (XGBoost)
      1. Learning
        1. Boosting Algorithms Explained
        2. Xgboost Classification Indepth Maths Intuition- Machine Learning Algorithms
      2. Practice
        1. Boosting Work
        2. Gradient Boosted Models
    2. Random Forests
      1. Learning
        1. Understanding Random Forest
        2. Random Forest Algorithm Clearly Explained!
      2. Practice
        1. Random Forest
        2. Random Forest Estimator
    3. Decision Trees
      1. Learning
        1. Intuitive Guide to Understanding Decision Trees
        2. StatQuest: Decision Trees
      2. Practice
        1. Decision Trees and Extreme Boosted Trees
  7. K-means clustering
    1. Learning
      1. StatQuest: K-means clustering
    2. Practice
      1. K-means Algorithm
      2. In-Group Errors
  8. Naive Bayes Classifier
    1. Learning
      1. Naive Bayes, Clearly Explained
    2. Practice
      1. Naive Bayes Classifier
      2. Popular Naive Bayes Model

Neural Networks

  1. SVM
    1. Learning
      1. Support Vector Machine (Detailed Explanation)
      2. Support Vector Machines Part 1 (of 3): Main Ideas
      3. Support Vector Machines Part 2: The Polynomial Kernel (Part 2 of 3)
      4. Support Vector Machines Part 3: The Radial (RBF) Kernel (Part 3 of 3)
    2. Practice
      1. Margin Classifier and Hyperplane
  2. Gradient Descent + SGD
    1. Learning
      1. Gradient Descent, Step-by-Step
      2. Stochastic Gradient Descent, Clearly Explained
    2. Practice
      1. Gradient Descent and Stochastic Gradient Descent
      2. Gradient Descent
  3. Neural Networks (CNN/RNN)
    1. Learning
      1. Convolutional Neural Networks (CNNs) explained
      2. Illustrated Guide to Recurrent Neural Networks: Understanding the Intuition
    2. Practice
      1. Perform KNN
      2. Neural Network
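
Gradient descent, listed above, is easiest to remember with a tiny example. The sketch below fits a line y = m*x + b by repeatedly stepping against the gradient of the mean squared error. The data, learning rate, and step count are assumptions picked so this toy problem converges, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = m*x + b by batch gradient descent on the mean squared error."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of MSE = (1/n) * sum((m*x + b - y)^2)
        grad_m = (2 / n) * sum((m * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((m * x + b - y) for x, y in zip(xs, ys))
        # Step in the direction that reduces the error.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b


# Made-up data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
m, b = fit_line(xs, ys)
print(round(m, 4), round(b, 4))  # 2.0 1.0
```

Stochastic gradient descent follows the same update rule but estimates the gradient from one sample, or a small batch, per step instead of the full dataset.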

Not all modeling questions will explicitly mention a statistical concept to use in your answer. Interviewers test your understanding of the question and which techniques you would use to solve it. Remember that interviewers don’t always look for the most accurate solution; they want to see that you have a solid understanding of the question and how to go about solving it. Before attempting these questions in an actual interview, ask the interviewer clarifying questions, for example about ambiguous terminology.


The following questions test your understanding of when to use which statistical analysis and your overall approach to an everyday problem. For example, a modeling question asked by Amazon was to predict whether a customer will buy something today or not based on their information. Definitely practice these before your interviews!

Business Case

Business case questions are a trickier type of question. They cannot be split into basic topics to learn. Instead, they fall into 3 types: applied data, sizing, and theory testing. These questions mainly test your understanding of the company’s products, the economy, and business competition.

These are the types of questions to practice multiple times. If you want to get even better, research the company’s products before the interview, so you can show the company you’re interested in them.

Fortunately, we have written a guide to solving data science business case questions, which explains how to improve your answers to business questions.


Product

Product questions, similar to business case questions, cannot be split into topics to learn from scratch. They fall into 3 types: metric-related problems, measuring the impact of a new product/feature, and designing products. These are questions to practice repeatedly. Fortunately, we have written an ultimate guide to solving data science product questions, which explains how to improve at them.
