4 Resources To Prepare For Data Science Interviews
I've been in the field of data science for about 10 years. I've worked as a data scientist and in all the other flavors of the role, like business analytics, growth analytics, marketing analytics, and marketing science. These days I'm more aligned to product, but I still interview data scientists regularly.
As a data scientist and someone who interviews data scientists, I know how to prepare for the interview and I know a lot about the topics it will cover.
Here's the video in case you prefer to watch the content instead of reading it.
Data Science Interview Tips
One of my most important data science interview tips is brushing up on the right skillset using the right resources. In this blog, we will learn the different types of topics covered during an interview, things to consider before choosing a learning resource, and the best online resources for data science interviews.
Format of a Data Science Interview
The first of my data science interview tips is to understand the different types of topics that will be covered during the interview.
There are typically several different topics that will be covered during a data science interview. The recruiter should tell you what topics will be covered in that interview and what the format of the interview will look like.
For example, your first technical interview might be over video conference, so you'll be sharing a screen with the interviewer where you can write and show your code to them. Then your second interview might be in person, where you'll meet four to five different interviewers, each covering a different topic. It's important to understand what all of those topics are, so you can prepare sufficiently to succeed in those interviews.
Four Topics Covered During Data Science Interview
For the most part, there are four topics that are covered in the data science interview.
- Coding
- Product Sense / Business Cases
- Statistics and Probability
- Modeling Techniques
The first one is coding. Then there's product sense or business cases. The third is statistics, which can also include probability and Bayes' theorem. The final topic is modeling, where they'll cover models like random forest, gradient boosting, k-means clustering, and several other common models.
Two Things to Consider in Choosing a Learning Resource
Now that we've introduced these four topics and know what will be covered in a data science interview, it's time to finally prepare for that interview.
There are two things I look for in a resource:
- Lots of Practice Problems
- Real Questions from Real Interviews
I want to practice as many problems as I can to reinforce all the concepts that will be covered in an interview. The second thing I look for is practice problems and interview questions that are relevant to that interview and that company: real questions that have been asked in past interviews.
Four Online Resources to Prepare for Your Data Science Interview
Here are four online resources that I've used in the past to prepare for data science interviews. And in the end, I'll introduce a bonus fifth one to you all.
1. Glassdoor as a Learning Platform
The first online resource is Glassdoor. It's one of my favorite platforms to use when I'm preparing for a data science interview.
The above image is of Glassdoor. I typed in Facebook and selected Interviews from the menu. If we scroll down to the middle of the page, we see interview questions for different positions at Facebook, like data scientist, front end engineer, iOS developer, and software engineer, so you can find a lot of data science interview questions here. What I like about this is that these are real questions that were asked at interviews, specifically at data science interviews at Facebook. They're as real as they can get.
The second thing I really like about this is that these questions cover all the topics that we discussed above. Another thing I do when preparing for a data science interview at Facebook is, in addition to typing in Facebook and looking at all of the questions there, to search for other companies in the same industry. If I care more about social media companies, I might type in Snapchat or TikTok. If I want to branch out a little, I might type in other companies in the tech field, such as Google or LinkedIn. There are a lot of interview questions and answers that you can get from Glassdoor.
2. Brilliant.org For Statistics & Probability
Now let's cover statistics and probability. I've used a website called brilliant.org. It's actually a math website. They also cover other types of topics like computer programming, computer science, quantitative finance for example. But I use this to brush up on my statistics and my probability. This was also a website that Facebook recruiters recommended to get additional practice on statistics and probability.
On this website, I go to the practice part from the menu above. There's a lot to choose from, but what I care about most is the probability section, as shown in the image below.
If we go to the probability section, we have the fundamentals and casino probability as well. Basically, I want to try as many problems as I can. Once I dive into these practice problems, I get a sense of whether or not they're likely to be part of the data science interview.
On the Brilliant website, there are a few sections that I would recommend: probability, random variables, statistical testing, and distributions.
A good way to cross check whether or not you should be doing these questions is to go on Glassdoor first. Read some of the statistical questions there and then see whether or not those concepts are found on Brilliant.
This is what I would recommend. This is the online resource to improve your statistics and probability.
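Bayes' theorem, one of the probability topics mentioned earlier, is a good example of the kind of problem worth practicing until it is automatic. The sketch below is only an illustration with made-up numbers (a 1% base rate, a 95% true-positive rate, and a 5% false-positive rate); the function name `posterior` and its parameters are hypothetical and do not come from Brilliant or from any specific interview question.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(condition | positive test) using Bayes' theorem."""
    # Total probability of a positive test:
    # P(pos) = P(pos | cond) * P(cond) + P(pos | no cond) * P(no cond)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% base rate, 95% sensitivity, 5% false positives.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))  # 0.161
```

Even with a fairly accurate test, the posterior is only about 16% because the condition is rare. Explaining that intuition out loud is often as important as getting the number.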
3. Multiple Resources To Learn Modeling Techniques
Now, we'll check online resources for modeling, specifically machine learning models. I don't necessarily have one place I go to learn about machine learning models or brush up on those concepts.
On interviews, I've seen modeling questions come up in two ways:
- Theoretical Questions
- Application of Models on Projects
The first way is theoretical questions. The second way is when you talk about your projects: interviewers indirectly or directly ask you about the models that you used and implemented on those projects.
1. Theoretical Questions
You could get questions on specific models, such as random forest, gradient boosting, or k-means clustering, and on how the models themselves work.
You could get questions like:
- Why would you want to use this model?
- Why wouldn't you want to use this model?
- How would you implement this model in code?
- How do you read and interpret the results of this model?
These are most of the theoretical questions that you would get on modeling.
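To give a feel for the "implement this model in code" type of question, here is a minimal sketch of k-means clustering (Lloyd's algorithm) in plain Python. It is an illustration under simplifying assumptions, not a reference implementation: it handles only 2-D points, runs a fixed number of iterations instead of checking for convergence, and the function name `kmeans`, its parameters, and the sample data are all hypothetical. In practice you would normally reach for a library such as scikit-learn's `KMeans`.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster 2-D points with Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance is enough for comparison).
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels


# Hypothetical data: two loose groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

Being able to name the two steps (assignment and update) and the main caveats (sensitivity to initialization, having to choose k up front) covers much of what the "why would or wouldn't you use this model" questions are probing.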
2. Application of Models on Projects
You could be talking about a project that used one of these models. An interviewer can just dig deeper and deeper into the project itself and into the model itself and ask you questions about why you picked that model, or what were some of the assumptions that you had while you were developing the model.
The point is, you need to understand the theory about machine learning models. You don't need to know all of the machine learning models, but just the common ones.
A few online resources that I use:
I tend to read blogs a lot. And 'Towards Data Science' is a popular website that I go to. They have a lot of blog articles on machine learning models.
In addition to these blog articles that I read to brush up on my machine learning theory and understanding, I go to YouTube to watch as many videos as I can. There are a few channels that I would recommend, like 'Simplilearn' and a new guy, 'Data Professor'. They talk a lot about machine learning models, covering both the theory and the application, and that's important.
In addition to following some of these channels, I also search for specific machine learning models and watch as many videos as I can.
4. LeetCode for the Coding Portion of Interview
For the coding portion of the interview, the online platform that I used to use is called 'LeetCode'. This is essentially a platform that was developed for computer scientists and software developers to prepare for their interviews. But they have a nice little database section with questions where you can practice your SQL.
If you click on any topic, you will get a practice question where you can type in SQL code, execute it, and see the actual output. It's a fully-fledged workspace and IDE for practicing MySQL queries.
What I like about this platform is that there are hundreds and hundreds of practice SQL problems. I can get really good at just improving my SQL and coding skills for the interview itself.
The one downside to LeetCode, I would say, is that it was, and still is, tailored for software developers. A lot of questions on this platform just help you get better at SQL, but they're not the data science type of questions that deal with data. So I would go to Glassdoor, try to find some coding questions for the company that I'm interviewing with, and understand what data science concepts are being covered in those coding questions. Then, if I just wanted to practice SQL, I would go to LeetCode and try to answer as many questions as possible.
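If you want to practice this style of SQL problem offline, a rough equivalent can be set up with Python's built-in `sqlite3` module. The sketch below is an assumption-laden illustration: the `employee` table, its columns, and the sample rows are made up, the "second-highest salary" prompt is just a typical practice-style question, and SQLite is used here instead of MySQL (this particular query is plain SQL that should behave the same in both).

```python
import sqlite3

# Hypothetical practice data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee (name, salary) VALUES (?, ?)",
    [("Ana", 100), ("Ben", 300), ("Cal", 200), ("Dee", 300)],
)

# Second-highest distinct salary: the largest salary below the overall maximum.
query = """
    SELECT MAX(salary) AS second_highest
    FROM employee
    WHERE salary < (SELECT MAX(salary) FROM employee)
"""
second_highest = conn.execute(query).fetchone()[0]
print(second_highest)  # 200
conn.close()
```

Note that two employees share the top salary of 300, which is exactly the kind of edge case (ties, duplicates, empty results) these practice questions tend to test.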
5. BONUS - StrataScratch to Prepare for Data Science Interviews
Here's the fifth online resource that I promised you at the beginning: 'StrataScratch'. It's a platform that I designed and built to achieve one thing: to help data scientists prepare for their interviews.
This is a platform that was built for data scientists. And it combines all the best parts of those four online resources that I was talking about earlier, into one platform.
On StrataScratch, we have coding questions, and you can select the questions based on the company. You can also select the questions based on whether you're more comfortable with SQL or with Python. If you click into one of them, you get the question and a hint, and you can see solutions, including solutions from other users.
These questions are real questions that come from data science interviews. You can be sure that what you're doing maps directly to a data science interview.
Now, if you go to non-coding questions, you have technical questions of different question types.
You have probability, business cases, product sense type of questions, modeling questions, statistics, miscellaneous technical questions, and system design, and a lot of other concepts that would be tested during a data science interview.
If you click one of these questions, you get the question, an editor to be able to comment and provide your solutions and then you get to see other users and their solutions.
Across the coding and non-coding questions, there are actually over a thousand interview questions on StrataScratch that are taken from real companies for you to practice on.
These are the five online platforms to help you prepare for your data science interview. I hope this is helpful.