How To Use LeetCode For Data Science SQL Interviews
Is LeetCode built for data science interview preparation? Let’s find out!
LeetCode is the de facto educational platform for people preparing for software development and software engineer (SWE) interviews, as well as for people looking to improve their software development skills in general. According to their site, they reached over 1 million users within two years of launch. LeetCode has all the coding questions you could want to prepare for your software engineering interviews, including well over 1,000 questions that test technical concepts like data structures and search algorithms. So, is LeetCode designed to prepare you for data science interviews? Let’s find out.
Is LeetCode Designed To Prepare You For Data Science Interviews?
As an aspiring data scientist, you might ask: is LeetCode for data science? As great as LeetCode is at helping software engineers get jobs, it was NOT designed to help data scientists prepare for their interviews or improve their analytical skills. While both jobs require programming skills, how those skills are applied in industry is different. Data scientists are required to clean and manipulate data and to draw insights from it that lead to product decisions and recommendations. Manipulating data to create metrics, understanding the tradeoffs between approaches, and building statistical and machine learning models are what data scientists do on the job and what is tested in data science interviews.
LeetCode focuses more on your algorithm skills for specific tasks like inverting a binary tree or writing efficient sorting algorithms. How to implement technical concepts effectively is the focus of software engineering interviews. How to manipulate and understand data is the focus of data science interviews. In addition, there are non-coding concepts you need to master to become a data scientist and succeed in interviews. These include machine learning theory (and implementing models), statistics, probability, business cases, and product sense, just to name a few. These are not the core focus of LeetCode, which forces users to turn to other platforms and resources to master them.
In the past, I’ve used LeetCode to prepare for my data science interviews, but I had to temper my expectations and use it mainly for brushing up on my SQL and Python skills. In reality, I had to stitch together multiple resources to prepare. It took me several years and multiple job changes to find the best ones. I’ll focus this article on how I used LeetCode to help me prepare for my interviews, but I’ll also recommend other resources that cover the other aspects of data science interviews, like machine learning theory, statistics, and product sense.
LeetCode Data Science SQL Interview
In a data science coding interview, you’re asked to manipulate data and communicate insights from the data, so there’s often a coding portion that tests your ability to do so. You can usually use any coding language you’re comfortable with, but the most popular ones are SQL, Python, and R.
LeetCode’s questions are more tailored towards algorithms where you’re tasked to search through arrays and return specific values. How efficiently you do that and the accuracy of your solution are what’s being evaluated.
While LeetCode has 1800 of these algorithm questions, they only have around 150 SQL questions. However, these questions are mainly created for software developers and do not focus on data insights like calculating month-over-month growth or user attribution scenarios that are important to data scientists. Rather, LeetCode questions focus more on performing specific technical tasks like joining tables together or finding the 2nd highest value in a dataset. Here are two questions to illustrate what I mean:
- Write a SQL query to get the second highest salary from the Employee table. (https://leetcode.com/problems/second-highest-salary/)
- Write a SQL query for a report that provides the following information for each person in the Person table, regardless if there is an address for each of those people (https://leetcode.com/problems/combine-two-tables/)
These types of questions on LeetCode are perfect for refreshing your SQL skills but they’re not going to be enough for a data science interview. As a data scientist, you are expected to know how to manipulate the data already but it’s your understanding of the coding tradeoffs and insights from the data that are most important.
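To give a feel for the two questions listed above, here is a minimal sketch that runs them against an in-memory SQLite database from Python. The table layouts and sample rows are my own simplified assumptions rather than LeetCode's exact schemas, and SQLite is not one of the dialects LeetCode offers, so treat this as an illustration of the style of question and not as an official solution.

```python
import sqlite3

# In-memory database with small, made-up tables (assumed schemas, illustrative data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (id INTEGER PRIMARY KEY, salary INTEGER);
    INSERT INTO Employee VALUES (1, 100), (2, 200), (3, 300);

    CREATE TABLE Person (personId INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT);
    CREATE TABLE Address (addressId INTEGER PRIMARY KEY, personId INTEGER,
                          city TEXT, state TEXT);
    INSERT INTO Person VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO Address VALUES (1, 2, 'Austin', 'Texas');
""")

# Question 1: second highest salary, i.e. the largest salary below the maximum.
second_highest = conn.execute("""
    SELECT MAX(salary)
    FROM Employee
    WHERE salary < (SELECT MAX(salary) FROM Employee)
""").fetchone()[0]

# Question 2: one row per person, with address columns left empty
# when the person has no address (hence the LEFT JOIN).
report = conn.execute("""
    SELECT p.firstName, p.lastName, a.city, a.state
    FROM Person AS p
    LEFT JOIN Address AS a ON a.personId = p.personId
    ORDER BY p.personId
""").fetchall()

print(second_highest)
print(report)
```

Both queries are purely mechanical: a filter plus an aggregate, and a join. Neither asks you to define a metric or interpret the result.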
Using LeetCode To Practice SQL Questions
You can use LeetCode to refresh your SQL and Python skills, as well as benchmark your skills against others on the platform. I would recommend using LeetCode at the very start of the interview process to refresh your coding skills. Even if you do code every day and are comfortable coding, solving these LeetCode questions will help you understand which technical concepts to focus on and prepare for in interviews.
Here are places on LeetCode where you can practice your SQL coding.
LeetCode’s Database Section For Data Science
On the main page of LeetCode, where you can explore the problems, you’ll see two areas that contain SQL questions.
Main Questions Table
To access the database questions, follow the link https://leetcode.com/problemset/database/ and the table of questions will refresh to show only SQL-type questions.
This is the primary way to practice SQL questions on LeetCode. Most of the LeetCode questions are free but you can really only filter the questions based on difficulty and whether or not the question has a solution. Typically, on LeetCode, you should be able to filter questions based on topics (i.e., technical concepts) but these topics are reserved for algorithm type questions and don’t include database type questions.
It would be nice to be able to filter questions based on which companies have asked these questions but it’s reserved for premium members. If you’re interviewing for a data science role, it would be worth the upgrade to premium on LeetCode.
Question Editor Interface
Once you click into a question on LeetCode, you’re given the question prompt and table schema so that you can start coding. It’s really helpful to be given a table schema with data types, as this is what you would be given in interviews. LeetCode supports three SQL dialects: MySQL, MS SQL Server, and Oracle. These are three very popular SQL dialects, but it would have been nice to also offer Postgres.
LeetCode offers solutions on almost all their database questions, which is very helpful when you’re stuck on a problem. What’s even better on LeetCode is that you have the ability to comment on each problem and get help when needed.
LeetCode also has a discussion tab, which is really a way for you to look at how other users have solved the problem. You can comment on each user solution to learn more about that user’s approach and discuss the tradeoffs between LeetCode’s official solution and the user’s. In my opinion, this is by far one of the best features LeetCode offers, especially when you get to the more difficult problems. As a data scientist, you’ll find there are numerous ways to solve a problem. But each approach has its tradeoffs, so understanding what those tradeoffs are and being able to communicate them is essential for any interview.
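As a small illustration of the kind of tradeoff that comes up in those discussions, the sketch below compares two common ways of finding a second highest salary when no second value exists. The Employee table and its single row are made up for the example, and the queries run on SQLite from Python, so behavior on the dialects LeetCode offers may differ in the details.

```python
import sqlite3

# Made-up table with a single distinct salary, so there is no "second highest".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (id INTEGER PRIMARY KEY, salary INTEGER)")
conn.execute("INSERT INTO Employee VALUES (1, 100)")

# Approach A: aggregate over the salaries below the maximum.
# An aggregate without GROUP BY always returns exactly one row,
# so the "missing" answer shows up as a single NULL.
approach_a = conn.execute("""
    SELECT MAX(salary)
    FROM Employee
    WHERE salary < (SELECT MAX(salary) FROM Employee)
""").fetchall()

# Approach B: sort the distinct salaries and skip the top one.
# With nothing left after the offset, the query returns no rows at all.
approach_b = conn.execute("""
    SELECT DISTINCT salary
    FROM Employee
    ORDER BY salary DESC
    LIMIT 1 OFFSET 1
""").fetchall()

print(approach_a)  # one row holding NULL (None in Python)
print(approach_b)  # empty result set
```

Neither result is wrong, but a report or downstream job may treat "one NULL row" and "no rows" very differently, which is exactly the sort of point worth raising out loud in an interview.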
Top Hits - Curated top 70
LeetCode also has a Top Hits section where they have selected the best questions for their users. The SQL top hit is called “LeetCode Curated SQL 70”. This is a premium offering but can be found here, https://leetcode.com/problemset/leetcode-curated-sql-70/.
It’s not obvious why LeetCode selected these 70 questions but I would guess it’s because they all have solutions and good user engagement in the discussion sections. Quality is always hard to find, especially if you have a platform that contains 1000+ questions.
LeetCode For Data Science
Overall, LeetCode does provide a good starting point for you to prepare for your SQL data science coding interviews. They have 150+ database questions that are free with the majority containing solutions. The LeetCode platform has great features to help you get better at coding -- official solutions, user solutions, and discussions with other users will help you refresh your coding skills.
The drawback of using LeetCode to prepare specifically for a SQL data science coding interview is that its questions seem to focus only on the technical aspects of SQL, meaning that they only require you to join tables or find the 2nd highest value in a dataset. In reality, technical implementation skills are only one part of a data science interview. The main part is being able to manipulate data, create metrics, handle edge cases, and understand tradeoffs. This doesn’t seem to be a focus for LeetCode. Lastly, it would be nice to be able to filter by company for free rather than having to upgrade.
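For contrast, here is a rough sketch of a metric-style question, month-over-month revenue growth, again using SQLite from Python. The orders table, its columns, the sample rows, and the month_over_month helper are all hypothetical names I made up for illustration. The sketch assumes every month has at least one order; a month with no rows at all would need to be filled in before the comparison makes sense.

```python
import sqlite3

# Hypothetical orders table with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_date TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('2023-01-05', 100.0), ('2023-01-20', 100.0),
        ('2023-02-11', 250.0),
        ('2023-03-02', 0.0),
        ('2023-04-09', 50.0);
""")

# Step 1: create the metric, total revenue per calendar month.
monthly = conn.execute("""
    SELECT strftime('%Y-%m', order_date) AS month, SUM(amount) AS revenue
    FROM orders
    GROUP BY month
    ORDER BY month
""").fetchall()


def month_over_month(rows):
    """Return (month, growth %) pairs; growth is None when it is undefined."""
    growth = []
    previous = None
    for month, revenue in rows:
        # Edge cases: the first month has no baseline, and a zero baseline
        # would mean dividing by zero, so both are reported as None.
        if previous is None or previous == 0:
            pct = None
        else:
            pct = round((revenue - previous) / previous * 100, 1)
        growth.append((month, pct))
        previous = revenue
    return growth


mom_growth = month_over_month(monthly)
print(mom_growth)
```

The SQL itself is short. The interview conversation is in the choices around it: how to treat the first month, a zero-revenue month, or a month with no data.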
Alternative Resources to LeetCode: A Data Science SQL Resource Specifically Designed for Data Scientists
One alternative to LeetCode that was built specifically for data scientists is StrataScratch. StrataScratch was designed to offer data science interview questions that test not only the technical implementation of SQL but also data science concepts like handling edge cases, creating metrics, and manipulating data. It’s important to understand how to manipulate the data, but it’s just as important to implement solutions that account for real-life scenarios and edge cases.
StrataScratch has over 500 data science questions from real companies to help you prepare for the coding portion of your data science interview.
StrataScratch was created because data scientists need questions that reflect the concepts covered in data science rather than those covered in software engineering, which is what LeetCode focuses on.
Helpful Features For Your Data Science Interviews
As mentioned, the content of StrataScratch (i.e., the questions) is specific to data science, but the features are similar to those of many modern platforms like LeetCode and AlgoExpert. Here are some of the features you’ll find helpful when preparing for your next data science interview.
Real Data Science Interview Questions From Real Companies (Filter By Company)
The ability to filter by company is more critical for data scientists than for software engineers. This is because the types of data are drastically different from company to company and industry to industry. You won’t be working with the same types of data at Facebook as you would at Salesforce, so if you were interviewing at Facebook, wouldn’t you want to prepare with relevant data science questions? Even the type of data science questions Airbnb would ask is drastically different from the type Google would ask, despite both being in tech. Being able to parse through these differences is critical to preparing well for your next data science coding interview.
Modern times call for modern solutions: the trend for this generation is to learn via video. Video is easier to consume than text but much more time-consuming to create. StrataScratch has over 50 video solutions on their platform to help you prepare for your data science interviews.
Python and SQL For Each Data Science Question
A data scientist has many tools in their tool belt. SQL is almost always required, but many data scientists are more comfortable with Python. Most data science interviews allow you to use any language you want so as to give you the best chance to showcase your skills. StrataScratch allows you to solve their data science questions in both SQL and Python, with solutions for both coding languages.
Other Similar Features To LeetCode
LeetCode has a great feature-rich platform that helps software engineers prepare for interviews. As such, it’s not surprising that StrataScratch offers similar features to help data scientists prepare for data science interviews.
The ability to explore other users’ solutions and learn alternative approaches is a major advantage for succeeding in data science interviews.
The ability to also discuss and ask data science questions about each interview question and your code with the StrataScratch community is another big benefit. You’ll learn faster with other people by your side than by yourself. Not only do you get help from the user community but also from the StrataScratch team.
LeetCode is a great platform to help software engineers prepare for interviews. They have over 1,800 algorithm-type questions to help you improve your technical skills. However, LeetCode is not focused on data scientists and their needs. This is apparent from it having only around 150 database-type questions. Do you believe LeetCode is for data science interview preparation? I used LeetCode many times to prepare for my data science interviews but really only used it at the start of preparation. These days there are many other resources out there that focus on data science.