The Best Way to Learn SQL for Data Science
You don’t know how to learn SQL? No more excuses! This is a detailed guide to the best ways of learning SQL.
We assume you heard about SQL. If nothing else, from people telling you how great it is and how you should learn it. This is very much true; SQL is really great. What is usually missing is the answer to why and, most importantly, how to learn SQL. This is what we’ll answer in this guide.
But first comes the what.
What is SQL?
SQL is short for a Structured Query Language, a programming language created for working with relational databases. In other words, storing, changing, and retrieving data.
It was developed by Raymond Boyce and Donald Chamberlain of IBM in the 1970s. It’s been a standardized programming language since 1986, when ANSI and ISO adopted the standard language definition.
While there’s a standard SQL, virtually every Relational Database Management System (RDBMS) has its own version (or dialect) of SQL. This means most of the standard SQL options are usually there in every RDBMS. However, sometimes dialects don’t allow all the standard SQL features; they offer variation or, even give possibilities that are not there in the standard SQL.
As of May 2022, the four most popular database engines and SQL dialects are:
- Microsoft SQL Server
What Is SQL Used For?
As a language specifically made for communicating with databases, that’s exactly what it’s used for in data science. But what does that mean in practice?It means you can use it for creating and modifying database objects. With SQL, you can create databases and tables, define their columns and data types and populate them with data. SQL also allows you to modify and delete tables and data within.
SQL has commands for managing access to databases and data stored in them.
But the most important features of SQL are those for fetching data from databases, manipulating it, performing calculations, and preparing reports. Data is what data science revolves around. And data is still, most of the time, stored in relational databases. Even if you do everything else in other programming languages, SQL is the best tool for extracting and manipulating data. And because of that, knowing SQL is a prerequisite for any data scientist.
How to Learn SQL
The best way to learn SQL depends on your learning preferences, how much time you have on hand, how familiar you are with SQL, and countless other factors. It’s impossible for us to find the best way for every individual; that’s up to you!
But we will help you to decide. The main question you have to answer is, are you already working with SQL (at least to some extent) or want to work at your current job? Or do you want to learn SQL to get a job that requires SQL skills?
Why are the answers to these questions important? If you want to learn SQL to make your job easier (hint: move from doing everything in Excel) or simply feel you need to improve your existing SQL skills to be better at your job, that means you’re already working with actual data. In other words, you have data to practice on.
If you want to learn SQL to find a better job, you most likely don’t have access to actual data and databases, so you have to compensate for that somehow. You will need learning resources that focus on you writing a code.
Based on that, the choice of the best ways of learning SQL depends on the amount of practice they offer.
For Those Who Already Work With Real Data
The video courses are the closest you can get to an actual lecture. Every respectable video course is made by experts in SQL, in which they explain the theoretical background of the technical concepts and show you how to translate them into the SQL code. You’ll usually have to follow the lecturer by writing the code as they do. Very often, the video courses give you exercises after every lesson, but these courses are more oriented towards explaining, not practicing.
We’ll give you a short description of three resources for SQL video courses and links to some courses they offer. It’s only a starting point. Feel free to browse those sites and find a course that will suit you best.
One of the most popular platforms offering video courses is Udemy. They are also a rich source of SQL courses, ranging from boot camps and separate courses of different levels of difficulty to covering specific SQL topics, uses, and dialects. Currently, there are 8,426 courses on SQL, with 537 of them being free.
The top three most popular courses for beginners are:
One of the great resources for SQL video courses is Coursera. Their courses are most often provided by USA universities or big IT companies. The 1,450 courses, projects, specializations, even professional certificates, degrees, and university certificates; it’s all there.
The most popular courses covering SQL are:
While they don’t offer as many courses as Udemy or Coursera, edX has very detailed and quality courses provided by the distinct US universities and IT companies. Aside from courses, they also offer programs that will earn you a professional certificate or a degree.
They are more focused on some specific topics, often on an advanced level. Some of the individual courses that might get you started are:
One plus of the Youtube videos is that they’re free. If something is free, it doesn’t mean it lacks quality. On the contrary, there are some excellent and expertly done tutorials or the whole channels dedicated to SQL, databases, and other related topics. The videos can range from several-hour tutorial videos to videos explaining a specific technical topic or problem and how to approach it. These shorter videos focusing on one concept can be handy for anybody who gets stuck and wants to refresh their memory on a specific topic quickly. You can see the Youtube videos almost as video cheat sheets.
freeCodeCamp.org is a non-profit organization helping people learn to code for free. Their Youtube channel is a rich resource for anyone who wants to learn to code. You can browse a number of their SQL videos. Some of those might be of interest to you.
StrataScratch’s own Youtube channel focuses on giving you a broader knowledge of data science and the position of SQL within it. The approach is different in the sense that videos usually focus on a specific technical topic and explain it by writing a solution to the SQL interview question. There are also videos giving you more general tips, such as on structuring your SQL code or how to avoid typical coding mistakes. Here are some videos you can start with:
1. Solving an Airbnb Data Science Coding Interview Question | SQL Interview [Fav Host Nationality]
2. SQL Case Statements For Data Science Interviews in 2021
3. Most Common Coding Mistakes on Data Science Interviews
Alex the Analyst
Alex Freberg’s channel is also full of helpful videos. It covers both technical and non-technical topics of interest for data analysts. The videos are neatly sorted by topic. You can listen to Alex talk about portfolio projects, interview questions, and various career advice. Or you can go more technical and watch his SQL tutorial created for different levels of knowledge. The videos are usually below 10 minutes, so learning in smaller steps is easier.
The software engineer that goes by the name Nelson has a Youtube channel sharing career advice and helpful SQL tutorial videos. He teaches SQL on a PostgreSQL platform. There’s a tutorial for beginners and shorter PostgreSQL videos combined into a complete course playlist.
Some people prefer to read about how to do something and then do it themselves, rather than watching a video of someone showing it to them. When reading blogs, it’s also easy to skim through the text and go directly to the point of your interest. However, good articles usually give you a broader perspective and explain something you didn’t even know you needed to have explained. Because of that, it’s advisable to read the articles in their completeness.
In our blog, we strive to cover all the topics important to data scientists, and it's easy to browse through these topics. Under the SQL sections, you can find guides on SQL concepts and articles showing you the steps of answering a specific coding interview question, which is an excellent way to absorb new information.
Here are some suggestions, so you get a feeling of what this blog is all about.
Their blog is abundant with articles that explain all the essential SQL concepts and tricks in shorter and longer versions, ranging from basic to advanced knowledge. They also have an SQL cookbook with concise how-tos for all the most popular SQL dialects.
To give you a feel of their approach, here are some article samples:
If you ever Googled something related to SQL, the chance is one of the first results was GeeksforGeeks. Their tutorials are a great reference point when wanting to know how certain SQL command works. The explanations are always concise, supported by the SQL syntax and a description of how it works in an example. No point in linking several articles, so we can only advise you to have a look at all the SQL topics they covered.
The concept is similar to the GeeksforGeeks one. The main difference is that W3Schools also makes it possible to run the code by yourself and browse through the output. Again, we will simply leave you with a link where you can easily find SQL topic that interests you.
Official Database Documentation
An essential source of SQL knowledge is the official documentation of a database (or SQL dialect) you’re working with. The explanations can sometimes be dry, so maybe they are better for users already familiar with SQL syntax and above the basic SQL level. You should especially consult this documentation when you’re not that interested in general SQL concepts but in how a specific operation can be written in this distinct SQL dialect. Remember what we said: while there are similarities between the dialects, not all the features are available in all of them, and if they are, a different keyword or different assumptions has to be used to achieve the same result.
You would expect that people who made the database know best what it can do. Especially when there are some updates, the official documentation should be a go-to source.
The documentation for all the latest versions:
If no courses, videos, or articles explain exactly what you want to know, if you’re stuck, the only solution is to ask somebody. Not that asking questions should always come last; sometimes it’s even better not to lose time but go with your problem directly to someone who can help you.
Luckily, there is a large online community of SQL users with various levels of knowledge, and they are often very willing and able to help. If they didn’t already help somebody with the same problem as you, you could always post a question. Also, certain websites allow you to get a one-on-one session with the SQL expert for a certain fee to help you with the specific SQL problem.
For Those Who Don’t Work With Real Data (Yet)
Not working with data doesn’t mean the resources mentioned previously are useless to you. No, the thing is, they usually don’t give plenty of opportunities to practice what you learned. Since you don’t work with data, you don’t have the chance to practice what you learned on actual data.
You can always create your own data or use platforms that both explain SQL and are interactive so that you can practice coding.
Make Your Own Data
Some of the possibilities are that you create tables and enter data manually in SQL using the CREATE TABLE and INSERT INTO statement
A slightly faster way to do it is to use Excel or Google Sheets. You write the values you need, and the numeric values can be generated using the RANDBETWEEN() function. Then you can upload it to the database and practice writing code on it.
The thing with making your own data is you usually need to have some experience to know what data will best showcase the concept you want to learn and practice.
Usually, it’s better to look for interactive courses and learning sites.
Interactive learning combines the best from the theoretical and practical world. The theory is there, but not in the suffocating quantities. Detailed explanations are done through the codes you can run and see how it works. Most importantly, you need to write your own code and show that you learned something. After every lesson, there is a number of exercises you need to solve by writing the correct code before going to the next section.
We already mentioned their blog, but LearnSQL.com’s core business is SQL education via interactive courses. There are individual courses and whole tracks that give you a comprehensive and structured path to learning. They also have tracks and courses made only for practicing SQL.There are currently 80 learning and practicing tracks and courses, with the following three being the most popular:
DataCamp, too, offers interactive SQL courses and some other data science courses that might be of interest to you. There are 27 SQL courses and seven guided projects where you can practice what you learned.
The three most popular courses here are:
At Codecademy, you can as well learn much more than just SQL. Basically, every coding language and any other skill used in data science. But let’s stick to SQL. With 15 courses and career paths, this is also a valuable source for learning and practicing SQL.
Some suggestions that’ll probably be interesting for you:
Solving Interview Questions
One of the best ways to learn SQL is to solve as many interview questions as possible from actual companies. Not that you’ll only master SQL concepts without worrying about creating data for practicing. You’ll also be solving real interview questions, so when you get the opportunity for a data science job interview, the chance is you’ll get the same or very similar interview questions like the ones you already practiced.
StrataScratch boasts more than 1,000 real interview questions from the biggest tech companies such as Facebook/Meta, Amazon, Apple, Google, Netflix, Airbnb, LinkedIn, Uber, Amazon; you name it. The coding questions can be filtered according to several categories, and there are also curated SQL topics and topic families.
All the interview questions can be solved in PostgreSQL or MySQL, making it easier to suit your needs.
Every question allows you to inspect the dataset, write your own code, check it, compare it with other users, and discuss it in a thread. The increasing number of questions also has a video solution.
LeetCode is one of the most popular platforms for software development, where you can learn and test your knowledge. They also have a section for learning SQL and answering the actual interview questions by writing the SQL code. The databases they support are MySQL, MS SQL Server, and Oracle. As with StrataScratch, you can write the code and discuss it with other users. The official solution is always there and explained.
The only downside is their SQL questions are not so much data analytics focused. But if you want to learn and practice the SQL syntax, it’s a very useful resource.
SQLPad also gives you the actual interview questions, albeit in a much smaller quantity: 207. Every coding question is accompanied by a video lecture covering the concept(s) asked by the particular question. Here, you can also inspect data sets and write code in PostgreSQL and MySQL.
Another popular resource for practicing SQL and solving interview questions is HackerRank. They don’t offer real interview questions, but there are plenty of challenges that mimic those questions and help you make your coding fingers swifter.
How Long Does It Take To Learn SQL?
Anything between a few hours to eternity.
Taking the length of the video and interactive courses as a reference, learning basic SQL will take you somewhere between a couple of days to five weeks. It, of course, depends on how many hours daily or weekly you dedicate to learning SQL. The learning speed also depends on whether you already have database and coding experience.
Generally, it’s safe to say that you’ll be able to write basic SQL queries in several weeks. Learning something is one thing, but making your knowledge stick is another thing. That’s why experience is of crucial significance. All the tutorials, videos, and articles are only there to give you a basis. Real learning starts when you start applying what you learned in real business situations, when you meet with data and databases imperfections, and when you’re faced with a problem you have to solve.
The great thing about that is this real-time environment allows you to f*** up! While learning from others' mistakes is great, learning from your own mistakes is even more efficient. We’re not even joking here! You’ll for sure make mistakes and, considering they are not fatal, they are the most incredible opportunity to learn. It’s a lesson that sticks: it reminds you what not to do again and how to avoid the mistake next time. That is experience: the name we give to our mistakes, as Oscar Wilde would say it.
For most people, several weeks of basic SQL learning and a month or two of practice will be enough. After that, they’ll for sure be able to select some columns from the database, filter data comfortably, and maybe even do some simple calculations.
But what after that? The good thing is knowing basic SQL unlocks more advanced topics, and they, due to your experience, become easier to learn on your own. However, you’ll never need to use everything SQL does, so there’ll always be something you don’t know, and the learning never stops.
Learning SQL is a continuous activity in which you’ll never be ‘there’. People have used SQL for years and decades. Even they have to accept they don’t know everything. Don’t worry if only in a few seconds or minutes apart, you’ll be able to feel like a total SQL failure and then the master of the SQL universe. That’s normal.
MySQL vs. PostgreSQL: Which Should You Learn?
If you were paying attention closely, you could’ve noticed all the resources we listed here most often concentrate on teaching PostgreSQL or MySQL, sometimes even both.
The reason is that these two are among the four most popular databases. But MS SQL Server and Oracle are, too. Why not teach them? The reason is both PostgreSQL and MySQL are open source databases, while the other two are not.
This makes it easier for users to transition from learning to using SQL in practice. There’s no need to pay for a database to be able to use it and its SQL dialect.
Being free is not the only good thing about PostgreSQL and MySQL. They are both top-notch databases used by major companies such as Apple, Skype, Uber, Netflix, Instagram, Airbnb, and NASA.
Which one should you choose between two free popular SQL dialects: PostgreSQL or MySQL?
Generally, MySQL is better for beginners who want to learn fast and will mainly work with read-only queries.
For more heavy-duty read and write queries, PostgreSQL is better. However, PostgreSQL has a steeper learning curve.
We generally recommend learning PostgreSQL for data science. There are reasons for that, which you can find all about in our comparison between PostgreSQL and MySQL on several topics important to data scientists.
The Web is a magnificent tool—the one we too rarely use to its fullest. There’s nothing wrong with watching videos of cats being cute or people being stupid. But sometimes it’s good to use the time and possibilities we have to learn new skills or improve the existing ones.
Learning SQL is one of the decisions that can have a long–term beneficial impact on your career and everything that goes with it. It could be several hours or weeks until you start writing functional SQL code independently.
Various resources, such as video and interactive courses, Youtube tutorials, blogs, and supporting the community; you can find all on the Internet. And plenty of those resources are available for free.
The same is true for PostgreSQL and MySQL, two of the most popular SQL databases.
We did our best to provide you with good resources for learning SQL. Now it’s up to you! You couldn’t say you didn’t know where to start.