Data Literacy - 10 Things You Should Know
What is data literacy? What should you know to become data literate? We are about to answer these two questions, which might have been bugging you for a while.
You like reading, right? Well, you wouldn’t be reading this article if you didn’t. OK, you’re right: reading this article doesn’t mean you love reading. But it does mean you are, at least partially, literate.
The ‘problem’ with reading is once you learn to read, you can’t unlearn it. Not only that, but you can’t willingly not read once your eyes meet the letters. You think you can? Let’s do a little experiment, shall we?
Take a look at the image below, and…don’t read what it says! Ah, too late; you already did.
Well, I’m sorry. You failing the experiment simply proved that your brain can’t decide to turn your reading skill off when your eyes see letters. And there you go, you read something before you even think of not reading it.
Reading is only one part of literacy, the second one being writing.
Is it the same with data literacy?
What is Data Literacy?
Data literacy is, as Wikipedia says, the ability to read, understand, create, and communicate data as information.
As with literacy in general, data literacy also involves reading and writing. In this case, it's reading data and creating it. This means you should be able to talk about data, use analytics programs, and interpret the outputs of such analyses.
What Makes Data Literacy?
You’ve probably heard the term mentioned, both by people who know what data literacy is and by those who don’t. To ensure that you belong to the first camp, I’ll talk about ten things that data literacy encompasses.
What you need to know about data literacy can be divided into three blocks:
- Storing Data
- Handling Data
- Presenting Data
While the extent to which you might need the 10 points I’ll talk about depends on your job, you should have at least some basic knowledge about them to achieve data literacy.
Here I’m not talking about technical specifics of the tools for storing data. This is more of a conceptual level, a necessary first step in becoming data literate.
1. Data is Imperfect
Before even getting into details about data, the first thing you should know (and always remember!) is that data is imperfect.
When I say imperfect, there are two things I have in mind. The first is that there’s no perfect way to store data. Companies grow, change their approach to business, and evolve according to the technical innovations available to them. This is not a one-way road because business changes also drive technology change. Given that constant change and the fact that stored data has to satisfy the needs of many different users, you can’t expect data to be tailored exactly to your needs.
Take relational databases, one of the most common ways of storing data. Their nature forces data to be as segmented as possible, which leads to it being stored in multiple tables. Maybe it would be ideal for you if those several columns you use every day were simply stored in one table. However, data is not just for you, and you somehow need to overcome this inconvenience. How? That’s what we’ll talk about later.
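As a small preview, one common way around this inconvenience is a join, which pulls columns from several tables back into one result. Here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables, their columns, and the values in them are all made up for illustration.

```python
import sqlite3

# Hypothetical schema: customer details and orders live in separate tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ana', 'Zagreb'), (2, 'Ben', 'Split');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5);
""")

# The join brings the columns you need every day back into a single result.
rows = conn.execute("""
    SELECT c.name, c.city, SUM(o.amount) AS total_spent
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name, c.city
    ORDER BY c.id
""").fetchall()
conn.close()

for name, city, total_spent in rows:
    print(name, city, total_spent)
```

The data stays segmented in storage. You only combine it at the moment you ask the question.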
The second aspect of data imperfection refers to the imperfection of data itself. Data is always, without exception, full of errors and inconsistencies. If you haven’t noticed them, it doesn’t mean they are not there. That’s something you should always have in mind. So if you’re looking for data to give you a 100% correct answer, you’ll never get that. Don’t be the type of boss who asks employees whether the data they provided is 100% accurate. It’s not, and it’ll never be.
The point of data is to give you the most reasonably correct answer you can get. How do you achieve that? That’s our second point.
2. Know What the Data Quality is For
Getting reliable information from your data depends on the data quality. While data is not perfect, you should always strive to make it as error-free as possible.
You should also know how data quality (or lack thereof) reflects on data. One way is data inconsistency. For example, the addresses of your customers can come from different sources, and you could end up with one table showing one address for a customer while another table shows a different one.
It could also mean that your data is incomplete. You could have the addresses of your customers, but without the city for some of them.
Data accuracy doesn’t refer to having or not having data, but whether the data you have is actually correct. Do you have the correct addresses of your customers, or are some of these addresses not valid anymore?
The data also has to be precise. Do you round your recorded sales, or are you storing only dates without time? You could be missing a lot.
Finally, your data could simply be missing, or it could be unknown. Well, suppose you don’t have your customers’ contacts or don’t record their previous purchases. In that case, there’s no way you can try to make special offers to them, retain them deliberately, or provide them with any information that could be useful.
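Two of these checks, completeness and consistency, are easy to sketch in a few lines of Python. The two sources (a CRM and a billing system), the field names, and the records below are hypothetical; real checks depend on how your own data is laid out.

```python
# Hypothetical customer addresses coming from two different sources.
crm = {
    1: {"street": "Main St 1", "city": "Boston"},
    2: {"street": "Oak Ave 7", "city": None},     # incomplete: city is missing
    3: {"street": "Pine Rd 3", "city": "Denver"},
}
billing = {
    1: {"street": "Main St 1", "city": "Boston"},
    3: {"street": "Elm St 9", "city": "Denver"},  # differs from the CRM record
}

# Completeness: which customers lack a required field?
incomplete = sorted(cid for cid, rec in crm.items() if not rec["city"])

# Consistency: which customers have different addresses in the two sources?
shared_ids = crm.keys() & billing.keys()
inconsistent = sorted(cid for cid in shared_ids if crm[cid] != billing[cid])

print("Missing city:", incomplete)
print("Conflicting addresses:", inconsistent)
```

Note what this does not tell you: which of the two conflicting addresses is the accurate one. Accuracy usually has to be checked against the real world, not against another table.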
3. Data Types
Those who are data literate know that there are many kinds of data stored, and that each kind often requires a specific data type.
Choosing the correct data type makes data suitable for use. You should be aware that there are numeric values, and that one numeric data type doesn’t suit every case. For example, if you’re selling books, you could store the number of books sold as a whole number. But if you’re selling raw materials, maybe a decimal number would be better, so you can record the weight of the material sold. Knowing that you can choose between many types of numerical data and the levels of precision (e.g., how many decimal places you want to see) is vital for data literacy.
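To make the difference concrete, here is a minimal Python sketch of the three numeric choices mentioned above: whole numbers, binary floating-point numbers, and exact decimals with a chosen number of decimal places. The quantities are invented for illustration.

```python
from decimal import Decimal

# Whole units, such as books sold: an integer is enough.
books_sold = 3 + 4

# Binary floating point cannot represent most decimal fractions exactly,
# so small errors can creep into sums.
float_total = 0.1 + 0.2

# Decimal stores the decimal digits exactly.
decimal_total = Decimal("0.1") + Decimal("0.2")

# Precision is a choice: keep the weight to two decimal places.
weight_kg = Decimal("12.3456").quantize(Decimal("0.01"))

print(books_sold)
print(float_total == 0.3)               # False
print(decimal_total == Decimal("0.3"))  # True
print(weight_kg)                        # 12.35
```

Databases offer the same kind of choice under their own type names, which is why the column type matters when you later sum up or compare values.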
Some dates could be stored only as dates because you don’t need any other info than that. What if you’re recording the log-in times to your app? You might want to see the time. But how precise? Should it be only hours, or hours and minutes, maybe even seconds or milliseconds?
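The same log-in moment can be kept at several levels of precision. A short sketch with Python's datetime module, using a made-up timestamp:

```python
from datetime import datetime

# Hypothetical log-in moment, recorded down to the microsecond.
login = datetime(2024, 3, 15, 9, 41, 27, 123456)

date_only = login.date()                            # just the day
to_minute = login.replace(second=0, microsecond=0)  # hours and minutes
to_second = login.replace(microsecond=0)            # down to seconds
to_millis = login.isoformat(timespec="milliseconds")

print(date_only)    # 2024-03-15
print(to_minute)    # 2024-03-15 09:41:00
print(to_second)    # 2024-03-15 09:41:27
print(to_millis)    # 2024-03-15T09:41:27.123
```

Whatever you drop at storage time is gone for good. You can always go from milliseconds to a date, but never the other way round.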
Text data is also stored in your database. Depending on the text you want to keep, you should know that you can define which types of characters are allowed and how many characters can be stored.
While knowing these data types in detail might or might not be necessary for your job, at least be aware that data is not just lumped together, regardless of its format.
The next step in data literacy is knowing how data can be handled. That means not only knowing the theory but also applying certain skills to do it.
Handling data means creating, manipulating, and analyzing it. Three skill types make you able to do that.
4. Technical Skills
These technical skills include choosing the proper technique to analyze data. But first, you have to know what type of analysis you want to perform. Is it descriptive or diagnostic analysis? Maybe even predictive or prescriptive?
Descriptive analysis looks at historical data and describes what happened. Diagnostic analysis looks at the same data, but it tries to answer the question, “why did it happen?”.
As you could assume, predictive and prescriptive analyses look into the future. Predictive analysis uses historical data and machine learning to answer what will happen in the future. The next level after predictive analysis is prescriptive analysis. You used predictive analysis to know what would happen. Prescriptive analysis tries to answer “what should you do?” to avoid or benefit from what will happen.
To perform data analysis, you should then use the right statistical technique. The options are:
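The difference between describing and predicting is easy to show on a toy example. In the sketch below, the monthly sales figures are invented, and a simple least-squares straight line stands in for the machine learning a real predictive analysis might use.

```python
from statistics import mean

# Hypothetical sales for six consecutive months.
monthly_sales = [100, 110, 120, 130, 140, 150]

# Descriptive: summarize what already happened.
total = sum(monthly_sales)
average = mean(monthly_sales)

# Predictive (very simplified): fit a straight line through the history
# with ordinary least squares and extend it one month ahead.
months = list(range(len(monthly_sales)))
x_bar, y_bar = mean(months), mean(monthly_sales)
slope = (
    sum((x - x_bar) * (y - y_bar) for x, y in zip(months, monthly_sales))
    / sum((x - x_bar) ** 2 for x in months)
)
intercept = y_bar - slope * x_bar
forecast_next = intercept + slope * len(monthly_sales)

print("Total:", total)
print("Average:", average)
print("Forecast for next month:", forecast_next)
```

Diagnostic and prescriptive analysis don't fit in a snippet this small: the first needs data about possible causes, and the second needs a set of actions to choose from.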
- Descriptive Statistical Analysis
- Inferential Statistical Analysis
- Predictive Statistical Analysis
- Prescriptive Statistical Analysis
- Exploratory Data Analysis
- Causal Analysis
- Mechanistic Analysis
5. Choosing and Using Suitable Tools
It’s highly unlikely you’ll be doing your analysis using only a pen and a piece of paper. No, you’ll have to choose the most suitable tool to achieve what you want. To make the right decision, you have to know which tools are good for what. And, of course, you should know how to use them.
Should you use only one tool or combine several? Sometimes working with a tool doesn’t require coding knowledge, but it still requires knowing what the tool can do and having experience in using it. Such tools are Excel or Google Sheets, Power BI, Tableau, and other BI tools.
But more often than not, tools will require knowledge of at least one programming language. Some of the most popular ones are:
6. Business Skills
While technical and tools knowledge is essential, those are not standalone skills. They need to be combined with business knowledge. Otherwise, you might not understand what business problem you’re trying to solve. And if you don’t understand the question, you will provide the correct answer only by chance.
Business knowledge includes understanding your company and its products, knowing what different departments do, what the business processes look like, and which challenges every department and the company are facing. This also means you need to understand the company’s industry itself and its position compared to its competitors.
For instance, if you want to improve sales, you need to know what the market is, what the competitors are doing that you could do better or differently, who your customers are, and what could attract them to your business. You see, data doesn’t exist by itself and for itself. It’s a business tool that should be used as such.
While the presentation isn’t everything, it for sure is something. Data is seldom presented in a raw form. What’s the use of that? I mentioned that in business, data is just a tool. For a tool to be usable (meaning: making business decisions based on it), the way data and its insights are presented should be eye-friendly.
7. Visual Skills
Visual skills include both creating and reading data presented in an understandable way.
If you’re the one performing the analysis, you should have a certain feel for how to show data visually so that those who read it understand it as easily as possible. Yes, this includes having a slight sense of space and colors. That will be useful when you show your insights in the form of nice little tables, highlights, and charts.
On the other hand, data presented in such a way won’t mean anything to those who can’t read it. This is where data literacy kicks in again. What’s the point of having a dashboard with charts, trendlines, means, medians, deltas, distributions, and whatnot if you don’t even know what they are and how they translate to the business?
8. Social Skills
The social skills boil down to being aware that other people have different experiences than you, different knowledge, and different levels and types of expertise.
This can have a two-fold effect. First, your way of thinking and understanding is not the only (right) one. Second, knowing what makes others tick can make you better at what you do.
How is that connected to data literacy? More than you think! If you’re a data scientist, your social skills should make you adaptable. It seems obvious that you can’t present your findings the same way to your data science colleagues, to people from sales, and to the management board. Trust me, though: many fail to show that understanding in practice.
It goes the other way round, too. Maybe you’re not that technical, but your level of data literacy and your social skills will help you understand how those working with data think. And if you do, your requests to them can be much more precise and straightforward; you can understand their suggestions and concerns. In turn, it will make you work together much more efficiently. Gone will be all those endless emails when you request one thing, get something different, then try to explain that you meant something else, and on, and on.
9. There’s No One True Answer
The presented data can be interpreted in several ways. The interpretation can depend on the data itself. For example, if you’re analyzing sales and have data only for the last two years, you can come to one conclusion. If you had a five-year sales history, you could come to different conclusions, even completely opposite ones.
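Here is a tiny illustration of that effect. The yearly sales figures are made up, and percent_change is a hypothetical helper written just for this example.

```python
# Hypothetical yearly sales figures.
yearly_sales = {2019: 200, 2020: 180, 2021: 150, 2022: 160, 2023: 170}


def percent_change(sales, start_year, end_year):
    """Return the percentage change in sales between two years."""
    start, end = sales[start_year], sales[end_year]
    return (end - start) / start * 100


# Looking only at the last two years, sales are growing...
two_year_view = percent_change(yearly_sales, 2022, 2023)

# ...but over the full five-year history, they are still down.
five_year_view = percent_change(yearly_sales, 2019, 2023)

print(f"Last two years: {two_year_view:+.2f}%")
print(f"Five-year history: {five_year_view:+.2f}%")
```

Both numbers are correct. They simply answer different questions, which is why it matters to know which window of data you are looking at.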
Even if you have the most detailed data, it could be interpreted differently. For instance, if you had a 20% revenue drop and you’re in hospitality, it could be seen as a disaster. But if you know that the same year saw the COVID-19 pandemic and record oil prices, that the average revenue drop for the industry was 70%, that three of your five direct competitors closed their businesses, and that you didn’t lay off any of your employees, a 20% drop in revenue could be seen as a success.
10. Critical Thinking
With all the technology available, improving data quality, the sheer amount of data, and the ways of analyzing it, it’s easy to forget where we started: data is imperfect!
While critical thinking should be involved in all aspects of data literacy, it’s especially important when making decisions based on data. There’s a tendency to champion data-driven decisions, which can lead to relying only on data when making decisions. While making decisions based on a hunch is hit or miss, making them solely on data while ignoring reason and critical thinking is a data-driven road to disaster. You might be better off if you had no data at all.
In that way, data literacy is the same as general literacy. The interpretation of what you read, what you learn from it, and how it changes what you do depends on you. Making decisions based on data, without thinking, would be like reading Dostoevsky’s “Crime and Punishment” and thinking it's best you buy an ax and…well, you know how this ends!
From these 10 talking points, you saw that data literacy is not much different from general literacy. It too includes reading and writing, but with regard to data.
Being data literate includes a general understanding of how data is stored. Once you know that, you should build technical and other skills to handle data. On top of that, you need to know how to present and read data.
You don’t need expert-level knowledge in all that. The extent to which you’ll need it depends on your interests and job specifics. However, at least a basic understanding of all these 10 points is crucial if you want to call yourself data literate.