There are several types of data in statistics that we need to understand. This matters because, before choosing a method or running an analysis, we need to determine what kind of data we have in order to get a better result in the end.
Before we get started, I would like you to recall the definition of data.
Definition of data
Data is a collection of information obtained in various ways. It can serve as evidence of an event that can be verified by the scientific method. It is not just an assumption, something imaginary, or an opinion. Data cannot be produced by guesswork alone.
Data is powerful and difficult for anyone to reject. For data to be processed, we need a systematic and structured approach that produces understandable data.
To make this easier, here is an illustration of how we use data:
As the owner of a company, a director needs to carry out an annual evaluation of employee performance. The goal is to reward the best-performing employee, commonly called the “Employee of the Year”.
Based on several criteria, the director shortlists the three best candidates: Alex, Sandra, and Dennis.
Alex is a humble and polite person. He is always kind to everyone, even the cleaning staff. He always smiles when he greets people. Everybody loves him. But, according to the administrative record, Alex was once disciplined by his supervisor for being drunk.
In some respects it is not a big deal, but as an employee he needs to know how to behave in the office, especially during office hours.
Sandra is another candidate. She is a disciplined employee who has never been late for work. Sometimes she works overtime without being told. Her work is always excellent.
She always stands out and is full of creative ideas. But the administrative record shows that she once sent the wrong letter and caused a loss to the company. The amount was not large, but it remains on her career record. Because of this, the director did not choose her.
Dennis is the last candidate. At a glance, Dennis looks like an average employee. But he does his job perfectly. He does not talk much, but he always offers ideas when everybody else is stuck.
He always arrives on time and finishes on time with excellent results. He has no significant issues on his record, even if the director sometimes overlooked him.
Finally, after much consideration, the director chooses Dennis as Employee of the Year. Many people are surprised by this decision. Some even protest. But the data show that Dennis came out ahead on all of the criteria and deserves the title.
That is how data works and benefits people. It helps us decide which option is best.
A decision based on data is hard to dispute. That is how regulations and policies should be made: not only by instinct and opinion, but also with solid data.
Why should we know the types of data in statistics?
1. To identify problems correctly
Knowing the type of data helps us understand it and choose the proper treatment, including the right data collection technique, so that we can design the right approach and find a solution to the problem.
2. To identify the right analysis tools
Every type of data has its own characteristics. These differences mean that each type calls for particular methods for particular purposes. By knowing the type of data, we can pick the right analysis tools to solve a problem or case.
3. To produce stronger results
Of course, the purpose of understanding and analyzing data is to reach a good result and solution. But that cannot be done if we do not know what type of data we have. We always want an effective, accurate, and efficient result, right?
Types of data in Statistics
Data by its nature
Overall, by its nature, there are two kinds of data:
1. Qualitative Data
Qualitative data is data presented not in the form of numbers but in words or categories.
For example, data about consumer satisfaction might contain three categories: very satisfied, satisfied, and not satisfied. Qualitative data is often used in social or psychological analysis.
2. Quantitative Data
Quantitative data is data that has a numeric value or amount. In short, it is something that can be counted or measured. Examples include weight, height, household income, and expenditure.
Every quantitative observation has a value, so we can assess it objectively rather than by opinion. Many fields use quantitative data as part of a quantitative approach.
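To make the contrast concrete, here is a minimal Python sketch. The survey answers and weights are made-up values for illustration only. It shows that categories can be counted but not averaged, while numeric values support arithmetic.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey answers (qualitative: categories, not numbers)
satisfaction = [
    "very satisfied", "satisfied", "not satisfied",
    "satisfied", "very satisfied", "satisfied",
]

# Hypothetical body weights in kg (quantitative: measured values)
weights_kg = [62.5, 70.0, 80.0, 55.5]

# Categories can be counted, but averaging them makes no sense
category_counts = Counter(satisfaction)
most_common_answer = category_counts.most_common(1)[0][0]

# Numeric values support arithmetic such as the mean
average_weight = mean(weights_kg)

print(category_counts)
print(most_common_answer, average_weight)
```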
Data based on scale of measurement
Based on the scale of measurement, there are four types of data in statistics:
1. Nominal data
Nominal data is the lowest level of measurement and the easiest to understand. It is the simplest kind of data: the values are only labels, with no order or magnitude. For example, suppose we use gender as a research variable and code male as 1 and female as 2. The codes carry no rank or level; the categories are equal.
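A short Python sketch of the gender coding above, using made-up observations. The names `GENDER_LABELS` and `coded_gender` are only illustrative. Because the codes are labels, the most frequent category (the mode) is meaningful, while an average of the codes is not.

```python
from statistics import mean, mode

# Coding scheme from the example: 1 = male, 2 = female
GENDER_LABELS = {1: "male", 2: "female"}

# Hypothetical coded observations
coded_gender = [1, 2, 2, 1, 2]

# The mode is a valid summary for nominal data
most_frequent = GENDER_LABELS[mode(coded_gender)]

# The mean can be computed, but the number has no interpretation:
# it would change if we had chosen codes 0 and 1 instead.
meaningless_average = mean(coded_gender)

print(most_frequent)
```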
2. Ordinal data
Ordinal data sits one level above nominal data: its categories show a hierarchy or rank. With this type, we can measure many things that are otherwise hard to quantify.
Examples include the rank of students at school and letter grades at university.
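As a rough illustration, the sketch below treats letter grades as ordinal data. The grade scale and the list of grades are assumptions made for the example. Because the categories have an order, we can sort them and take a median, even though the distance between two grades is not defined.

```python
from statistics import median_low

# Assumed grade scale, from lowest to highest
GRADE_ORDER = ["D", "C", "B", "A"]
rank = {grade: position for position, grade in enumerate(GRADE_ORDER)}

# Hypothetical grades for five students
grades = ["B", "A", "C", "B", "D"]

# Ordering is meaningful for ordinal data
sorted_grades = sorted(grades, key=lambda g: rank[g])

# The median only needs the order, so it is a valid summary
median_grade = GRADE_ORDER[median_low([rank[g] for g in grades])]

print(sorted_grades, median_grade)
```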
3. Interval data
Interval data is a measurement scale where we consider not only the order but also the size of the differences between values. Interval data does not have an absolute zero, so ratios between values are not meaningful.
A simple example of interval data is temperature. 100 Celsius, 200 Celsius, and 800 Celsius are different levels of heat. But that does not mean 200 Celsius is twice as hot as 100 Celsius.
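The temperature example can be checked with a small Python sketch. It converts Celsius to Kelvin, a scale that does have an absolute zero, and compares the ratios. The function name `celsius_to_kelvin` is just a helper defined here.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15

low_c, high_c = 100.0, 200.0

# Dividing Celsius values gives 2.0, but this ratio is not meaningful
naive_ratio = high_c / low_c

# On the Kelvin scale the ratio is only about 1.27
kelvin_ratio = celsius_to_kelvin(high_c) / celsius_to_kelvin(low_c)

# Differences, however, are the same on both scales
difference_c = high_c - low_c
difference_k = celsius_to_kelvin(high_c) - celsius_to_kelvin(low_c)

print(naive_ratio, round(kelvin_ratio, 2), difference_c, round(difference_k, 2))
```

The ratio changes when the zero point moves, while the difference does not. That is why differences are meaningful for interval data and ratios are not.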
4. Ratio data
Ratio data is the highest level of data. This scale has all the characteristics of the other types. With ratio data, we can apply the widest range of tools to measure and analyze a case.
Remember, ratio data has an absolute zero value.
Example: people's weight. 80 kg is twice as heavy as 40 kg, and so on. That is how we use ratio data for research.
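A brief Python sketch of the weight example. The conversion factor to pounds is an approximate value used only for illustration. Because weight has an absolute zero, the ratio between two weights stays the same whichever unit we use.

```python
# Approximate conversion factor, used only for illustration
KG_TO_LB = 2.20462

heavy_kg, light_kg = 80.0, 40.0

# Ratio in kilograms
ratio_kg = heavy_kg / light_kg

# The same ratio after converting both weights to pounds
ratio_lb = (heavy_kg * KG_TO_LB) / (light_kg * KG_TO_LB)

print(ratio_kg, ratio_lb)
```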
Data based on sources
Now, let us look at the types of data by source. There are two kinds of data sources in statistics:
1. Internal data
Internal data is data that is already owned by an institution or company and is ready to use without any external help. Examples include a company's salary and employee data and a government's expenditure data.
2. External data
External data is data that a company or institution needs but does not have available. To obtain it, they need to collect data and information from sources outside the institution. Examples include data about the market's response to a new product and people's opinions about a general election.
To make this easier, let me give a comprehensive example!
Amazon is an e-commerce company engaged in the wholesale and retail trade of various goods around the world. In the process, Amazon keeps gathering data to make the right decisions for the company. Here, the internal data includes the number of products, product classification, products by location, and so on.
Meanwhile, data such as the most popular products, the most complained-about products, and the lowest-rated products requires external participation, in this case from the visitors or customers who shop on Amazon.
Data based on collection process
By collection process, data is categorized as:
1. Primary data
Primary data is data collected by an individual, company, or institution on their own. The methods vary and include phone calls, direct interviews, field inspections, and others.
Primary data takes more effort than secondary data. An example is data for research on household income: we need to visit households one by one because no source provides this kind of data.
2. Secondary data
Secondary data is data that is already available from many sources, which we only need to request or retrieve ourselves. As users, we just need to collect it and remember to cite the data's source. Examples of secondary data include World Bank data and data from a statistical agency.
Data based on collection time
By time of collection, data is categorized as:
1. Time series data
Time series data is data collected over time in order to observe trends or changes. Time series analysis focuses on how the data behaves over time. Examples include a person's height from age 5 to 15 and economic growth since 1970.
2. Cross-section data
Cross-section data is data collected at a single point in time, but across many conditions, aspects, or classifications. Examples include the sex ratio, level of education, and life expectancy in 2018.
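The difference can be sketched in Python with a small made-up table. The regions, years, and life expectancy figures below are hypothetical, not real statistics. A time series follows one unit across several years, while a cross section looks at several units in a single year.

```python
# Hypothetical records: (region, year, life expectancy)
records = [
    {"region": "North", "year": 2016, "life_expectancy": 70.1},
    {"region": "North", "year": 2017, "life_expectancy": 70.4},
    {"region": "North", "year": 2018, "life_expectancy": 70.8},
    {"region": "South", "year": 2016, "life_expectancy": 68.0},
    {"region": "South", "year": 2017, "life_expectancy": 68.3},
    {"region": "South", "year": 2018, "life_expectancy": 68.9},
]

# Time series: one region observed over several years
time_series_north = sorted(
    (r["year"], r["life_expectancy"])
    for r in records
    if r["region"] == "North"
)

# Cross section: all regions observed in one year
cross_section_2018 = {
    r["region"]: r["life_expectancy"]
    for r in records
    if r["year"] == 2018
}

print(time_series_north)
print(cross_section_2018)
```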
Knowing the types of data will help you see that even something this simple is really important in determining your analysis process. It will make you wiser and help you draw a better conclusion at the end of your case.