Types of Data in Statistics
Data is found in every field. Whether you are a data analyst, marketer, data scientist, product manager, business owner, researcher, or in any other profession, you need to work with raw or structured data to get the best results. According to widely cited estimates, we generate about 2.5 quintillion bytes of data per day. Every business runs on data; most companies use it to generate insights, create and launch campaigns, design strategies, launch products, and provide a better customer experience.
Let’s consider a use case for understanding types of data. Gadget Zone is a personal technology company. It primarily has two products: laptops and phones. The company has four teams (two sales teams, one customer support team, and one development team). Let’s understand the data types in statistics using this example.
Types of Data
Quantitative data, also known as numerical data, are expressed as numerical values, which makes them countable or measurable. They answer questions like: How much? How many? How often?
Sales of products made by Gadget Zone are measurable, quantitative data. Suppose the company sold 20 laptops and 10 phones.
Quantitative data is further classified into two types:
Discrete means distinct or separate. Discrete data cannot be broken into decimal or fractional values. It is countable and takes only distinct values, typically whole numbers.
The sale of 20 laptops and 10 phones is countable and can be categorized as discrete data. Can we sell 10.5 laptops? No. Hence discrete data cannot take fractional or decimal form.
Continuous data represents information that can be divided into finer levels, i.e., fractional numbers. It can take any value within a range.
In the use case, continuous data would be, say, the sales revenue from laptops: Rs 1458760.27 in quarter 1 (Q1) and Rs 4015760.2 in quarter 2 (Q2).
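The difference between discrete and continuous values can be sketched in Python. This is only an illustration, assuming the unit counts and revenue figures from the Gadget Zone example; the helper name `is_whole_number` is made up for this sketch.

```python
# Figures taken from the Gadget Zone example in the text.
units_sold = {"laptops": 20, "phones": 10}        # discrete: whole counts
revenue_rs = {"Q1": 1458760.27, "Q2": 4015760.2}  # continuous: any value in a range

def is_whole_number(value):
    """Return True if the value has no fractional part (hypothetical helper)."""
    return float(value).is_integer()

# Every unit count is a whole number; the revenue figures are not.
all_counts_whole = all(is_whole_number(v) for v in units_sold.values())
all_revenue_whole = all(is_whole_number(v) for v in revenue_rs.values())
total_units = sum(units_sold.values())

print(all_counts_whole, all_revenue_whole, total_units)
```

A whole-number check like this is only a rough heuristic: a continuous measurement can happen to land on a whole number, so the type of data depends on what is being measured, not just on the values observed.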
Continuous data can be further classified into two types:
Interval and Ratio Data
Gadget Zone’s laptops come in price bands of Rs 20K – 50K, Rs 50K – 80K, and Rs 80K – 1.1L. Interval data values are always equidistant from one another. The same holds for the laptop bands priced in dollars, i.e., $200 – $700, $700 – $1200, and $1200 – $1700: each band has the same distance (difference) between its minimum and maximum price.
A meaningful (absolute) zero means the absence of something. For example, 0 degrees Fahrenheit is not the absence of heat or temperature; it is just another number along the temperature scale (though it does mean it’s pretty cold).
Interval data has no true zero point (meaningful zero) or fixed beginning. For example, the profit earned can be positive, negative, or zero, and a profit of 0 does not mean revenue is zero. Ratio data, on the other hand, has a meaningful (absolute) zero: 0 means nothing. For example, 0 sales of the company’s GZX01 phone means no revenue from that phone. Unlike interval data, ratio data cannot be negative, so there cannot be -20 sales of laptops.
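The equal spacing described above can be checked with a short sketch. It assumes the price bands quoted in the text, with 1.1L read as 110,000; the function names are illustrative only.

```python
# Price bands from the text, as (low, high) pairs. 1.1L (1.1 lakh) = 110,000.
bands_rs = [(20_000, 50_000), (50_000, 80_000), (80_000, 110_000)]
bands_usd = [(200, 700), (700, 1200), (1200, 1700)]

def band_widths(bands):
    """Return the width (high minus low) of each band."""
    return [high - low for low, high in bands]

def is_equidistant(bands):
    """Return True when every band has the same width."""
    return len(set(band_widths(bands))) == 1

widths_rs = band_widths(bands_rs)
widths_usd = band_widths(bands_usd)

print(widths_rs, is_equidistant(bands_rs))
print(widths_usd, is_equidistant(bands_usd))
```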
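One way to see why a true zero matters is to compare ratios on the two kinds of scale. The sketch below uses the standard Fahrenheit-to-Kelvin conversion; the temperatures and laptop counts are hypothetical values chosen for illustration.

```python
def fahrenheit_to_kelvin(temp_f):
    """Convert degrees Fahrenheit to kelvin (a scale with a true zero)."""
    return (temp_f - 32) * 5 / 9 + 273.15

# Interval scale: Fahrenheit has no true zero, so ratios are not meaningful.
cool_f, warm_f = 40.0, 80.0              # hypothetical readings
ratio_f = warm_f / cool_f                # looks like "twice as hot"
ratio_k = fahrenheit_to_kelvin(warm_f) / fahrenheit_to_kelvin(cool_f)

# Differences, however, are meaningful on an interval scale.
diff_f = warm_f - cool_f

# Ratio scale: unit sales start at a true zero, so ratios are meaningful.
laptops_q1, laptops_q2 = 10, 20          # hypothetical counts
ratio_units = laptops_q2 / laptops_q1    # genuinely twice as many laptops

print(ratio_f, round(ratio_k, 2), diff_f, ratio_units)
```

The Fahrenheit ratio of 2.0 shrinks to roughly 1.08 once the same temperatures are expressed in kelvin, which shows that "twice as hot" was an artifact of where the scale placed its zero. The laptop ratio does not depend on any such choice.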
Qualitative data cannot be measured or counted as numbers. It is also called categorical data. It can consist of words, symbols, images, audio, or text.
“Qualitative data tells about the perception of people”
In our use case, the qualitative data would be the feedback from customers using Gadget Zone products. This feedback cannot be measured in numbers. It answers questions about the product, e.g., which features or experiences customers like or dislike and which feature attracted them.
Qualitative data is further classified into two types:
Nominal data is used to label variables without any specific order. We cannot perform numerical operations on nominal data or sort it in any meaningful order.
Employees’ hair color varies: it can be black, brown, blonde, etc. There is no way to rank hair color or gender from highest to lowest, so these categories have no order.
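Since nominal labels have no order, about the only summary that makes sense is counting how often each label occurs. A minimal sketch, assuming a made-up list of employee hair colors:

```python
from collections import Counter

# Hypothetical hair colors recorded for six employees.
hair_colors = ["black", "brown", "black", "blonde", "brown", "black"]

# Counting frequencies is valid for nominal data.
color_counts = Counter(hair_colors)

# The mode (most frequent label) is the only sensible "average".
most_common_color, most_common_count = color_counts.most_common(1)[0]

print(color_counts)
print(most_common_color, most_common_count)
```

Python will happily sort these strings alphabetically, but that order is a property of the spelling, not of hair color, so it carries no statistical meaning.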
Ordinal data has a natural order. It can be used for observations such as customer satisfaction, happiness, or survey responses, but arithmetic on it is not meaningful, because the values show only a sequence, not the size of the gap between them.
The sales made by Gadget Zone in the previous year were analyzed during the annual business review to understand customer behavior, satisfaction level, and happiness. This analysis was based on surveys the organization sent to its customers. The data was based purely on customer perceptions, beliefs, and product adoption phases. Such data is not used for mathematical evaluation and is ordinal data.
Employees are rated, based on their performance last year and manager feedback, on a scale of Very Satisfied, Satisfied, Neutral, Dissatisfied, and Very Dissatisfied. The employees then fall into an order based on that feedback.
Another example is the education level of employees, such as graduate, postgraduate, Ph.D., etc.
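Ordered categories like the satisfaction scale above can be sorted and summarized by position, even though arithmetic on them is not meaningful. The sketch below assumes a small, made-up set of survey responses and maps each label to its rank on the scale.

```python
from statistics import median_low

# The scale from the text, listed from lowest to highest.
scale = ["Very Dissatisfied", "Dissatisfied", "Neutral",
         "Satisfied", "Very Satisfied"]
rank = {label: position for position, label in enumerate(scale)}

# Hypothetical survey responses.
responses = ["Satisfied", "Very Satisfied", "Neutral",
             "Satisfied", "Dissatisfied"]

# Sorting by rank respects the natural order of the categories.
sorted_responses = sorted(responses, key=rank.get)

# median_low always returns one of the observed ranks, so it maps
# back to a real label instead of an in-between value.
median_label = scale[median_low(rank[r] for r in responses)]

print(sorted_responses)
print(median_label)
```

The ranks here are used only for ordering. Averaging them would assume that the step from Neutral to Satisfied is the same size as the step from Satisfied to Very Satisfied, which ordinal data does not guarantee.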
Some qualitative data is dichotomous. Dichotomous means divided into two parts, usually two that are mutually exclusive. For example, the question of whether Gadget Zone made a profit on its sales can only be answered with Yes or No.
In this article, we covered the different types of data, an understanding of which is essential for any statistical analysis. In the upcoming article, we will use this understanding of data types to perform descriptive and inferential statistics.
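The Yes/No idea can be sketched by reducing a numeric profit figure to a two-valued answer. The quarterly profit figures below are hypothetical, and treating a profit of exactly zero as "No" is an assumption made for this example.

```python
# Hypothetical quarterly profit figures in rupees.
profit_rs = {"Q1": 120000.0, "Q2": -35000.0, "Q3": 0.0}

# Dichotomous variable: each quarter is either profitable or not.
# Assumption: a profit of exactly zero counts as "No".
made_profit = {quarter: ("Yes" if profit > 0 else "No")
               for quarter, profit in profit_rs.items()}

print(made_profit)
```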