Descriptive Statistics: On the Art of Exploring and Other Issues, Part II

In the last post we talked about the types of variables (qualitative vs quantitative) and the levels of measurement of the qualitative variables: nominal and ordinal. In this post I will talk about the levels of measurement of quantitative variables and I will leave a brief exercise to strengthen these concepts before we see tools for exploring data.

Interval

The interval variables are those in which the data is classified on an arbitrary scale and where each value of the scale corresponds to a category. These categories, like the groups of the ordinal variables, are mutually exclusive and collectively exhaustive, and they follow a logical order. Unlike the ordinal variables, however, that order corresponds to the magnitude of the scale associated with the characteristic, so the distance between two values has a meaning. For this type of variable, zero does not imply a lack of the property.

Let’s see an example with the variable “temperature of the human body in degrees Celsius.” First, it would be bizarre to find an individual who has a temperature of 37 °C and, simultaneously, 40 °C. That is, the individual either has a fever or does not (mutually exclusive).

Second, we know that 40 °C is higher than 37 °C (similar to the logical order of ordinal variables). Third, the difference of 3 degrees between the temperatures of two individuals (40 °C for subject A minus 37 °C for subject B) has the same meaning as the difference between the 37 °C of subject D and the 34 °C of subject E: subjects A and D are 3 degrees warmer than subjects B and E, respectively. In other words, the difference between two magnitudes has the same interpretation in any part of the scale.

Let’s see how this property does not apply to qualitative variables, even ordinal ones. For example, we cannot say that the difference between an individual who rates their satisfaction with a service at 10 and one who rates it at 5 is the same as the difference between an individual who answered 5 and another who answered 0. One implication is that with interval variables (as with ratio variables) we can add or subtract values and interpret the result.

Finally, zero does not imply that the object lacks temperature; it is just another point on the scale. In fact, 0 °C corresponds to 32 °F on the Fahrenheit scale. Because the zero is arbitrary, multiplying or dividing values does not give an interpretable result. For example, we cannot say that Cali at noon (assuming it is 30 °C) is twice as hot as Bogotá at 3 o’clock in the afternoon (assuming it is around 15 °C).
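The point about differences and ratios can be checked with a few lines of Python. This is only a minimal sketch: the helper `celsius_to_fahrenheit` and the variable names are made up for illustration, and the temperatures are the ones assumed in the examples above.

```python
import math


def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Differences: 3 degrees means the same thing anywhere on the Celsius scale.
diff_a_b = 40.0 - 37.0  # subject A minus subject B
diff_d_e = 37.0 - 34.0  # subject D minus subject E

# The same two gaps expressed in Fahrenheit are also equal to each other.
diff_a_b_f = celsius_to_fahrenheit(40.0) - celsius_to_fahrenheit(37.0)
diff_d_e_f = celsius_to_fahrenheit(37.0) - celsius_to_fahrenheit(34.0)

# Ratios: "twice as hot" depends on where the scale happens to put its zero.
cali_c, bogota_c = 30.0, 15.0
ratio_celsius = cali_c / bogota_c
ratio_fahrenheit = celsius_to_fahrenheit(cali_c) / celsius_to_fahrenheit(bogota_c)

print("Equal differences in Celsius:", diff_a_b == diff_d_e)
print("Equal differences in Fahrenheit:", math.isclose(diff_a_b_f, diff_d_e_f))
print("Ratio in Celsius:", ratio_celsius)
print("Ratio in Fahrenheit:", round(ratio_fahrenheit, 2))
```

The two ratios disagree (2 in Celsius, about 1.46 in Fahrenheit) even though they describe the same two cities, which is why "twice as hot" is not a meaningful statement on an interval scale.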

Ratio

The ratio variables are very similar to the interval variables, except that the scale on which the data is classified is not arbitrary and zero does imply the absence of the characteristic. Let’s see an example. Suppose that the variable of interest is the average monthly labor income. A person cannot answer that he earns both 4 million pesos and 2 million pesos on average each month: the answer is either 4 or 2, but not both at the same time (mutually exclusive). Maybe the individual earns 4 million for rendering services and 2 million from his fixed-term job. This is where the interviewer’s job is fundamental: he should ask whether the income from rendering services is received every month, and whether the amount is always the same or varies. For example, if the individual answers that the amount is constant and that he receives it every month of the year, the interviewer should record an income of 6 million pesos on average per month. If, on the contrary, those 4 million pesos were an extraordinary payment last month that almost never happens, the interviewer should record an income of 2 million pesos on average per month.

The ratio variables follow a logical order, and we can add, subtract, multiply and divide their values. That is, we know that a person who receives 10 million pesos a month receives five times as much as a person who receives 2 million pesos per month. Besides, zero implies the absence of the characteristic: answering 0, in our example, means not receiving any labor income.
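A quick way to see that ratios are meaningful here is to change the unit of measurement and check that the ratio stays the same. The sketch below uses the incomes from the example; the constant and variable names are illustrative assumptions.

```python
PESOS_PER_MILLION = 1_000_000

# Monthly labor income of two people, expressed in pesos.
income_a_pesos = 10 * PESOS_PER_MILLION
income_b_pesos = 2 * PESOS_PER_MILLION

# The same incomes expressed in millions of pesos.
income_a_millions = income_a_pesos / PESOS_PER_MILLION
income_b_millions = income_b_pesos / PESOS_PER_MILLION

# The ratio does not depend on the unit, because zero means "no income" in both.
ratio_in_pesos = income_a_pesos / income_b_pesos
ratio_in_millions = income_a_millions / income_b_millions

print("Ratio in pesos:", ratio_in_pesos)
print("Ratio in millions of pesos:", ratio_in_millions)
```

Both ratios are 5, whichever unit is used. Compare this with temperature in degrees Celsius, where changing the scale changes the ratio.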

Exercise

Suppose you work in a Chinese food restaurant in the shopping area. You have an Excel file with information about the 100 suppliers the company has purchased from over the last ten years (some are permanent suppliers, others are recent, and others the company no longer buys from). One sheet contains the name of the supplier, the company’s corporate name, the company’s NIT, the date on which the business relationship began, the mobile phone number, the address of the supplier’s head office, the means of payment accepted by the supplier, and the status of the supplier (Active or Inactive).

The file also includes a sheet with daily information on the number of purchase transactions per supplier, the amount of each transaction in pesos, the amount paid, and the amount owed. Each transaction is also associated with an invoice number and includes the value of the discount (if applicable) and the value corresponding to the VAT.

The company also has a biannual assessment of the suppliers based on the service received. This assessment is done through a survey that asks the employees to rate the service provided by each supplier on a scale from 1 to 5 (Likert scale), where 1 is unsatisfied and 5 is totally satisfied.

Identify the variables present in the database described (give each one a name), classify them by type (qualitative or quantitative), and indicate their level of measurement. The solution will come in the next post. If you have any questions, do not hesitate to write to us.
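If you like to keep your answers organized, one option is a small record per variable. The following is only a sketch of such a template: the class and field names are hypothetical, and the two entries are the examples discussed in this post, not the solution to the exercise.

```python
from dataclasses import dataclass
from enum import Enum


class VariableType(Enum):
    QUALITATIVE = "qualitative"
    QUANTITATIVE = "quantitative"


class Level(Enum):
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    INTERVAL = "interval"
    RATIO = "ratio"


# Levels of measurement that belong to each type of variable.
ALLOWED_LEVELS = {
    VariableType.QUALITATIVE: {Level.NOMINAL, Level.ORDINAL},
    VariableType.QUANTITATIVE: {Level.INTERVAL, Level.RATIO},
}


@dataclass(frozen=True)
class Variable:
    name: str
    kind: VariableType
    level: Level

    def __post_init__(self) -> None:
        # Reject combinations such as "qualitative" with "ratio".
        if self.level not in ALLOWED_LEVELS[self.kind]:
            raise ValueError(
                f"{self.name}: level {self.level.value} does not match "
                f"type {self.kind.value}"
            )


# Entries taken from the examples in this post (not from the exercise).
examples = [
    Variable("body_temperature_celsius", VariableType.QUANTITATIVE, Level.INTERVAL),
    Variable("monthly_labor_income_pesos", VariableType.QUANTITATIVE, Level.RATIO),
]

for variable in examples:
    print(variable.name, variable.kind.value, variable.level.value)
```

The check in `__post_init__` simply encodes the rule from these two posts: nominal and ordinal levels belong to qualitative variables, interval and ratio levels to quantitative ones.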
