A measure of central tendency (a value around which the other scores in the set cluster) and a measure of variability (an indicator of how spread out the scores in a data set are about the mean) are used together to describe the data.
The terms variability, spread, and dispersion are synonyms, and refer to how spread out a distribution is. Just as in the section on central tendency where we discussed measures of the center of a distribution of scores, in this chapter we will discuss measures of the variability of a distribution.
Measures of dispersion describe the spread of scores in a distribution. The more spread out the scores are, the higher the dispersion or spread. In Figure 1, the y-axis is frequency and the x-axis represents values for a variable. There are two distributions, labeled as small and large. You can see both are normally distributed (unimodal, symmetrical), and the mean, median, and mode for both fall on the same point. What is different between the two is the spread or dispersion of the scores. The taller-looking distribution shows a smaller dispersion while the wider distribution shows a larger dispersion. For the “small” distribution in Figure 1, the data values are concentrated closely near the mean; in the “large” distribution, the data values are more widely spread out from the mean.
Figure 1. Examples of 2 normal (symmetrical, unimodal) distributions.
In this chapter, we will look at three measures of variability: range, variance, and standard deviation. An important characteristic of any set of data is the variation in the data. Imagine that students in two different sections of statistics take Exam 1 and the mean score in both classrooms is a 75. If that is the only descriptive statistic I report you might assume that both classes are identical – but that is not necessarily true. Let’s examine the scores for each section.
Section A
Scores = 70, 70, 70, 70, 85, 85
Mean = 75
Section B
Scores = 70, 72, 73, 75, 75, 85
Mean = 75
Table 1. Exam scores for 2 sections of a class.
Comparing the two sections, you can see that in Section A only a few distinct scores are represented (70 and 85) and they are far from the mean, while in Section B more distinct scores are represented and they cluster closer to the mean. We would say that the spread of scores for Section A is greater than for Section B.
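To make this comparison concrete, here is a minimal Python sketch that computes the mean and the standard deviation for both sections. The variable names are our own, and the choice of the population standard deviation (`statistics.pstdev`) is an assumption made for illustration; a sample-based formula would give slightly different numbers but the same ordering.

```python
from statistics import mean, pstdev

# Exam 1 scores from Table 1
section_a = [70, 70, 70, 70, 85, 85]
section_b = [70, 72, 73, 75, 75, 85]

for name, scores in (("Section A", section_a), ("Section B", section_b)):
    # pstdev treats the scores as the whole population (an assumption here)
    print(f"{name}: mean = {mean(scores)}, standard deviation = {pstdev(scores):.2f}")
```

Both sections have the same mean, but Section A has the larger standard deviation, which matches what we saw by inspecting the scores.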
Range
The range is the simplest measure of variability and is easy to calculate: it is the difference between the highest and lowest scores in the distribution.
You can see in our statistics course example (Table 1) that Section A scores have a range of 15 (85 – 70) and Section B scores also have a range of 15. The range takes only the two most extreme scores into account. That means all the other scores are ignored, so the range may not give an unbiased description of the data.
You have probably encountered the range many times in your life. The simplicity of calculating the range is appealing, but it can be a very unreliable measure of variability. We noticed earlier that the spread of scores was very different in the two sections, yet both sections have the same range.
Let’s take a few examples. What is the range of the following group of numbers: 10, 2, 5, 6, 7, 3, 4? Well, the highest number is 10, and the lowest number is 2, so 10 – 2 = 8. The range is 8. Let’s take another example. Here’s a dataset with 10 numbers: 99, 45, 23, 67, 45, 91, 82, 78, 62, 51. What is the range? The highest number is 99 and the lowest number is 23, so 99 – 23 equals 76; the range is 76. Again, the problem with using range is that it is extremely sensitive to outliers, and one number far away from the rest of the data will greatly alter the value of the range. For example, in the set of numbers 1, 3, 4, 4, 5, 8, and 9, the range is 8 (9 – 1). However, if we add a single person whose score is nowhere close to the rest of the scores, say, 20, the range more than doubles from 8 to 19.
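The range calculations above can be sketched in a few lines of Python. The helper name `value_range` and the variable names are hypothetical, chosen only for this illustration.

```python
def value_range(scores):
    """Return the highest score minus the lowest score."""
    return max(scores) - min(scores)

# Exam scores from Table 1
section_a = [70, 70, 70, 70, 85, 85]
section_b = [70, 72, 73, 75, 75, 85]

# Examples from the text
first_example = [10, 2, 5, 6, 7, 3, 4]
second_example = [99, 45, 23, 67, 45, 91, 82, 78, 62, 51]
without_outlier = [1, 3, 4, 4, 5, 8, 9]
with_outlier = without_outlier + [20]  # one extreme score added

print(value_range(section_a), value_range(section_b))           # 15 15
print(value_range(first_example), value_range(second_example))  # 8 76
print(value_range(without_outlier), value_range(with_outlier))  # 8 19
```

Because only the maximum and minimum enter the calculation, the two sections get the same range even though their scores are distributed differently, and a single extreme score is enough to change the result substantially.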
Interquartile Range
A special take on the range is to identify values in terms of the quartiles of the distribution (recall the box plots in chapter 2). The interquartile range (IQR) is the range of the middle 50% of the scores in a distribution and is sometimes used to communicate where the bulk of the data in the distribution are located. It is computed as follows: IQR = 75th percentile – 25th percentile. Recall that in the discussion of box plots in chapter 2, the 75th percentile was called the upper hinge and the 25th percentile was called the lower hinge. Using this terminology, the interquartile range is referred to as the H-spread.
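As a rough illustration of the IQR formula, the sketch below applies it to the two exam sections from Table 1 using Python's standard library. The helper name `interquartile_range` is hypothetical, and the use of `statistics.quantiles` with `method="inclusive"` is an assumption: there are several conventions for locating the 25th and 75th percentiles (the hinges used for box plots are one of them), and they can give slightly different values on small data sets.

```python
from statistics import quantiles

def interquartile_range(scores):
    """Return the 75th percentile minus the 25th percentile."""
    # n=4 splits the data into quartiles and returns the three cut points.
    # method="inclusive" is one of several percentile conventions (assumed here).
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    return q3 - q1

# Exam scores from Table 1
section_a = [70, 70, 70, 70, 85, 85]
section_b = [70, 72, 73, 75, 75, 85]

iqr_a = interquartile_range(section_a)
iqr_b = interquartile_range(section_b)
print(f"Section A IQR = {iqr_a}, Section B IQR = {iqr_b}")
```

Under this convention Section A has the larger IQR, even though the plain range is the same for both sections, because the IQR looks at the middle 50% of the scores rather than only the two extremes.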