Statistical Methods: Variability
by Adam J. McKee
Using F. J. Gravetter and L. B. Wallnau's Essentials of Statistics for the Behavioral Sciences (4th Ed.).

What is Science?
The essence of science is the study of differences among phenomena: the study of variation. For this reason, measures of variation are of the highest importance to social scientists.

Making Sense of Numbers
Empirical methods provide more objectivity than other methods of knowing, but studying raw sets of numbers is unwieldy. To understand a set of numbers, it is usually necessary to summarize it in two ways: with a measure of center and with a measure of variability.

Measures of Center
Measures of center include the mean, median, and mode. These measures provide a "snapshot" of a group of numbers: a summary. While a valuable tool, they can be misleading when taken alone. Adding a measure of variation gives us a more accurate summary of our numbers.

What Is Variability?
Variability simply means how spread out a group of numbers is around the average. The terms "dispersion" and "spread" are commonly used to mean the same thing as "variability." Together, a measure of center and a measure of spread usually provide enough information to adequately describe a group of numbers.

Utility of Measuring Variability
In 1987, the median family income in the USA was $30,853. This figure does not reflect the fact that almost 12% of families received less than $10,000. It also does not reflect that the top 5% earned over $86,000. Knowing these percentiles, which describe how incomes are spread out, provides a clearer picture of family income in the US.

Minimum & Maximum Value
The minimum and maximum values of a set of scores give us an indication of the spread of scores.
Example: Weather reports often give the high and low temperatures for a 24-hour period.
Hammond, LA: High: 48°F Low: 28°F
Bangor, ME: High: 29°F Low: 15°F

Range
The range is a measure of variability based on the difference between the minimum and maximum values in a set of scores. For whole-number scores, one is added so that both end scores are counted:
Range = Max – Min + 1
Advantage: The range is easy to calculate.
Disadvantage: It is sensitive to extreme scores.

Range Example
Scores: 10, 11, 12, 13, 14, 15, 16, 17, 18
Range = 18 – 10 + 1 = 9
Scores: 10, 11, 12, 13, 14, 15, 16, 17, 90
Range = 90 – 10 + 1 = 81
Note the effect of the extreme score (90) in the second range calculation!
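The two range calculations can be checked with a minimal Python sketch. The function name `inclusive_range` is an assumption made for illustration; the `+ 1` follows the whole-number formula given in the text.

```python
def inclusive_range(scores):
    """Range as defined in the text: Max - Min + 1 (whole-number scores)."""
    return max(scores) - min(scores) + 1


# The two score sets from the Range Example.
typical = [10, 11, 12, 13, 14, 15, 16, 17, 18]
with_extreme = [10, 11, 12, 13, 14, 15, 16, 17, 90]

print(inclusive_range(typical))       # 9
print(inclusive_range(with_extreme))  # 81
```

Changing a single score from 18 to 90 multiplies the range by nine, even though the other eight scores are untouched.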
Interquartile Range
If we measure the distance between the 25th and 75th percentiles, we obtain the interquartile range. The interquartile range "chops off" both ends of the distribution, which greatly reduces the effect of extreme scores.
Interquartile range = 75th percentile – 25th percentile

Deriving Standard Deviation
The range is problematic because it is based on only two extreme values that might not reflect the general "spreadoutness" of scores in the distribution. A better way would be to look at how much each score differs from the central tendency of the distribution.

Deviation Scores
If we subtract the mean from each score, we get a deviation score. Deviation scores tell us how far each score is from the center of the distribution (the mean), and in which direction.
Deviation score = X – Mean

Mean Deviation Scores
When we wanted a summary of all the scores, we chose a measure of central tendency, which we consider the "typical" value. The same logic holds for getting an idea of the spread of scores: we can look at the central tendency of the deviation scores. This is a great idea, but the rules of algebra get in the way.

Problems With Mean Deviations
Because the mean is the balance point of the distribution, the positive deviations above it exactly cancel the negative deviations below it. The deviation scores therefore always add to zero, and zero divided by N is zero. This means that the mean of the deviation scores is always zero, no matter how spread out the scores are, so it tells us nothing about variability!

Mean Deviation Example

Getting Rid of Zero
Since the sum always comes to zero, we need to get rid of those annoying negatives. If we simply square the deviation scores, those nasty negative signs go away. A negative number multiplied by a negative number is always positive, after all.

Sum of Squares
The sum of the squared deviation scores is known as the sum of squares. The following example shows the results of squaring the deviation scores:
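As a stand-in illustration, the Python sketch below reuses the nine scores from the range example. The choice of data and the variable names are assumptions made here, not the original slide's table. It shows that the deviation scores sum to zero, while the squared deviations do not.

```python
# Scores borrowed from the range example earlier in the text (an assumption).
scores = [10, 11, 12, 13, 14, 15, 16, 17, 18]

n = len(scores)
mean = sum(scores) / n  # 14.0

# Deviation score = X - Mean, one per score.
deviations = [x - mean for x in scores]

# Squaring removes the negative signs.
squared_deviations = [d ** 2 for d in deviations]

# Sum of squares.
ss = sum(squared_deviations)

print(sum(deviations))  # 0.0
print(ss)               # 60.0
```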
Mean of Squared Deviation Scores
Dividing the sum of squares (SS) by the number of scores (N) gives the mean of the squared deviation scores:
Mean squared deviation = SS / N
This seems to have worked nicely except for one small problem. We are not interested in the spread of squared deviation scores; we just want to know about plain old scores.
If we want to get rid of a square, all we have to do is take the square root, which brings us back to plain deviations from the mean. That is, taking the square root returns us to our original unit of measurement (e.g., feet, inches, degrees). Therefore, we add the square root to the formula to get:
Standard deviation = √(SS / N)
Standard Deviation
One small adjustment must still be made for reasons of statistical theory. Because this is a sample statistic that we want to generalize to the population, we must subtract one from N to get the Sample Standard Deviation (SD) formula as follows:
SD = √(SS / (N – 1)), where SS is the sum of the squared deviation scores
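To see what the N – 1 adjustment does, the Python sketch below computes the standard deviation both ways and compares the results with the standard library's `statistics.pstdev` (divides by N) and `statistics.stdev` (divides by N – 1). The data are the nine scores from the range example, reused here as an assumption for illustration.

```python
import math
import statistics

# Scores borrowed from the range example earlier in the text (an assumption).
scores = [10, 11, 12, 13, 14, 15, 16, 17, 18]

n = len(scores)
mean = sum(scores) / n
ss = sum((x - mean) ** 2 for x in scores)  # sum of squares

sd_divide_by_n = math.sqrt(ss / n)    # unadjusted formula
sd_sample = math.sqrt(ss / (n - 1))   # sample formula with N - 1

print(round(sd_divide_by_n, 2), round(statistics.pstdev(scores), 2))
print(round(sd_sample, 2), round(statistics.stdev(scores), 2))
```

The sample version is always a little larger, and the gap shrinks as N grows.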
Variance
Variance is the standard deviation squared.
Disadvantage: Not intuitively interpretable, because it is expressed in squared units.
Advantage: Several statistical / mathematical advantages for advanced statistical tests.

Heuristic Example
Suppose you are looking for a nice, warm place to retire. You notice two cities that have an average yearly temperature around 72 degrees: Phoenix and Tampa. How do the variance measures reported below help you make your decision?
Standard Deviation Step by Step
Step 1a: Construct a table of raw scores.
Step 1b: Sum the raw scores.
Step 1c: Compute the mean.
Step 1d: In the second column of the table, compute the deviation from the mean of each score (X – Mean).
Step 2: In the third column of the table, square the deviation scores.
Step 3: Sum the squared deviations to get the "sum of squares," symbolized as SS.
Step 4: Compute the sample variance:
Sample variance = SS / (N – 1)
Note that the term N – 1 is used to correct for the tendency of samples to be less variable than the population they come from. Dividing by N – 1 instead of N increases the value you obtain.
Step 5: Compute the sample standard deviation:
SD = √(SS / (N – 1))
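The five steps can be followed in a short Python sketch. The scores are reused from the second range example (the set containing the extreme score 90); that choice and the variable names are assumptions for illustration, not the original worked table.

```python
import math

# Step 1a: raw scores (borrowed from the second range example; an assumption).
scores = [10, 11, 12, 13, 14, 15, 16, 17, 90]

# Steps 1b-1d: sum the scores, compute the mean, then each deviation.
n = len(scores)
mean = sum(scores) / n
deviations = [x - mean for x in scores]

# Step 2: square each deviation.
squared = [d ** 2 for d in deviations]

# Step 3: sum of squares (SS).
ss = sum(squared)

# Step 4: sample variance.
variance = ss / (n - 1)

# Step 5: sample standard deviation.
sd = math.sqrt(variance)

# Print the three-column table described in the steps.
print(f"{'X':>4} {'X - M':>8} {'(X - M)^2':>10}")
for x, d, sq in zip(scores, deviations, squared):
    print(f"{x:>4} {d:>8.1f} {sq:>10.1f}")
print(f"Mean = {mean}, SS = {ss}, variance = {variance}, SD = {sd:.2f}")
```

For these scores the mean is 22, SS is 5244, the sample variance is 655.5, and the sample standard deviation is about 25.6. The single extreme score accounts for 4624 of the 5244 squared units.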
SS Computational Formula
By tricks of algebra, statisticians have come up with a way of making the computation of SS easier than the "heuristic" formula previously presented. The result is exactly the same.
SS = ΣX² – (ΣX)² / N
Example
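A quick way to confirm that the two routes agree is to compute SS both ways on the same data. The Python sketch below does this for the nine scores from the range example; the data choice and variable names are assumptions for illustration.

```python
# Scores borrowed from the range example earlier in the text (an assumption).
scores = [10, 11, 12, 13, 14, 15, 16, 17, 18]
n = len(scores)
mean = sum(scores) / n

# "Heuristic" (definitional) formula: sum of squared deviations from the mean.
ss_definitional = sum((x - mean) ** 2 for x in scores)

# Computational formula: sum of X squared, minus (sum of X) squared over N.
sum_x = sum(scores)                        # 126
sum_x_squared = sum(x ** 2 for x in scores)  # 1824
ss_computational = sum_x_squared - sum_x ** 2 / n

print(ss_definitional)   # 60.0
print(ss_computational)  # 60.0
```

The computational version never needs the deviation scores, which is what makes it convenient for hand calculation.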