The standard deviation can be calculated for both a population and a sample data set. The standard deviation for a sample is computed from a smaller random sampling; it is much easier to obtain than the standard deviation for a full population, but it is only an estimate and is therefore less accurate. When the entire population is considered, the standard deviation can be calculated with a greater degree of completeness and precision.
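As a rough illustration of the distinction, the sketch below uses Python's standard `statistics` module, in which `pstdev` treats the data as a whole population (dividing by N) and `stdev` treats it as a sample (dividing by n - 1). The eight values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import statistics

# Hypothetical data, chosen only for illustration (not from the text).
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Population standard deviation: divides the sum of squared deviations by N.
population_sd = statistics.pstdev(values)

# Sample standard deviation: divides by n - 1 instead.
sample_sd = statistics.stdev(values)

print(f"population SD: {population_sd:.3f}")
print(f"sample SD:     {sample_sd:.3f}")
```

For the same values, the sample figure comes out slightly larger than the population figure because of the smaller divisor.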

We use the standard deviation to determine how much of our data lies within a certain range, with limits above and below the mean of the entire data set. The standard deviation for a population is expressed in the same units as the data for that population, and it acts as a distance measured above and below the mean to create an upper and a lower limit. Widening the limits by additional standard deviations increases the number of data values that fall within the range.

How many data values are captured within one or more standard deviations can be determined manually by plotting the data points, but for a large population this becomes tedious. Instead, simple rules can be used to estimate what percentage of the data values fall within the upper and lower limits created by one or more standard deviations. Two rules we consider for these estimates are Chebyshev's theorem (which applies to any data set) and the empirical rule (which applies only to data with a symmetric, bell-shaped distribution). For the limits produced by one standard deviation above and below the mean, the empirical rule states that roughly 68% of the data values are included; Chebyshev's theorem gives no useful bound in this case. For the limits produced by two and three standard deviations, Chebyshev's theorem guarantees that at least 75% and at least about 89% of the data values, respectively, are included, while the empirical rule states that approximately 95% and 99.7%, respectively, are included.
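The 75% and 89% figures come from the general form of Chebyshev's theorem, which says that at least 1 - 1/k² of the values lie within k standard deviations of the mean for any k > 1. A minimal sketch that compares this bound with the empirical-rule figures quoted above (the function name `chebyshev_bound` and the `EMPIRICAL` table are introduced here for illustration only):

```python
def chebyshev_bound(k: float) -> float:
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the bound is zero or negative, so it says nothing useful.
        raise ValueError("Chebyshev's theorem is informative only for k > 1")
    return 1 - 1 / k**2

# Empirical-rule figures quoted in the text (bell-shaped data only).
EMPIRICAL = {1: 0.68, 2: 0.95, 3: 0.997}

for k in (2, 3):
    print(f"k = {k}: Chebyshev at least {chebyshev_bound(k):.1%}, "
          f"empirical rule about {EMPIRICAL[k]:.1%}")
```

Chebyshev's figures are lower because they are guaranteed minimums for any distribution, whereas the empirical rule assumes a bell-shaped one.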

The standard deviation for a population data set is calculated with the following formula.

σ = [ Σ(x - μ)² / N ]^(1/2)

Here σ is the standard deviation of the population data set, x is each individual data value, μ is the mean of the entire population, Σ denotes summation over all data values in the population, and N is the total number of data values in the population.
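The formula translates almost term for term into code. The following is a minimal sketch rather than a reference implementation; the function name `population_std` is our own choice.

```python
import math

def population_std(data):
    """Population standard deviation: [ sum((x - mu)^2) / N ]^(1/2)."""
    n = len(data)
    if n == 0:
        raise ValueError("data must contain at least one value")
    mu = sum(data) / n                          # population mean
    sum_sq = sum((x - mu) ** 2 for x in data)   # sum of squared deviations
    return math.sqrt(sum_sq / n)                # divide by N, then take the root

# Small check: the mean of 1..5 is 3, the squared deviations sum to 10,
# so the result is the square root of 10 / 5 = 2.
print(population_std([1, 2, 3, 4, 5]))
```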

Let's work through an example that applies this concept.

A company gives each of its 15 employees an assessment test to determine their personality types as they relate to their strengths and weaknesses in customer service. The employees' scores (out of 100) are the following: 43, 61, 59, 60, 39, 76, 54, 67, 72, 80, 56, 58, 65, 70, 55. What is the standard deviation of this set of scores?

Because the 15 scores make up the entire population, we will use σ = [ Σ(x - μ)² / N ]^(1/2).

μ = (43 + 61 + 59 + 60 + 39 + 76 + 54 + 67 + 72 + 80 + 56 + 58 + 65 + 70 + 55) / 15

μ = 915 / 15 = 61

N = 15 (the number of data values in the population)

Σ(x - μ)² = (43 - 61)² + (61 - 61)² + (59 - 61)² + (60 - 61)² + (39 - 61)² +
(76 - 61)² + (54 - 61)² + (67 - 61)² + (72 - 61)² + (80 - 61)² + (56 - 61)² + (58 - 61)² +
(65 - 61)² + (70 - 61)² + (55 - 61)²

Σ(x - μ)² = 324 + 0 + 4 + 1 + 484 + 225 + 49 + 36 + 121 + 361 + 25 + 9 + 16 + 81 + 36

Σ(x - μ)² = 1772

σ = [ Σ(x - μ)² / N ]^(1/2)

σ = [ 1772 / 15 ]^(1/2)

σ ≈ [ 118.13 ]^(1/2)

σ ≈ 10.87
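The hand calculation can be cross-checked with Python's standard `statistics` module, and the same scores can be used to count how many values actually fall within one, two, and three standard deviations of the mean. This is only a sketch; the helper `fraction_within` is a name introduced here for illustration.

```python
import statistics

# The 15 assessment scores from the example.
scores = [43, 61, 59, 60, 39, 76, 54, 67, 72, 80, 56, 58, 65, 70, 55]

mu = statistics.mean(scores)       # population mean
sigma = statistics.pstdev(scores)  # population standard deviation (divides by N)

def fraction_within(data, center, spread, k):
    """Fraction of values lying within k * spread of center (inclusive)."""
    lower, upper = center - k * spread, center + k * spread
    return sum(lower <= x <= upper for x in data) / len(data)

print(f"mean = {mu}, sigma = {sigma:.2f}")
for k in (1, 2, 3):
    share = fraction_within(scores, mu, sigma, k)
    print(f"within {k} standard deviation(s): {share:.1%}")
```

For these scores, 10 of the 15 values (about 67%) lie within one standard deviation and 14 of the 15 (about 93%) lie within two. The second figure is consistent with Chebyshev's guaranteed minimum of 75%; with only 15 values, agreement with the empirical rule can only be rough.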

REFERENCE:

Triola, Mario F. (1992). Elementary Statistics (5th ed.). USA: Addison-Wesley Publishing Company, Inc.