Statistics are used throughout everyday life and in research across many disciplines. This article discusses the basic descriptive statistics calculated by standard statistical analysis software such as SAS, SPSS, Stata, and Excel, and how to interpret them. Descriptive statistics (mean, median, mode, range, standard deviation, variance, skewness, and kurtosis) are useful for exploring and examining data (such as how the data are distributed or dispersed) before performing statistical tests and carrying out further analysis and interpretation. Charts and graphs such as scatterplots and histograms are also helpful for exploring data visually.
N - the total number of observations or cases for a variable in a data set. When dealing with populations and samples in statistics, "N" usually denotes the number of observations in a population and "n" typically represents the number of observations in a sample.
Mean - the arithmetic average of the values in the data set. It is calculated by summing all the values of the variable and then dividing the sum by the total number of observations (N) for that variable. The mean is the basis of many statistical tests, such as the t-test.
Median - the observation value that falls exactly in the middle of the variable distribution when all observations are arranged in order from lowest to highest. If two observations share the midpoint (in a distribution with an even number of cases), then the average of those two observations is calculated and this average represents the median.
Mode - the value or observation that occurs most frequently in a data set.
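As a minimal sketch of the definitions of N, mean, median, and mode given above, the following uses Python's standard `statistics` module. The data values are hypothetical and chosen only for illustration; the statistical packages named in this article produce the same quantities through their own commands.

```python
import statistics

# Hypothetical data set, invented for illustration only.
scores = [2, 4, 4, 5, 7, 9, 11, 14]

n = len(scores)                        # N: total number of observations
mean = statistics.mean(scores)         # sum of the values divided by N
median = statistics.median(scores)     # N is even here, so the two middle values are averaged
modes = statistics.multimode(scores)   # list of every value tied for most frequent

print("N =", n)
print("mean =", mean)
print("median =", median)
print("mode(s) =", modes)
```

`multimode` returns a list because a data set can have more than one most frequent value; `statistics.mode` would return only the first one encountered.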
Range - the difference between the lowest value and highest value in the variable. In some software the Range is reported in the output, but in other software programs the actual minimum and maximum values are reported.
Standard Deviation - measures the spread of observations about the mean and is the most common measure of dispersion in statistics. It is the square root of the variance; for a population, this is the root mean square deviation of the values from the arithmetic mean. The larger the standard deviation, the more spread out the observations are. Since the standard deviation is in the same units as the original variable, it is easier to interpret than the variance.
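Variance - the average of the squared differences between the data points and the mean. When the variance is estimated from a sample, the sum of squared differences is divided by n - 1 rather than by N. The variance describes the degree to which a distribution is spread out. It is also the standard deviation squared.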
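The range, variance, and standard deviation described above can be sketched with Python's standard `statistics` module. The data values are hypothetical. The sketch computes both the population form (dividing by N) and the sample form (dividing by n - 1), since a given software package may report one or the other by default.

```python
import statistics

# Hypothetical data set, invented for illustration only (mean is 5).
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)   # highest value minus lowest value

pop_var = statistics.pvariance(values)    # population variance: divide by N
pop_sd = statistics.pstdev(values)        # population standard deviation

samp_var = statistics.variance(values)    # sample variance: divide by n - 1
samp_sd = statistics.stdev(values)        # sample standard deviation

print("range =", value_range)
print("population variance =", pop_var, " population SD =", pop_sd)
print("sample variance =", samp_var, " sample SD =", samp_sd)
```

In both forms the standard deviation is the square root of the corresponding variance, and the sample values are slightly larger than the population values because of the smaller divisor.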
Skewness - measures the degree and direction of asymmetry of the distribution. A symmetrical distribution, such as the normal distribution, has a skewness of zero (0). In real-world data, however, perfectly symmetrical distributions are hard to come by. A distribution may therefore be positively skewed (skewed to the right; longer tail to the right; represented by a positive value) or negatively skewed (skewed to the left; longer tail to the left; represented by a negative value).
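To illustrate the sign convention, here is a minimal sketch of one common definition of skewness: the third central moment divided by the second central moment raised to the power 1.5. This is an assumed, uncorrected moment formula; statistical packages often apply a small-sample adjustment, so their reported values can differ somewhat. The function name and data sets are hypothetical.

```python
def skewness(data):
    """Moment-based skewness m3 / m2**1.5 (no small-sample correction).

    Assumes the data are not all identical; otherwise m2 is zero.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data sets, invented for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 20]          # one large value stretches the right tail
left_tailed = [-x for x in right_tailed]       # mirror image: long tail on the left

print("symmetric:", skewness(symmetric))
print("right-tailed:", skewness(right_tailed))
print("left-tailed:", skewness(left_tailed))
```

The symmetric data give zero, the right-tailed data a positive value, and the mirrored data the same magnitude with a negative sign.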
Kurtosis - measures the lightness or heaviness of the tails of the distribution, and is often described in terms of how peaked the distribution is. In other words, how much of the distribution is actually located in the tails? Many statistical programs report excess kurtosis (kurtosis minus 3), for which a normal distribution has a value of zero (0) and is said to be mesokurtic; some programs instead report the unadjusted value, for which a normal distribution has a kurtosis of 3. A positive excess kurtosis value means that the tails are heavier than those of a normal distribution and the distribution is said to be leptokurtic (with a higher, more acute "peak"). A negative excess kurtosis value means that the tails are lighter than those of a normal distribution and the distribution is said to be platykurtic (with a lower, flatter "peak").
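A minimal sketch of excess kurtosis, assuming the uncorrected moment formula (fourth central moment divided by the squared second central moment, minus 3). As with skewness, software packages often apply a small-sample adjustment, so reported values can differ. The function name and data sets are hypothetical.

```python
def excess_kurtosis(data):
    """Moment-based excess kurtosis m4 / m2**2 - 3 (no small-sample correction).

    Assumes the data are not all identical; otherwise m2 is zero.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n   # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical data sets, invented for illustration only.
flat = [1, 2, 3, 4, 5]                              # evenly spread, light tails
heavy_tailed = [-10, -1, 0, 0, 0, 0, 0, 1, 10]      # most values central, a few far out

print("flat:", excess_kurtosis(flat))
print("heavy-tailed:", excess_kurtosis(heavy_tailed))
```

The evenly spread data give a negative value (platykurtic), while the data with a few extreme observations give a positive value (leptokurtic).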