In genetics, an allele is a variant of a gene. A genotype is a pair of alleles, one on each chromosome of a chromosome pair. When there are three alleles A, B, and C, the possible genotypes are AA, BB, CC, AB, AC, and BC. The genotypes AA, BB, and CC are homozygous while AB, AC, and BC are heterozygous.
If a population has three alleles A, B, and C for a particular gene, and their respective frequencies are p, q, and r, the key question a biologist would ask is whether the corresponding genotypes are in Hardy-Weinberg equilibrium.
Hardy-Weinberg equilibrium describes the theoretical genotype frequencies expected when mating is purely random (no assortative or disassortative mating) and there is no mutation, selection, or gene flow. If the actual genotype frequencies deviate significantly from the theoretical Hardy-Weinberg frequencies, at least one of these assumptions is most likely violated. In this case, population biologists would look for evidence of gene mutation, non-random mating patterns, selection, or gene flow.
Hardy-Weinberg Frequencies for Three Alleles and Punnett Square
Suppose the relative frequencies of the alleles A, B, and C are p, q, and r respectively. Since the proportions must add up to the whole, we know that p + q + r = 1. The theoretical relative frequencies of the genotypes AA, BB, CC, AB, AC, and BC are
theo.freq. AA = p²
theo.freq. BB = q²
theo.freq. CC = r²
theo.freq. AB = 2pq
theo.freq. AC = 2pr
theo.freq. BC = 2qr
These proportions come from the expansion of the product
(p + q + r)² = p² + q² + r² + 2pq + 2pr + 2qr
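And since p + q + r = 1, we also have (p + q + r)² = 1, thus the theoretical genotype frequencies sum to the whole as well. The theoretical frequencies can be visualized with a Punnett square for three alleles, where the rows are the alleles from one parent and the columns are the alleles from the other.
           A (p)     B (q)     C (r)
A (p)      AA p²     AB pq     AC pr
B (q)      AB pq     BB q²     BC qr
C (r)      AC pr     BC qr     CC r²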
Notice how the homozygous genotypes are along the diagonal and the heterozygous genotypes are off-diagonal. Since there are two ways to obtain each heterozygous pair, their frequencies have a coefficient of 2. For example, the genotype AB can occur when the offspring receives allele A from the mother and allele B from the father or vice versa. Thus the expected frequency is pq + pq = 2pq.
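The formulas above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name hardy_weinberg_frequencies and the example allele frequencies (0.5, 0.3, 0.2) are made up for the demonstration and do not come from any data set in this article.

```python
from itertools import combinations_with_replacement

def hardy_weinberg_frequencies(allele_freqs):
    """Return the theoretical genotype frequencies for a dict of allele frequencies.

    Hypothetical helper for illustration; allele names are assumed to be
    single characters so that a genotype can be written as e.g. "AB".
    """
    if abs(sum(allele_freqs.values()) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    genotypes = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        # Homozygotes (diagonal of the Punnett square) occur one way;
        # heterozygotes (off-diagonal) occur two ways, hence the factor 2.
        factor = 1 if a == b else 2
        genotypes[a + b] = factor * allele_freqs[a] * allele_freqs[b]
    return genotypes

# Made-up allele frequencies, chosen only so the arithmetic is easy to follow.
example = hardy_weinberg_frequencies({"A": 0.5, "B": 0.3, "C": 0.2})
for genotype, freq in example.items():
    print(genotype, round(freq, 4))
print("sum:", round(sum(example.values()), 10))
```

Because the six terms are exactly the expansion of (p + q + r)², the printed sum should come out as 1 up to floating-point error.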
Recovering the Allele Frequencies from the Observed Genotype Frequencies
When studying the genetics of a population, the frequencies of the various alleles for a particular gene are not known beforehand. They must be computed from the observed counts of each genotype. For example, suppose you are studying a population of 120 organisms and the number of each genotype is
obs.num. AA = 49
obs.num. BB = 13
obs.num. CC = 14
obs.num. AB = 10
obs.num. AC = 20
obs.num. BC = 14
Since each individual holds two alleles, the total number of alleles is 2*120 = 240. The number of copies of allele A is twice the number of homozygous AA individuals plus the number of heterozygous AB and AC individuals. Working it out, you get 2*49 + 10 + 20 = 128. The frequency of allele A is this sum divided by 240, that is 128/240 = 0.53333.
Similarly, the frequency of allele B is (2*13 + 10 + 14)/240 = 0.20833. And finally, the frequency of allele C is (2*14 + 20 + 14)/240 = 0.25833.
Thus, you have p = 0.53333, q = 0.20833, and r = 0.25833. You can check that the three frequencies add up to 1 (with some error due to rounding).
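The allele-counting step can be reproduced with a short Python sketch using the observed counts listed above. The function name allele_frequencies is a hypothetical helper, and the code assumes each allele is named by a single letter so that a genotype string such as "AB" can be split into its two alleles.

```python
# Observed genotype counts from the example population of 120 organisms.
observed = {"AA": 49, "BB": 13, "CC": 14, "AB": 10, "AC": 20, "BC": 14}

def allele_frequencies(genotype_counts):
    """Count allele copies across all genotypes and convert to frequencies."""
    allele_counts = {}
    for genotype, count in genotype_counts.items():
        # Each letter of the genotype is one allele copy per individual,
        # so "AA" adds two copies of A and "AB" adds one each of A and B.
        for allele in genotype:
            allele_counts[allele] = allele_counts.get(allele, 0) + count
    total_alleles = sum(allele_counts.values())  # 2 alleles per individual
    return {a: c / total_alleles for a, c in allele_counts.items()}

freqs = allele_frequencies(observed)
for allele in sorted(freqs):
    print(allele, round(freqs[allele], 5))
```

Working with the unrounded fractions (128/240, 50/240, 62/240) avoids the small rounding error mentioned above, so the three frequencies sum to 1 up to floating-point precision.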
Testing If a Population Is in Hardy-Weinberg Equilibrium
Once you have recovered the relative frequencies of each allele from the observed genotype frequencies, you can plug them into the formulas for the theoretical H-W equilibrium genotype frequencies. Then you multiply each of the six by the size of the population to obtain the hypothetical number of individuals for each genotype were the population in Hardy-Weinberg equilibrium
theo.freq. AA = p² = 0.53333*0.53333 = 0.28444
theo.freq. BB = q² = 0.20833*0.20833 = 0.04340
theo.freq. CC = r² = 0.25833*0.25833 = 0.06674
theo.freq. AB = 2pq = 2*0.53333*0.20833 = 0.22222
theo.freq. AC = 2pr = 2*0.53333*0.25833 = 0.27556
theo.freq. BC = 2qr = 2*0.20833*0.25833 = 0.10764
Now if we multiply each of these proportions by 120, we get the number of individuals the population would have under H-W equilibrium
theo.num. AA = 0.28444*120 ≈ 34
theo.num. BB = 0.04340*120 ≈ 5
theo.num. CC = 0.06674*120 ≈ 8
theo.num. AB = 0.22222*120 ≈ 27
theo.num. AC = 0.27556*120 ≈ 33
theo.num. BC = 0.10764*120 ≈ 13
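As a cross-check of the arithmetic, the expected counts can be computed in Python. This sketch assumes the allele frequencies recovered earlier (128/240, 50/240, and 62/240) and the population size of 120; the variable names are chosen here only for illustration.

```python
n = 120  # population size from the example

# Allele frequencies recovered from the observed genotype counts.
p, q, r = 128 / 240, 50 / 240, 62 / 240

# Theoretical Hardy-Weinberg genotype frequencies.
expected_freq = {
    "AA": p * p, "BB": q * q, "CC": r * r,
    "AB": 2 * p * q, "AC": 2 * p * r, "BC": 2 * q * r,
}

# Scale each frequency by the population size to get expected counts.
expected_num = {g: f * n for g, f in expected_freq.items()}

for genotype, count in expected_num.items():
    print(f"{genotype}: {count:.2f} (rounded: {round(count)})")
```

The unrounded expected counts are the ones to carry into any later statistical test; the whole-number values are only convenient for comparison by eye.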
As you can see just from eyeballing the two sets of genotype distributions, the theoretical Hardy-Weinberg values appear to deviate substantially from the actual observed values. In particular, the observations show many more homozygous individuals than would be expected under random mating conditions. There are 49 AA individuals when only 34 are expected, 13 BB when only 5 are expected, and 14 CC when only 8 are expected. This indicates that assortative mating may be occurring, that is, individuals prefer genetically similar mates.
Deviation from expected values is measured mathematically using a chi-square test. When applying the chi-square test to Hardy-Weinberg problems, the number of degrees of freedom is equal to the number of genotypes minus the number of alleles. So in this case there are 6 - 3 = 3 degrees of freedom.
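A minimal sketch of the chi-square calculation for this example follows. It uses the standard statistic, the sum over genotypes of (observed - expected)² / expected, and compares it with 7.815, the usual tabulated critical value for 3 degrees of freedom at the 0.05 significance level. The choice of the 0.05 level and the variable names are assumptions made for this illustration.

```python
# Observed genotype counts from the example population.
observed = {"AA": 49, "BB": 13, "CC": 14, "AB": 10, "AC": 20, "BC": 14}
n = sum(observed.values())  # 120 individuals

# Allele frequencies recovered by counting allele copies (2 per individual).
p = (2 * observed["AA"] + observed["AB"] + observed["AC"]) / (2 * n)
q = (2 * observed["BB"] + observed["AB"] + observed["BC"]) / (2 * n)
r = (2 * observed["CC"] + observed["AC"] + observed["BC"]) / (2 * n)

# Expected counts under Hardy-Weinberg equilibrium (kept unrounded).
expected = {
    "AA": n * p * p, "BB": n * q * q, "CC": n * r * r,
    "AB": n * 2 * p * q, "AC": n * 2 * p * r, "BC": n * 2 * q * r,
}

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)

# Degrees of freedom: number of genotypes minus number of alleles.
degrees_of_freedom = len(observed) - 3

# Tabulated critical value for 3 degrees of freedom at the 0.05 level
# (the significance level is an assumption of this sketch).
CRITICAL_VALUE_DF3_ALPHA_05 = 7.815

print(f"chi-square = {chi_square:.2f}, df = {degrees_of_freedom}")
print("deviates from H-W equilibrium at the 0.05 level:",
      chi_square > CRITICAL_VALUE_DF3_ALPHA_05)
```

If the statistic exceeds the critical value, the hypothesis that the population is in Hardy-Weinberg equilibrium is rejected at that significance level, which for these counts agrees with the impression from comparing the observed and expected numbers by eye.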