In genetics, an allele is a variant of a gene. A genotype is a pair of alleles, one on each chromosome of a chromosome pair. When there are three alleles *A*, *B*, and *C*, the possible genotypes are *AA*, *BB*, *CC*, *AB*, *AC*, and *BC*. The genotypes *AA*, *BB*, and *CC* are homozygous while *AB*, *AC*, and *BC* are heterozygous.

If a population has three alleles *A*, *B*, and *C* for a particular gene, and their respective frequencies are *p*, *q*, and *r*, the key question a biologist would ask is whether the corresponding genotypes are in Hardy-Weinberg equilibrium.

Hardy-Weinberg equilibrium is the set of theoretical genotype frequencies that a population reaches when mating is purely random (neither assortative nor disassortative) and there is no mutation, selection, or gene flow. If the actual genotype frequencies deviate significantly from the theoretical Hardy-Weinberg frequencies, at least one of these assumptions most likely does not hold. In this case, population biologists would look for evidence of gene mutation, non-random mating patterns, etc.

### Hardy-Weinberg Frequencies for Three Alleles and Punnett Square

Suppose the relative frequencies of the alleles *A*, *B*, and *C* are *p*, *q*, and *r* respectively. Since the proportions must add up to the whole, we know that *p + q + r* = 1. The theoretical relative frequencies of the genotypes *AA*, *BB*, *CC*, *AB*, *AC*, and *BC* are

theo.freq. *AA* = *p*²

theo.freq. *BB* = *q*²

theo.freq. *CC* = *r*²

theo.freq. *AB* = 2*pq*

theo.freq. *AC* = 2*pr*

theo.freq. *BC* = 2*qr*

These proportions come from the expansion of the product

(*p + q + r*)² = *p*² + *q*² + *r*² + 2*pq* + 2*pr* + 2*qr*

Since *p + q + r* = 1, we also have (*p + q + r*)² = 1, so the theoretical genotype frequencies sum to the whole as well. The theoretical frequencies can be visualized with a Punnett square for three alleles.

Notice how the homozygous genotypes are along the diagonal and the heterozygous genotypes are off-diagonal. Since there are two ways to obtain each heterozygous pair, their frequencies have a coefficient of 2. For example, the genotype *AB* can occur when the offspring receives allele *A* from the mother and allele *B* from the father or vice versa. Thus the expected frequency is *pq + pq* = 2*pq*.
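The square can also be built programmatically. The following Python sketch enumerates every ordered (maternal, paternal) allele pair and then merges pairs such as *AB* and *BA* into a single genotype. The function names and the example frequencies 0.5, 0.3, and 0.2 are illustrative assumptions, not values from a real population.

```python
from itertools import product

def punnett_square(freqs):
    """Map each ordered (maternal, paternal) allele pair to its probability."""
    return {(m, f): freqs[m] * freqs[f] for m, f in product(freqs, repeat=2)}

def genotype_frequencies(freqs):
    """Merge ordered pairs into unordered genotypes such as 'AB'."""
    totals = {}
    for (m, f), prob in punnett_square(freqs).items():
        genotype = "".join(sorted((m, f)))  # 'BA' and 'AB' both become 'AB'
        totals[genotype] = totals.get(genotype, 0.0) + prob
    return totals

# Hypothetical allele frequencies, chosen only for illustration.
example = {"A": 0.5, "B": 0.3, "C": 0.2}
geno = genotype_frequencies(example)
for genotype, freq in sorted(geno.items()):
    print(genotype, round(freq, 4))
```

Each heterozygous genotype collects two off-diagonal cells of the square, which is where the coefficient of 2 comes from.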

### Recovering the Allele Frequencies from the Observed Genotype Frequencies

When studying the genetics of a population, the frequencies of the various alleles for a particular gene are not known beforehand. They must be computed from the observed counts of each genotype. For example, suppose you are studying a population of 120 organisms and the number of each genotype is

obs.num. *AA* = 49

obs.num. *BB* = 13

obs.num. *CC* = 14

obs.num. *AB* = 10

obs.num. *AC* = 20

obs.num. *BC* = 14

Since each individual holds two alleles, the total number of alleles is 240. The number of copies of allele *A* is twice the number of homozygous *AA* individuals plus the number of heterozygous *AB* and *AC* individuals. Working it out, you get 2 × 49 + 10 + 20 = 128. The frequency of allele *A* is this sum divided by 240, that is, 128/240 = 0.53333.

Similarly, the frequency of allele *B* is (2 × 13 + 10 + 14)/240 = 0.20833. And finally, the frequency of allele *C* is (2 × 14 + 20 + 14)/240 = 0.25833.

Thus, you have *p* = 0.53333, *q* = 0.20833, and *r* = 0.25833. You can check that the three frequencies add up to 1 (with some error due to rounding).
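The allele-counting step can be sketched in a few lines of Python. The dictionary layout and the function name `allele_frequencies` are assumptions made for this example; the counts are the ones listed above.

```python
# Observed genotype counts from the example population of 120 organisms.
observed = {"AA": 49, "BB": 13, "CC": 14, "AB": 10, "AC": 20, "BC": 14}

def allele_frequencies(genotype_counts):
    """Return the relative frequency of each allele from genotype counts."""
    total_alleles = 2 * sum(genotype_counts.values())  # two alleles per individual
    copies = {}
    for genotype, n in genotype_counts.items():
        # A homozygote such as 'AA' adds n twice for the same allele.
        for allele in genotype:
            copies[allele] = copies.get(allele, 0) + n
    return {allele: c / total_alleles for allele, c in copies.items()}

freqs = allele_frequencies(observed)
p, q, r = freqs["A"], freqs["B"], freqs["C"]
print(round(p, 5), round(q, 5), round(r, 5))
```

Because the code keeps full floating-point precision instead of rounding to five decimals, the three frequencies sum to 1 up to floating-point error.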

### Testing If a Population Is in Hardy-Weinberg Equilibrium

Once you have recovered the relative frequencies of each allele from the observed genotype frequencies, you can plug them into the formulas for the theoretical H-W equilibrium genotype frequencies. Then you multiply each of the six by the size of the population to obtain the hypothetical number of individuals for each genotype were the population in Hardy-Weinberg equilibrium. The theoretical frequencies are

theo.freq. *AA* = *p*² = 0.53333 × 0.53333 = 0.28444

theo.freq. *BB* = *q*² = 0.20833 × 0.20833 = 0.04340

theo.freq. *CC* = *r*² = 0.25833 × 0.25833 = 0.06674

theo.freq. *AB* = 2*pq* = 2 × 0.53333 × 0.20833 = 0.22222

theo.freq. *AC* = 2*pr* = 2 × 0.53333 × 0.25833 = 0.27556

theo.freq. *BC* = 2*qr* = 2 × 0.20833 × 0.25833 = 0.10764

Now if we multiply each of these proportions by 120 and round to the nearest whole number, we get the number of individuals the population would have under H-W equilibrium:

theo.num. *AA* = 0.28444 × 120 ≈ 34

theo.num. *BB* = 0.04340 × 120 ≈ 5

theo.num. *CC* = 0.06674 × 120 ≈ 8

theo.num. *AB* = 0.22222 × 120 ≈ 27

theo.num. *AC* = 0.27556 × 120 ≈ 33

theo.num. *BC* = 0.10764 × 120 ≈ 13
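The same expected counts can be reproduced with a short Python sketch. The function name `expected_counts` is an assumption for this example, and the allele frequencies are entered as the exact fractions behind *p*, *q*, and *r* rather than their five-decimal roundings.

```python
from itertools import combinations_with_replacement

def expected_counts(allele_freqs, population_size):
    """Return the Hardy-Weinberg expected count for every genotype."""
    counts = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        coefficient = 1 if a == b else 2  # heterozygotes can arise two ways
        counts[a + b] = coefficient * allele_freqs[a] * allele_freqs[b] * population_size
    return counts

# Allele frequencies recovered from the observed genotype counts.
freqs = {"A": 128 / 240, "B": 50 / 240, "C": 62 / 240}
expected = expected_counts(freqs, 120)
rounded = {genotype: round(n) for genotype, n in expected.items()}
print(rounded)
```

The unrounded expected counts sum to the population size of 120, which gives a quick sanity check on the calculation.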

Even without a formal test, comparing the two sets of genotype counts shows that the observed values deviate considerably from the theoretical Hardy-Weinberg values. In particular, the observations show many more homozygous individuals than would be expected under random mating conditions. There are 49 *AA* individuals when only 34 are expected, 13 *BB* when only 5 are expected, and 14 *CC* when only 8 are expected. This indicates that assortative mating may be occurring, that is, individuals prefer genetically similar mates.

Deviation from expected values is measured mathematically using a chi-square test. When applying the chi-square test to Hardy-Weinberg problems, the number of degrees of freedom is equal to the number of genotypes minus the number of alleles. So in this case there are 6 - 3 = 3 degrees of freedom.
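As a rough illustration, the Python sketch below computes the standard Pearson chi-square statistic, the sum over genotypes of (observed - expected)² / expected, for the example population. The function name is hypothetical. The 0.05 significance level is an assumption of this sketch, and 7.815 is the corresponding critical value for 3 degrees of freedom from a standard chi-square table.

```python
# Observed genotype counts from the example population of 120 organisms.
observed = {"AA": 49, "BB": 13, "CC": 14, "AB": 10, "AC": 20, "BC": 14}

def hardy_weinberg_chi_square(genotype_counts):
    """Return (chi-square statistic, degrees of freedom) for a H-W test.

    Assumes every possible genotype has an entry in genotype_counts and
    that every allele occurs at least once, so no expected count is zero.
    """
    n = sum(genotype_counts.values())
    # Allele frequencies: each individual contributes two of the 2n alleles.
    freq = {}
    for genotype, count in genotype_counts.items():
        for allele in genotype:
            freq[allele] = freq.get(allele, 0.0) + count / (2 * n)
    chi_square = 0.0
    for genotype, count in genotype_counts.items():
        a, b = genotype
        expected = (1 if a == b else 2) * freq[a] * freq[b] * n
        chi_square += (count - expected) ** 2 / expected
    degrees_of_freedom = len(genotype_counts) - len(freq)
    return chi_square, degrees_of_freedom

# Critical value for 3 degrees of freedom at the assumed 0.05 level.
CRITICAL_VALUE_DF3_ALPHA_005 = 7.815

chi_square, dof = hardy_weinberg_chi_square(observed)
rejects_equilibrium = chi_square > CRITICAL_VALUE_DF3_ALPHA_005
print(round(chi_square, 2), dof, rejects_equilibrium)
```

For these counts the statistic comes out to about 38, far above the critical value, so under the assumed significance level the hypothesis of Hardy-Weinberg equilibrium would be rejected.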