different chemical structure of the side groups (in blue). Also shown in brackets is the three-letter designation of each amino acid.
Source: Reproduced with permission of Wikimedia Commons.
Although more than 500 amino acids are known in nature, only 20 are commonly used by life (Figure 4.3), with two others (selenocysteine and pyrrolysine) used more rarely. Life therefore uses a very select number, and we don't really know why. It might be like asking why someone building a house doesn't use all the wonderful variety of bricks available from their local garden store. Using all of them would make no sense: the brick types would be incompatible with one another, and the job would become needlessly complex. If 20 amino acids allow a life form to come into existence and reproduce, then there is no evolutionary selection pressure to use more. Another possibility is that these 20 are the best of all the amino acids that life could have used: their biochemical characteristics may have favored their use in the earliest types of cells. You might like to read the following Discussion Point, which investigates this idea further.
Discussion Point: Why Does Life Use the 20 Particular Amino Acids it Has?
There are hundreds of amino acids in nature, but life usually uses only 20; on rare occasions, it uses two additional ones. Is this a chance outcome of evolutionary processes that could have picked many other combinations? This question has been investigated by a number of researchers. One study started with 50 amino acids naturally found in meteorites (assuming that meteorites were one plausible source of amino acids when life first emerged). A computer program then sampled sets of amino acids from these 50 and scored how broadly each set covered several characteristics important for building a protein: (i) a range of sizes, (ii) different charges, and (iii) a range of hydrophobicities (a tendency to repel water). Remarkably, when sets are ranked by how well they cover these characteristics, our own selection of 20 amino acids is the best in a million different possible combinations. Comparisons with other permutations and combinations of amino acids also show that the terrestrial set is unusual as a toolbox for building proteins. You might like to investigate the literature at the end of this Discussion Point and explore for yourself how the amino acids for life emerged and what selection pressures might have resulted in the amino acids found in proteins. How much chance do you think there was in this selection? If life emerged on another planet and used proteins, do you think it would end up using the same amino acids?
Freeland, S.J. and Hurst, L.D. (1998). The genetic code is one in a million. Journal of Molecular Evolution 47: 238–248.
Freeland, S.J., Knight, R.D., Landweber, L.F. et al. (2000). Early fixation of an optimal genetic code. Molecular Biology and Evolution 17: 511–518.
Philip, G.K. and Freeland, S.J. (2011). Did evolution select a nonrandom “alphabet” of amino acids? Astrobiology 11: 235–240.
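The kind of computational test described in this Discussion Point can be sketched in a few lines of Python. The sketch is an illustration only and is not the method of the studies cited above: the scoring function `coverage`, the helper `fraction_as_good`, and the numbers in the usage lines are all assumptions made for the example. A real analysis would use measured sizes, charges, and hydrophobicities for each candidate amino acid.

```python
import random
import statistics

def coverage(values):
    """Illustrative score that rewards a wide range covered in even steps.

    This formula is an assumption for the sketch, not the statistic
    used in the published studies.
    """
    ordered = sorted(values)
    spread = ordered[-1] - ordered[0]
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return spread / (1.0 + statistics.pstdev(gaps))

def fraction_as_good(pool, chosen, trials=1000, seed=0):
    """Fraction of random same-size subsets of `pool` that score at
    least as well as `chosen` on the coverage score."""
    rng = random.Random(seed)
    target = coverage(chosen)
    hits = 0
    for _ in range(trials):
        sample = rng.sample(pool, len(chosen))
        if coverage(sample) >= target:
            hits += 1
    return hits / trials

# Hypothetical property values for a pool of 50 candidates (placeholders,
# not measurements) and a chosen set that spans the pool in even steps.
pool = list(range(50))
chosen = [0, 7, 14, 21, 28, 35, 42, 49]
fraction = fraction_as_good(pool, chosen)
```

A small value of `fraction` means that few random sets cover the property as well as the chosen set, which is the sense in which a set can be called unusual. The same comparison could be repeated for each property in turn.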
In Figure 4.3 you can see the variety of amino acids created by changing the −R side group. As well as being designated by their full names, amino acids are also given a three-letter code (Figure 4.3) and a one-letter code (Appendix A.6). The −R groups confer upon amino acids different properties. Some are polar, which means they like to dissolve in water on account of their ability to take part in hydrogen bonding. They include serine, asparagine, and histidine. Some are hydrophobic (they repel water), such as alanine, valine, and leucine. At the pH of the cell, some are positively charged (such as lysine and histidine) and some are negatively charged (such as aspartate and glutamate) and can therefore form ionic links as part of salt bridges (see Chapter 3). These different properties alter the way in which amino acids along a protein chain interact with one another and thus how a long chain of amino acids will fold together to make a three-dimensional molecule.
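The naming and grouping described above can be written as a small lookup table. This is a minimal sketch covering only the amino acids named as examples in the text; the names `AMINO_ACIDS`, `to_three_letter`, and `can_form_salt_bridge` are hypothetical, and the grouping follows the text rather than any quantitative scale.

```python
# One-letter code -> (three-letter code, property classes named in the text).
# Histidine is listed in the text as both polar and positively charged,
# so it carries both tags here.
AMINO_ACIDS = {
    "S": ("Ser", {"polar"}),
    "N": ("Asn", {"polar"}),
    "H": ("His", {"polar", "positive"}),
    "A": ("Ala", {"hydrophobic"}),
    "V": ("Val", {"hydrophobic"}),
    "L": ("Leu", {"hydrophobic"}),
    "K": ("Lys", {"positive"}),
    "D": ("Asp", {"negative"}),
    "E": ("Glu", {"negative"}),
}

def to_three_letter(sequence):
    """Convert a one-letter sequence into hyphen-joined three-letter codes."""
    return "-".join(AMINO_ACIDS[residue][0] for residue in sequence)

def can_form_salt_bridge(residue_a, residue_b):
    """True if one residue is tagged positive and the other negative."""
    classes_a = AMINO_ACIDS[residue_a][1]
    classes_b = AMINO_ACIDS[residue_b][1]
    return (("positive" in classes_a and "negative" in classes_b)
            or ("negative" in classes_a and "positive" in classes_b))

print(to_three_letter("KDA"))
print(can_form_salt_bridge("K", "D"))
```

The check in `can_form_salt_bridge` only compares charge labels. Whether two residues actually form a salt bridge in a folded protein also depends on how close they end up in three dimensions.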
Although Figure 4.3 correctly shows the general structure of amino acids, their behavior in the cell is subtler. At cellular pH (close to 7), the amino group tends to attract a proton from the carboxyl group, so that the amino group carries a positive charge and the carboxyl group a negative charge (Figure 4.4). The overall molecule has zero net charge, but the charge is unevenly distributed across it. Such a molecule is called a zwitterion, and amino acids are an important example. In reality, things are even more complicated than this, since the amino acid also interacts with water molecules in the cell, and protons can be exchanged between its charged groups and the surrounding water. Thus, the interplay between water molecules, the local pH, and the amino acid subtly modulates the chemical behavior of amino acids. This is also true of the −R group, whose charge and chemical interactions are modified by the surrounding pH. These intricacies are of enormous importance in modifying the biochemical behavior and function of proteins.
Figure 4.4 Amino acids are zwitterions. At cellular pH, they have the structure above.
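The zwitterion picture can be checked with a short calculation based on the Henderson-Hasselbalch relation, which gives the fraction of a group that holds its proton at a given pH. This is a minimal sketch: it assumes approximate textbook pKa values for glycine (about 2.34 for the carboxyl group and 9.60 for the amino group), ignores the −R group, and uses hypothetical function names.

```python
def fraction_protonated(pka, ph):
    """Henderson-Hasselbalch: fraction of a group that holds its proton."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

# Approximate pKa values for glycine (assumed for this sketch);
# other amino acids differ slightly.
PKA_CARBOXYL = 2.34
PKA_AMINO = 9.60

def backbone_charges(ph):
    """Average charge on the amino group, the carboxyl group, and their sum."""
    # The amino group is positively charged when protonated.
    amino = +1.0 * fraction_protonated(PKA_AMINO, ph)
    # The carboxyl group is negatively charged when deprotonated.
    carboxyl = -1.0 * (1.0 - fraction_protonated(PKA_CARBOXYL, ph))
    return amino, carboxyl, amino + carboxyl

amino, carboxyl, net = backbone_charges(7.0)
print(f"amino {amino:+.3f}, carboxyl {carboxyl:+.3f}, net {net:+.3f}")
```

At pH 7 both groups are almost fully charged and the two charges nearly cancel, which matches the zwitterion description. Calling `backbone_charges` with a low pH such as 1.0 shows the carboxyl group regaining its proton and the molecule becoming positively charged overall.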
Amino acids are joined into chains through peptide bonds: the carboxyl group of one amino acid reacts with the amino group of the next to form a bond, releasing a water molecule in a dehydration reaction (Figure 4.5). This process of polymerization can continue until proteins are built that contain many hundreds of amino acids. The result of this linking together of amino acids is a protein chain, sometimes called a polypeptide. The sequence of amino acids that makes up the chain is called the primary sequence. This sequence could be written down from either end of the chain, which would cause confusion. By convention, it is reported from the end that has the free amino group (the N-terminus) through to the other end, which has the free carboxyl group (the C-terminus).
Figure 4.5 The formation of a peptide bond between two amino acids. This dehydration reaction (involving the release of a water molecule) allows for the assembly of polypeptide protein chains.
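Because each peptide bond releases one water molecule, the mass of a chain is the sum of the masses of its free amino acids minus one water for every bond. The following sketch illustrates that bookkeeping for a few amino acids. The function name `polypeptide_mass` is hypothetical, and the molar masses are rounded standard values for the free amino acids.

```python
WATER = 18.015  # g/mol, released for each peptide bond formed

# Molar masses (g/mol) of the free amino acids, keyed by one-letter code.
FREE_MASS = {"G": 75.07, "A": 89.09, "S": 105.09}

def polypeptide_mass(sequence):
    """Molar mass of a linear chain written from N-terminus to C-terminus."""
    if not sequence:
        raise ValueError("sequence must contain at least one amino acid")
    n_bonds = len(sequence) - 1  # a chain of n residues has n - 1 peptide bonds
    return sum(FREE_MASS[residue] for residue in sequence) - n_bonds * WATER

# Glycine followed by alanine: one peptide bond, so one water is lost.
print(round(polypeptide_mass("GA"), 3))
```

The sequences "GA" and "AG" give the same mass but describe different molecules, since a different amino acid carries the free amino group in each. This is why the N-terminus to C-terminus writing convention matters.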
The exact sequence of the amino acids determines what the protein will do in the cell. An obvious question to ask is how this long chain of amino acids is turned into something useful.
Some of the charged amino acids at different places on the chain bind to one another through ionic bonds (e.g. the negatively charged aspartic acid binds ionically to the positively charged lysine), as we saw in Chapter 3. Some amino acids form covalent bonds; for example, two sulfur-containing cysteine amino acids form a disulfide bridge, as we also discussed in Chapter 3. Through interactions such as these, the primary sequence comes together to form hairpins, helices, and other structures, which are referred to as the secondary structure. The complete atomic arrangement within a whole protein is called the tertiary structure. It is this three-dimensional structure that can now do useful biological work. Sometimes individual proteins come together to make an even larger protein. These protein subunits form a multimeric structure, and we refer to this arrangement as the quaternary structure.
In enzymes, the three-dimensional arrangement of amino acids has evolved to bind substrates, carry out a reaction, and then release the products, repeating this cycle many times in succession. The site within a protein where the amino acids are configured so that their side groups can bind reactants and catalyze a chemical reaction is called the active site.
4.6 Chirality
An important feature of amino acids is that