Replicating And Repairing The Genome: From Basic Mechanisms To Modern Genetic Technologies
The structure of the polymerase domain resembles a partially closed right hand, with analogs of a palm, finger region, and thumb (Figure 2.1A). The active site for the polymerase reaction is within the palm subdomain, and the DNA (blue in Figure 2.1A) passes through the enzyme under the grip of the thumb. This right-hand architecture is common to nearly all DNA polymerases, even those belonging to different families.
The host protein thioredoxin binds to T7 DNA polymerase on an extension of the polymerase thumb (Figure 2.1A; thioredoxin in red). As will be discussed below, this binding enhances the activity of the polymerase, allowing it to replicate a much longer stretch of DNA. As you might know, thioredoxin is a coenzyme involved in redox reactions in the cell, but this redox function is not needed for T7 DNA replication. Instead, thioredoxin acts as a structural protein in its complex with T7 DNA polymerase, increasing polymerase processivity and configuring the polymerase to bind the DNA helicase (see below).1
Figure 2.1. X-ray crystallographic structures of the three major replication proteins of bacteriophage T7. Like other DNA polymerases, the structure of T7 DNA polymerase (yellow) resembles a right hand, with palm, finger, and thumb domains (panel A). Appended to the thumb is a binding domain for the host thioredoxin protein (red); the primer-template DNA is in blue. These and many other structures throughout the book are from the RCSB Protein Data Bank (www.rcsb.org; Berman et al., 2000). Structure (i) is shown in the spacefill format and structure (ii) in the cartoon format, which represents α-helices as helices made from flat ribbons and β-sheets as flat ribbons that terminate with an arrow. Both images are PDB structure 1SKR (Li et al., 2004). The T7 helicase (panel B) crystallized in several different forms. The image shown in structure (i) (spacefill) and (ii) (cartoon) is a hexamer of a truncated (helicase-only) form, with alternating subunits shown in different shades of blue. The bound nucleotide is shown in red in structure (ii), highlighting the active sites for nucleotide hydrolysis between adjacent subunits. These two images are from the RCSB PDB (www.rcsb.org), PDB ID 1EOJ (Singleton et al., 2000). Full-length helicase/primase without DNA crystallizes as a heptamer. The heptameric forms in structures (iii) and (iv) (both spacefill, 90° rotation between the two) are from the RCSB PDB (www.rcsb.org), PDB ID 1Q57 (Toth et al., 2003). Other closely related forms will be discussed later in this chapter. The T7 ssDNA-binding protein (yellow with α-helices in blue) is composed of an OB fold and two short α-helices (panel C). This protein structure image (PDB 1JE5) is reproduced from Kulczyk and Richardson (2016), with permission from Elsevier; permission conveyed by Copyright Clearance Center, Inc. The location of the DNA-binding cleft is indicated based on the structure of a complex between an evolutionarily related ssDNA-binding protein and DNA (Cernooka et al., 2017).
When we counted only four different proteins in T7 DNA replication, we stretched the truth just a bit. It turns out that the gene encoding the T7 helicase/primase protein actually makes two closely related proteins in roughly equal amounts. One of these is missing the N-terminal 63 amino acids but otherwise has exactly the same amino acid sequence as the C-terminal remainder of the larger protein. The two forms result from the use of two different translation initiation codons in the same reading frame. As will be discussed below, the functional form of the helicase/primase during replication is a hexamer (6-mer). Some evidence suggests that the functional hexamer has three subunits of each of the two protein species in vivo; however, this stoichiometry is not required for assembly of helicase complexes in vitro.
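The way two in-frame initiation codons yield a full-length protein and an N-terminally shortened one can be sketched in a few lines of Python. This is only an illustration: the mRNA string below is a made-up toy sequence, not the T7 helicase/primase gene, and the helper name `read_codons` is hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def read_codons(mrna: str, start: int) -> list[str]:
    """Read codons from `start` up to, but not including, the first in-frame stop."""
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons

# Hypothetical toy mRNA with two AUG codons in the same reading frame.
mrna = "GGAUGGCUAAAAUGCCGGAAUAAGG"

first_start = mrna.find("AUG")
second_start = mrna.find("AUG", first_start + 1)

# Both starts share a reading frame only if they are a multiple of 3 apart.
assert (second_start - first_start) % 3 == 0

long_form = read_codons(mrna, first_start)
short_form = read_codons(mrna, second_start)

# The short product is the C-terminal remainder of the long product.
missing = len(long_form) - len(short_form)
print(long_form)
print(short_form)
print("codons missing from the N-terminus:", missing)
```

In this toy case the shorter product lacks three N-terminal codons; for the T7 protein described above, the corresponding number is 63 amino acids.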
The helicase function of the helicase/primase protein is carried out by the C-terminal segment, while the primase function resides in the N-terminal region. The smaller form of the protein discussed above is missing a portion of the primase region and, correspondingly, lacks primase activity.
X-ray crystallography has revealed several related structures of full-length or truncated versions of the T7 helicase/primase. A truncated form with only the helicase domain was found to form a beautiful hexameric ring or donut structure, with a hole in the middle large enough to accommodate ssDNA (Figure 2.1B, images i and ii). The full-length helicase/primase in the absence of DNA forms heptamers (7-mers), and a crystal structure of the heptamer again reveals a ring with a hole in the middle (Figure 2.1B, images iii and iv). Image iii shows the helicase domains of the heptamer, looking down on the protein from the C-terminal side. When the complex is rotated by 90° into the plane of the paper, the primase domains are seen to extend downward from the helicase heptamer (image iv). As mentioned above, the native complex during bacteriophage T7 infections may well contain three each of the full-length and N-terminally truncated subunits, but no structure yet exists for this hybrid form of the complex. We will return to structural variations in the T7 helicase/primase complex when we discuss the mechanism of helicase unwinding and the function of the replisome complex.
The fourth protein is the T7 ssDNA-binding protein, a 232-amino acid protein with a fairly simple structure (Figure 2.1C). The central part of the protein consists of a β-sheet with five strands oriented in a characteristic pattern called an OB-fold (oligosaccharide/oligonucleotide-binding fold). As the fold name implies, this is a common structural motif that a variety of proteins use for binding ssDNA, which occupies a cleft with the OB-fold at the base (Figure 2.1C). The C-terminal region of T7 ssDNA-binding protein, which is quite acidic, extends away from the body of the protein and is known to interact with the other T7 replication proteins (see below).
2.3 Activities of T7 DNA polymerase in the replisome
Many features of T7 DNA polymerase are conserved in other replicative DNA polymerases, making the enzyme a good model. The enzyme tightly grips the primer-template using all three subdomains (palm, finger, and thumb; Figure 2.1A). As introduced in Chapter 1, nearly all DNA polymerases require a primer for extension, and the single-stranded template is used to determine which of the four DNA nucleotides will be added to the end of the primer by base pairing (Figure 1.4). The active site of the enzyme, where the new nucleotide is added, is within the palm subdomain near the base of the finger subdomain (Figure 2.1A). The active site region is tightly organized to strongly favor the correct base pairing of the incoming nucleotide and disfavor the three possible mispairs, contributing to the high fidelity of the enzyme.
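The template-directed choice of each incoming nucleotide can be illustrated with a minimal Python sketch. It models only the base-pairing rule, not the enzyme; the sequences and the function name `extend_primer` are hypothetical.

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def extend_primer(template_3to5: str, primer_5to3: str) -> str:
    """Extend a primer along a template, one nucleotide per cycle.

    The template is written 3'->5' so that position i of the template
    pairs with position i of the growing strand (written 5'->3').
    """
    if len(primer_5to3) > len(template_3to5):
        raise ValueError("primer is longer than the template")
    # The primer must already be base-paired to the start of the template.
    for i, base in enumerate(primer_5to3):
        if COMPLEMENT[template_3to5[i]] != base:
            raise ValueError(f"primer mismatch at position {i}")
    product = list(primer_5to3)
    # Each cycle reads the next unpaired template base and adds its
    # pairing partner to the 3' end of the growing strand.
    for template_base in template_3to5[len(primer_5to3):]:
        product.append(COMPLEMENT[template_base])
    return "".join(product)

template = "TACGGATTC"  # hypothetical template, written 3'->5'
primer = "ATG"          # hypothetical primer, written 5'->3'
print(extend_primer(template, primer))  # ATGCCTAAG
```

The sketch refuses to start without a correctly paired primer, mirroring the primer requirement noted above.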
The geometry of the active site aligns the incoming nucleotide for proper base pairing and excludes the three possible incorrect nucleotides, which don’t fit the active site region very well. Detailed biochemical and structural studies have provided beautiful insights into this base selection process, the details of which are beyond the scope of this chapter. The fidelity of the polymerase step is very high but not perfect, and an incorrect base is inserted roughly once every 20,000 or so incorporation cycles. Incorrect base insertion is probably impossible to avoid. For example, tautomeric forms of the DNA bases exist transiently at a very low level, and these can pair with the “wrong” template base and thereby fit reasonably well within the polymerase active site. Without some additional mechanism to correct these misinsertions, mutation rates would be very high and complex cells might never have evolved.
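To get a feel for what an error rate of about one in 20,000 means, the following Python sketch estimates how many misinsertions would accumulate if nothing corrected them. The template length of 40,000 nucleotides is an assumption (roughly the size of the T7 genome, which is not stated in this section), and the calculation treats each incorporation as an independent event.

```python
INCORPORATIONS_PER_ERROR = 20_000  # roughly one misinsertion per 20,000 cycles (from the text)
TEMPLATE_LENGTH = 40_000           # assumption: approximate T7 genome length in nucleotides

def expected_misinsertions(n_bases: int) -> float:
    """Average number of misinsertions when copying n_bases without correction."""
    return n_bases / INCORPORATIONS_PER_ERROR

def prob_error_free(n_bases: int) -> float:
    """Probability that every one of n_bases independent incorporations is correct."""
    per_base_fidelity = 1 - 1 / INCORPORATIONS_PER_ERROR
    return per_base_fidelity ** n_bases

print(expected_misinsertions(TEMPLATE_LENGTH))     # 2.0
print(round(prob_error_free(TEMPLATE_LENGTH), 3))  # 0.135
```

Under these assumptions, a single pass over a template of this length would leave about two uncorrected errors on average, and fewer than one copy in seven would be error-free.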
The T7 DNA polymerase illustrates one of the major correction mechanisms/pathways that explain the extremely low mutation rate seen in genome replication (see Chapter 1). The exonuclease domain of the polymerase, introduced above, plays the major function of correcting most of the