Group of authors

Genome Editing in Drug Discovery



is the basis for CRISPR classification (Makarova et al. 2019). Furthermore, many genes not evolutionarily related to cas genes can also be found within CRISPR loci. These genes often encode proteins that provide ancillary functions, or are evolutionary relics with no role attributed to them yet; some loci also encode noncoding RNAs. A single genome may also carry multiple CRISPR loci: E. coli has four (Touchon and Rocha 2010) and Methanocaldococcus jannaschii has up to 18 (Lillestol et al. 2006), although rarely more than one or two are active simultaneously (Horvath et al. 2008).

      All the components of CRISPR loci function together to provide an adaptive immune response. The CRISPR immune response can be divided into three phases (Figure 3.1):

      1 Adaptation (spacer acquisition, or immunization). On rare occasions during infection by a bacteriophage or another mobile element, suitable pieces of the invader’s genome are captured by the microbe’s adaptation machinery and integrated into the CRISPR array as a spacer (Figure 3.1b). This acquisition of spacers from previously unencountered invaders is known as naïve adaptation. Once acquired, the spacer acts as a heritable record of immunization and, through the next stages of the immune response, restricts mobile elements that carry an identical or similar sequence.

      2 Expression (crRNA biogenesis). Once acquired, the spacer can be used to fight future infections (Figure 3.1c). The whole CRISPR array is transcribed from the promoter located within the leader sequence, producing a pre‐crRNA transcript. With the help of Cas proteins, other small RNA molecules, or the host’s machinery, the pre‐crRNA is processed into mature crRNAs, which pair with the effector Cas proteins to form a functional effector complex that confers immunity.

      3 Interference (immunity) is carried out by the assembled Cas:crRNA effector complex. Invading genomes that carry a sequence complementary (or partially complementary) to one of the spacers are recognized through base pairing with the crRNA and subsequently degraded by the nucleolytic activity of the associated Cas proteins, thus terminating the infection (Figure 3.1d).
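      The three phases above can be illustrated with a deliberately simplified sketch. Everything in it is an assumption made for illustration: the class name ToyCrisprLocus, the 20-nucleotide spacer length, the example phage sequence, and the mismatch-counting rule are hypothetical, and PAM recognition, repeats, and crRNA processing are ignored. It is a toy model of the logic, not a description of any real Cas system.

```python
from dataclasses import dataclass, field

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


@dataclass
class ToyCrisprLocus:
    """Hypothetical toy model: a CRISPR array reduced to its list of spacers."""
    spacers: list[str] = field(default_factory=list)

    def adapt(self, invader: str, start: int, length: int = 20) -> str:
        """Phase 1: copy a piece of the invader genome into the array."""
        spacer = invader[start:start + length]
        # Modeling choice: the newest spacer goes to the front of the list.
        self.spacers.insert(0, spacer)
        return spacer

    def express(self) -> list[str]:
        """Phase 2: 'transcribe' each spacer into a crRNA (T becomes U)."""
        return [s.replace("T", "U") for s in self.spacers]

    def interfere(self, genome: str, max_mismatches: int = 0) -> bool:
        """Phase 3: report whether any crRNA recognizes the genome.

        A stretch identical to the spacer on either strand is treated as a
        target, which stands in for crRNA base pairing with the opposite strand.
        """
        for crrna in self.express():
            guide = crrna.replace("U", "T")
            n = len(guide)
            for strand in (genome, reverse_complement(genome)):
                for i in range(len(strand) - n + 1):
                    if mismatches(strand[i:i + n], guide) <= max_mismatches:
                        return True
        return False


# Hypothetical invader sequence, made up for this example.
phage = "ATGCGTACCGGTTAGCTAGGCTAACGTTAGCCATGCAATCG"
locus = ToyCrisprLocus()
naive = locus.interfere(phage)      # no spacer acquired yet
locus.adapt(phage, start=5)         # immunization
immune = locus.interfere(phage)     # spacer now matches the invader
print(naive, immune)
```

      The max_mismatches parameter mirrors the statement that partially complementary sequences can also be recognized; how many mismatches a real effector complex tolerates depends on the system and is not modeled here.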

      In some circumstances, the degraded nucleic acids can be captured by the adaptation complex and integrated as a new spacer into the CRISPR array, restarting the process. This primed adaptation (Figure 3.1e), in which an invading genome is neutralized by a previously acquired spacer and the interference machinery actively takes part in acquisition, is several orders of magnitude more efficient than naïve adaptation (Staals et al. 2016; Stringer et al. 2020), and is a striking example of adaptive immunity.

      One challenge for an immune system that relies on nucleic acid base recognition is how to discriminate between the invading genome and endogenous sequences (for example, the spacers in the CRISPR array). Nearly all CRISPR systems have a discrimination mechanism in which a short sequence adjacent to the target sequence must be recognized by the effector complex before it can efficiently bind and then degrade the target (Garneau et al. 2010; Sashital et al. 2012; Anders et al. 2014). Similarly, these species‐specific protospacer adjacent motifs (PAMs) are recognized by the adaptation complex and processed so that they are not integrated into the CRISPR array (Datsenko et al. 2012; Wang et al. 2015; Rollie et al. 2018). The presence of a PAM adjacent to the target sequence (termed the protospacer) and its absence from the CRISPR array ensure that invading genomes are correctly recognized as nonself and prevent cleavage of the host genome. It is important to note that the exact PAM sequence required for the interference and adaptation stages varies dramatically between species (for example, for Streptococcus pyogenes the PAM is 5’‐NGG‐3’, while for Staphylococcus aureus it is 5’‐NNGRRT‐3’, where N denotes any nucleotide and R is A or G) and between different taxa of CRISPR systems (Mojica et al. 2009; Shah et al. 2013). PAM requirements can often be fairly liberal (Leenay et al. 2016), allowing the immune response to remain effective even if mutations arise within the PAM or protospacer.
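      The PAM requirement can be made concrete with a short sketch that scans a DNA sequence for candidate protospacers. The function names, the 20-nucleotide protospacer length, and the example sequences are assumptions made for illustration. The sketch also assumes that the PAM lies immediately 3' of the protospacer on the strand being scanned, as is the convention for the two Cas9 motifs quoted above; other systems place the PAM differently, and only one strand is searched here.

```python
import re

# IUPAC nucleotide codes needed for the two motifs quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",     # purine
    "N": "[ACGT]",   # any nucleotide
}


def pam_to_regex(pam: str) -> str:
    """Translate a PAM written in IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in pam)


def find_protospacers(seq: str, pam: str, length: int = 20):
    """List (start, protospacer, matched PAM) for every site on this strand.

    Assumption: the PAM sits directly 3' of the protospacer.
    A lookahead is used so that overlapping sites are all reported.
    """
    pattern = re.compile(f"(?=([ACGT]{{{length}}})({pam_to_regex(pam)}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical sequences, made up for this example.
invader = "A" * 20 + "TGG" + "CCCC"   # 20-nt target followed by an NGG motif
array_like = "A" * 20 + "GTTTTAG"     # same 20 nt, but followed by no NGG motif

print(find_protospacers(invader, "NGG"))
print(find_protospacers(array_like, "NGG"))
```

      In this toy setting the same 20-nucleotide sequence is reported as a target only when a matching motif follows it, which is the essence of the self versus nonself discrimination described above.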

      Thanks to the greater availability of prokaryotic genome sequences, bioinformatic analysis has revealed that CRISPR systems are extremely abundant in prokaryotes, with roughly 40% of bacterial and over 85% of archaeal species harboring these systems (Makarova et al. 2019). The systems are also highly diverse: the remarkable variety of Cas protein sequences, gene composition, and locus architecture underpins the differences in how each of the three phases of adaptive immunity is performed. CRISPR systems have not only evolved to use different types of nucleic acids (DNA, RNA, or both) as a substrate (Marraffini and Sontheimer 2008; Hale et al. 2009; Kazlauskiene et al. 2016), but can also target different forms of nucleic acid (i.e. single‐ or double‐stranded) (Ma et al. 2015; Strutt et al. 2018) and a wide spectrum of genomic sequences, thanks to diverse PAM requirements (Mojica et al. 2009; Gasiunas et al. 2020).

      Importantly, the diversity of Cas systems across species also translates into how these systems can be used as a tool, where one can choose the most suitable CRISPR system for their target (DNA or RNA), a sequence of choice (by choosing a Cas protein with a pertinent PAM) or application (by choosing a Cas system with the desired outcome). To fully explore this untapped potential of the microbial CRISPR systems, significant efforts to establish a robust classification of CRISPR‐Cas systems have been made over the past decade. As there are no universally present cas genes that could act as an identifying trait, CRISPR classifications have been based on multiple factors, mainly on comparison of genomic loci organization and gene repertoires involved in a particular system. The most up‐to‐date classification is used in this chapter (Makarova et al. 2019).

      While the interference module is the prominent feature in the classification of CRISPR systems, each of the classes and types also differs mechanistically in the manner of crRNA biogenesis and spacer acquisition. The traits of main CRISPR‐Cas