multiple and versatile roles across all biological systems, one of which is mRNA silencing and post-transcriptional regulation of gene expression. Small RNAs are short (∼18–30 nucleotides), noncoding RNA molecules that can regulate gene expression in both the cytoplasm and the nucleus. Several classes of small RNAs have been defined, such as microRNAs (miRNAs), small interfering RNAs (siRNAs), and Piwi-interacting RNAs (piRNAs) [63]. For instance, miRNAs are small noncoding RNA molecules (∼21–25 nucleotides in length) that play an important regulatory role in animals and plants by targeting specific mRNAs for degradation or translational repression [64, 65]. Because the complementarity between a miRNA and its mRNA targets can be imperfect, a single miRNA has the potential to regulate several genes simultaneously. Moreover, miRNAs cross the boundary of a single cell. To add to the complexity of these processes, some miRNAs are secreted into exosomes or microvesicles and may have the ability to move through circulation to other distant cells or tissues [66–68]. Without question, the fine-grained regulation that underlies the complexity of eukaryotes is found in these short RNA molecules.
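The idea that imperfect complementarity lets one miRNA reach several mRNAs can be sketched in a few lines of Python. This is a toy model under stated assumptions: the sequences, the gene names, and the functions `best_site` and `targets` are hypothetical, and the whole miRNA is compared against the transcript with a simple mismatch count. Real target recognition is more selective (pairing of the miRNA "seed" region matters most), so the sketch only illustrates the counting argument.

```python
# Toy illustration: hypothetical sequences, not real miRNAs or transcripts.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Sequence that would pair perfectly with `rna` (read 5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def best_site(mirna: str, mrna: str):
    """Return (mismatches, start) for the mRNA window that pairs best with the miRNA."""
    perfect_site = reverse_complement(mirna)
    best = None
    for start in range(len(mrna) - len(perfect_site) + 1):
        window = mrna[start:start + len(perfect_site)]
        mismatches = sum(a != b for a, b in zip(window, perfect_site))
        if best is None or mismatches < best[0]:
            best = (mismatches, start)
    return best  # None if the mRNA is shorter than the miRNA


def targets(mirna: str, transcripts: dict, max_mismatches: int) -> dict:
    """Transcripts whose best site has at most `max_mismatches` mismatches."""
    hits = {}
    for name, sequence in transcripts.items():
        site = best_site(mirna, sequence)
        if site is not None and site[0] <= max_mismatches:
            hits[name] = site
    return hits


mirna = "UGAGGUAG"  # hypothetical 8-nt stand-in for a miRNA
transcripts = {
    "gene_a": "AAACUACCUCAGGG",  # contains a perfectly complementary site
    "gene_b": "GGCUACGUCAUU",    # contains a site with one mismatch
    "gene_c": "AAAAAAAAAAAA",    # no good site
}

perfect_only = targets(mirna, transcripts, max_mismatches=0)
with_one_mismatch = targets(mirna, transcripts, max_mismatches=1)
print(sorted(perfect_only), sorted(with_one_mismatch))
```

Requiring perfect pairing selects a single transcript here, while tolerating one mismatch selects two, which is the sense in which imperfect pairing widens the set of regulated genes.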
1.5.8 The Transcriptome
The set of all RNA molecules produced by a given organism is known as the “transcriptome.” This includes, of course, the mRNA transcripts but also the other RNA molecules mentioned above (i.e. tRNAs, rRNAs, siRNAs, miRNAs, piRNAs, and so on) as well as other uncharacterized noncoding RNA molecules. When expressed, genes produce mRNAs in different quantities, which are then detectable [69]. Currently, two main techniques are used to capture gene expression: RNA-Seq and microarrays [70, 71]. RNA-Seq (RNA sequencing) allows for full sequencing of all RNA molecules present in a sample, whereas microarrays target known transcripts of different genes through hybridization (complementarity) [72]. Thus, RNA-Seq experiments can estimate the subset of genes expressed in a cell type or in different tissues (several cell types) at any one time by aligning the sequenced RNAs to the reference genome (the DNA of the organism) [73]. However, the transcriptome can be seen as an ideal set, because the complete set of possible RNAs cannot be fully detected. Each state of a cell shows a specific subset of RNAs from the transcriptome. Of the total number of states that a cell can exhibit, only a few can be induced and captured by RNA-Seq. Thus, a small subset of RNAs from the transcriptome may remain undetected. At the tissue level, there are a number of cell types, each with a specific set of active genes. Often, the analysis of the pattern of gene expression is performed at the tissue level, i.e. on several cell types at the same time. From a global perspective, the result is the union of the sets of genes expressed in each of the cell types that make up the tissue. Furthermore, genes that are expressed in several cell types (such as housekeeping genes) may show the highest amounts of mRNA, while genes that are only expressed in certain cell types can show lower amounts of mRNA.
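The tissue-level argument above can be made concrete with a small Python sketch. All gene names and expression values are hypothetical, the units are arbitrary, and the sketch assumes that each cell type contributes equally to the sample, which a real tissue need not do.

```python
# Hypothetical per-cell-type expression (arbitrary mRNA units), for illustration only.
cell_types = {
    "cell_type_1": {"housekeeping_gene": 50, "gene_x": 20},
    "cell_type_2": {"housekeeping_gene": 45, "gene_y": 10},
    "cell_type_3": {"housekeeping_gene": 55},
}


def tissue_profile(cell_types: dict) -> dict:
    """Sum mRNA amounts per gene over all cell types, as a bulk tissue sample would."""
    totals = {}
    for expression in cell_types.values():
        for gene, amount in expression.items():
            totals[gene] = totals.get(gene, 0) + amount
    return totals


profile = tissue_profile(cell_types)

# The genes seen at the tissue level are the union of the per-cell-type gene sets.
detected_genes = set(profile)

# Genes shared by several cell types accumulate the largest totals.
most_abundant = max(profile, key=profile.get)
print(sorted(detected_genes), most_abundant)
```

In this toy sample the gene expressed in every cell type dominates the pooled signal, while the cell-type-specific genes appear with lower totals even though they are fully active in their own cell type.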
1.5.9 Gene Networks and Information Processing
The mRNA and/or the protein products encoded by one gene often regulate the expression of other genes. In multicellular eukaryotes, the set of genes that are expressed in a specific cell type forms an “open” gene network. Each gene network is a self-orchestrated feedback loop constantly adapting to different inputs from the environment. The dynamics of a gene network may be deduced in practice from the gene expression levels. The RNA-Seq technique reveals the set of expressed genes and their expression levels (amount of mRNA) at the time of cell/tissue sampling. Repeated sampling at different time intervals can piece together the functional relationships between the genes of the set. Direct or indirect activation of a gene promoter by the products of other genes (mRNA or proteins) occurs with a relative delay that largely depends on the frequency with which the gene product is synthesized. The frequency of synthesis affects the time the gene product (mRNA or proteins) needs to accumulate in the cell, as well as its stochastic diffusion toward other promoters and macromolecules with which it can interact. Note that the environment can be represented by a number of factors: the current set of molecules inside the cell, the signal molecules synthesized by other cells (other gene networks), or the amount of nutrients, pressure, temperature, and so on.
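The link between synthesis frequency and activation delay can be sketched with a minimal discrete-time model. This is an illustration under assumptions that are not in the text: the product of a gene A is made at a constant rate per time step, is removed in proportion to its amount, and activates a gene B once it reaches a fixed threshold. The function name, the parameter values, and the threshold rule are all hypothetical, and stochastic diffusion is ignored.

```python
def activation_delay(synthesis_rate, decay=0.1, threshold=5.0, max_steps=1000):
    """Time steps until the product of gene A reaches the threshold that
    activates gene B, or None if the threshold is never reached."""
    amount = 0.0
    for step in range(1, max_steps + 1):
        # Each step: new product is synthesized, a fraction of the pool is degraded.
        amount += synthesis_rate - decay * amount
        if amount >= threshold:
            return step
    return None


slow = activation_delay(synthesis_rate=1.0)
fast = activation_delay(synthesis_rate=2.0)
too_weak = activation_delay(synthesis_rate=0.4)  # plateau (0.4 / 0.1 = 4) stays below threshold
print(slow, fast, too_weak)
```

In this sketch, doubling the synthesis rate shortens the delay before the downstream gene responds, and a rate that is too low never activates it, because the product plateaus below the threshold.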
1.5.10 Eukaryotic vs. Prokaryotic Regulation
In prokaryotes, gene expression is primarily regulated at the level of transcription. Moreover, transcription and translation occur almost simultaneously in the cell cytoplasm. Eukaryotic regulation of gene expression is dynamically orchestrated at several levels: epigenetics (chromatin and transcription factors, TFs), transcription, post-transcription, translation, and post-translation (further processing of the polypeptide from its primary structure to more complex secondary and tertiary structures, and so on). Eukaryotic gene expression occurs with a delay when compared to prokaryotes, as transcription takes place within the nucleus and translation occurs outside the nucleus, in the cytoplasm.
1.5.11 What Is Life?
The information in DNA molecules supports a continuous biochemical feedback inside the cell, which self-regulates according to external and internal stimuli (e.g. nutrients, signal molecules, pressure, and electromagnetic radiation). Through energy consumption, this continuous process is maintained in a permanent imbalance above the “inanimate” background. From our reference system, this dynamic self-regulating biochemical feedback is considered life.
1.6 Known Species
How many species are there, really? Unfortunately, life is not as diverse as previously believed [74, 75]. For more than 250 years, our species has cataloged all the other land and water species at its disposal, and continues to do so. Based on this census, the tree of life contains a total of 1.43 million known species, of which almost 1.42 million are eukaryotes and 12k are prokaryotes (Table 1.1 and Figure 1.1) [74]. Most of the time, a real census ruins the “feng shui” of predictions. In total, even the most enthusiastic predictions forecast between 8 million and 11 million species in existence [74, 76, 77].
However, under the heavy umbrella of uncertainty, predictions and census rarely match when it comes to the total number of species on Earth. Among the cataloged species for the tree of life, animal species constitute 78% and plant species represent 15%. Out of a total of 1.4 million species, about 1.2 million live on land and about 0.2 million live in the aquatic environment (Table 1.1).
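The shares quoted above can be checked against the rounded totals of Table 1.1 with a short Python sketch. The Archaea entry is an assumption: its row in the table is incomplete, so 1k is taken as the total, which is consistent with the 12k prokaryotes stated in the text. Because the inputs are rounded to the nearest thousand, the results only approximate the published percentages.

```python
# Rounded totals from Table 1.1, in thousands of species.
# Assumption: Archaea total taken as 1k (the table row is incomplete).
totals_k = {
    "Animals": 1125,
    "Fungi": 44,
    "Plants": 224,
    "Protists": 34,
    "Bacteria": 11,
    "Archaea": 1,
}

all_species_k = sum(totals_k.values())


def share(kingdom: str) -> float:
    """Fraction of all cataloged species that belong to `kingdom`."""
    return totals_k[kingdom] / all_species_k


eukaryotes_k = sum(totals_k[k] for k in ("Animals", "Fungi", "Plants", "Protists"))
prokaryotes_k = totals_k["Bacteria"] + totals_k["Archaea"]

print(f"total: {all_species_k}k, eukaryotes: {eukaryotes_k}k, prokaryotes: {prokaryotes_k}k")
print(f"animals: {share('Animals'):.1%}, plants: {share('Plants'):.1%}")
```

With these rounded inputs the animal share comes out near 78% and the plant share between 15% and 16%, in line with the figures in the text.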
Table 1.1 The total number of known species.
Source: Refs. [74, 283].
Kingdoms | Land | Water | Total |
---|---|---|---|
Eukaryotes | | | |
Animals | 953k | 171k | 1125k |
Fungi | 43k | 1k | 44k |
Plants | 216k | 9k | 224k |
Protists | 21k | 13k | 34k |
Prokaryotes | | | |
Bacteria | 10k | 1k | 11k |
Archaea | 1k |