Paul A. Gagniuc

Algorithms in Bioinformatics



Saccharomyces cerevisiae (the budding yeast) is an ideal example of this hypothesis [158]. Yeast colonies growing on solid media show specific structural patterns in their three-dimensional multicellular organization. These structural patterns are specific to each yeast strain [159]. Moreover, variations in the multicellular organization appear to depend on environmental parameters, such as the position of surrounding cells, nutrient gradients, temperature, and so on [160].

      1.13.4 Chimerism and Mosaicism

      Biological literature is probably the most sophisticated among all sciences and can be particularly overwhelming. An introduction was given to some important concepts that provide an overview of living organisms, such as the emergence of life, classification, the number of species, the origins of eukaryotic cells, the endosymbiosis theory, organelles, reductive evolution, the importance of HGT, and the main hypotheses regarding the origin of eukaryotic multicellularity. Among the biological concepts described here, some have wider implications. Examples of genome-less organelles, such as hydrogenosomes, or processes such as HGT, call into question life as we understand it. Endosymbionts best illustrate the significance of the environment and also explain the distribution of life in a blurry, nonunitary context. In other words, endosymbiosis widens the threshold of life and shows how difficult it is to draw a border between how much life resides inside or outside the cell. Moreover, HGT appears to connect all the species on Earth to a greater or lesser extent. Much evidence shows that some of these ancient processes (e.g. catalytic RNAs) are likely adding or subtracting innovative mechanisms for continuous adaptation among different species (if not all).

      2.1 Introduction

      An insight into the context of biological information is of utmost importance for different approaches in bioinformatics. The first part of the chapter discusses the units of measurement and explains the meaning of some notations used here. A few interesting unit conversions, with accompanying algorithms, are shown alongside this subject. Next, the eukaryotic and prokaryotic organisms with the largest and smallest genomes are presented in detail. Moreover, different computations performed for this chapter show the average genome size across the major kingdoms of life, including the average genome size of different organelles, plasmids, and viruses. Toward the end of the chapter, a comparative analysis is made between the average number of genes and the average number of proteins across the main kingdoms of life. This analysis highlights the frequency of a process called alternative splicing, which allows certain eukaryotic genes to encode several types of proteins.
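      As a rough illustration of the kind of unit conversion mentioned above, the following minimal Python sketch converts sequence lengths between bp, kbp, Mbp, and Gbp. It assumes decimal prefixes (1 kbp = 1000 bp); the names BP_PER_UNIT, convert_length, and format_length are hypothetical and chosen only for this example, and the input values are illustrative rather than taken from any dataset.

```python
# Decimal prefixes for sequence-length units (assumed: 1 kbp = 1000 bp).
BP_PER_UNIT = {"bp": 1, "kbp": 10**3, "Mbp": 10**6, "Gbp": 10**9}


def convert_length(value, from_unit, to_unit):
    """Convert a sequence length between bp, kbp, Mbp, and Gbp."""
    base_pairs = value * BP_PER_UNIT[from_unit]  # normalize to base pairs
    return base_pairs / BP_PER_UNIT[to_unit]     # rescale to the target unit


def format_length(base_pairs):
    """Express a length in bp using the largest unit that keeps the value >= 1."""
    for unit in ("Gbp", "Mbp", "kbp"):
        if base_pairs >= BP_PER_UNIT[unit]:
            return f"{base_pairs / BP_PER_UNIT[unit]:.2f} {unit}"
    return f"{base_pairs} bp"


if __name__ == "__main__":
    # Illustrative values only.
    print(convert_length(5, "Mbp", "kbp"))
    print(format_length(1_500_000))
```

      Normalizing every value to base pairs first keeps the conversion table small: each unit needs only one factor, rather than one factor for every pair of units.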

      There is no direct correlation between the genome size of a species and the complexity of its phenotype. Nevertheless, intellectual curiosity regarding the size of genomes remains. Determining genome size from DNA sequencing data is one of the most accurate methods to date. To illustrate the lack of correlation between genome size and phenotype, the upper extremes can be considered here. As one would intuitively expect, eukaryotes show the largest genomes. In animals, the amphibian Ambystoma mexicanum (the Mexican axolotl) shows the largest sequenced genome observed in nature to date. A. mexicanum has a genome size of 32 396 Mbp (approximately 32 Gbp), while the animal itself can reach a body length of up to 30 cm [166]. In plants, the record is held by Pinus lambertiana (27 603 Mbp), followed by Sequoia sempervirens (26 537 Mbp). P. lambertiana is the tallest and most massive pine species [167, 168]. The species S. sempervirens includes the tallest living trees on Earth (115.5 m, or 379 ft, in height) [169]. Among the prokaryotes, Minicystis rosea and Sorangium cellulosum So0157-2 show the largest genomes. The bacterial genome of M. rosea contains 16 Mbp of DNA (GC%: 69.1), the maximum genome size found in prokaryotes [170]. Second to this species is the bacterial genome of S. cellulosum So0157-2, with 14.78 Mbp of DNA (GC%: 72.1) [171]. As discussed in the previous chapter, endosymbiosis challenges the notion of the smallest genome necessary for life. The smallest prokaryotic genomes were found in different