Paul A. Gagniuc

Algorithms in Bioinformatics



deltocephalinicola with a genome of 112 kbp (0.11 Mbp) [172, 173]. The eukaryotes with the smallest nuclear genome necessary for life are found in the kingdom of fungi. The spore-forming unicellular parasite Encephalitozoon intestinalis shows a genome size of ∼2.3 Mbp and a total of 1.8k protein-coding genes [174]. Nonetheless, the smallest free-living eukaryote is Ostreococcus tauri, a marine green alga with a diameter of about 0.8 μm and a genome size of 12.6 Mbp (8.2k protein-coding genes) [175].

      2.3.1 Alternative Methods

      2.3.2 The Weaving of Scales

      To get a sense of genome size on a scale closer to our own frame of reference, mega base pairs can be converted into physical lengths. The linear length of a double-stranded DNA (dsDNA) molecule is obtained by multiplying the average distance between adjacent base pairs (∼3.4 angstrom = 0.34 nm [179, 180]; 1 angstrom = 0.1 nm) by the total number of base pairs in the genome. Here, genome sizes are expressed in mega base pairs (Mbp). Since 1 Mbp equals one million base pairs, the genome size is first multiplied by one million (1 × 10^6) and then by the average distance between base pairs (0.34 nm). One meter equals 1 000 000 000 nanometers (1 × 10^9 nm), so the result in nanometers is divided by 1 × 10^9 to convert it to meters.

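\[
L_{\text{m}} = \frac{G_{\text{Mbp}} \times 10^{6} \times 0.34\ \text{nm}}{10^{9}\ \text{nm/m}}
\]

      where G_Mbp is the genome size in mega base pairs and L_m is the linear length of the DNA in meters.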

      Depending on the organism, cells of different tissues can be characterized based on the number of sets of chromosomes present: monoploid (one set of chromosomes), diploid (two sets), triploid (three sets), tetraploid (four sets), pentaploid (five sets), and so on. For instance, the human genome contains 3.1 Gbp (3100 Mbp). Thus, in a human haploid (or monoploid) cell (e.g. a single set of chromosomes found in a gamete), the unfolded length of a single set of chromosomes, arranged linearly one after the other, would show an approximate length of:

\[
L_{\text{m}} = \frac{3100 \times 10^{6} \times 0.34\ \text{nm}}{10^{9}\ \text{nm/m}} = 1.054\ \text{m}
\]
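      The ploidy levels listed above can be folded into the same conversion by multiplying the length of one chromosome set by the number of sets. The following is a minimal sketch, not the author's listing: the function names genomeLengthMeters and nuclearDnaMeters, the named constants, and the ploidy lookup table are illustrative assumptions, and console.log is used so the snippet also runs outside a browser.

```javascript
// Sketch only: all names below are illustrative assumptions.
const NM_PER_BP = 0.34;   // average distance between base pairs, in nm (from the text)
const BP_PER_MBP = 1e6;   // base pairs in one mega base pair
const NM_PER_M = 1e9;     // nanometers in one meter

// Linear length, in meters, of one chromosome set of a genome given in Mbp.
function genomeLengthMeters(mbp) {
  return (mbp * BP_PER_MBP * NM_PER_BP) / NM_PER_M;
}

// Total DNA length, in meters, for a cell carrying `sets` chromosome sets.
function nuclearDnaMeters(mbp, sets) {
  if (!Number.isInteger(sets) || sets < 1) {
    throw new RangeError('sets must be a positive integer');
  }
  return sets * genomeLengthMeters(mbp);
}

// Ploidy names as used in the text, mapped to the number of chromosome sets.
const ploidy = { monoploid: 1, diploid: 2, triploid: 3, tetraploid: 4, pentaploid: 5 };

// Human genome (3100 Mbp) at each ploidy level.
for (const [name, sets] of Object.entries(ploidy)) {
  console.log(name + ': ' + nuclearDnaMeters(3100, sets).toFixed(3) + ' m');
}
```

      For the human genome, the monoploid and diploid lines should agree with the 1.054 m and 2.108 m values discussed in this section. The higher ploidy levels are shown only to illustrate the scaling.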

      Additional algorithm 2.1 Note that the source code is in context and works with copy/paste.

      <script>
      // Convert a genome size in Mb to a linear DNA length in meters:
      // Mb -> base pairs (x 1 000 000) -> nanometers (x 0.34) -> meters (/ 1 000 000 000).
      function f(Mb){return (0.34 * 1000000 * Mb)/1000000000;}

      document.write('Homo sapiens (3100 Mb): <br>');
      document.write('DNA in a haploid cell nucleus: ');
      document.write(f(3100) + ' meters<br>');
      // A somatic (diploid) cell carries two sets of chromosomes.
      document.write('DNA in a somatic cell nucleus: ');
      document.write((2 * f(3100)) + ' meters<br>');
      </script>

      Output:
      Homo sapiens (3100 Mb):
      DNA in a haploid cell nucleus: 1.054 meters
      DNA in a somatic cell nucleus: 2.108 meters

      Additional algorithm 2.2 Note that the source code is in context and works with copy/paste.

      <script>
      // DNA to meters
      // Each record has the form 'species|size' and ends with the 'Mb' delimiter.
      var a = 'Ambystoma mexicanum|32396Mb' +
              'Pinus lambertiana|27603Mb' +
              'Sequoia sempervirens|26537Mb' +
              'Minicystis rosea|16Mb' +
              'Sorangium cellulosum So0157-2|14.78Mb' +
              'Escherichia coli|4.9Mb' +
              'Encephalitozoon intestinalis|2.3Mb' +
              'Ostreococcus tauri|12.6Mb' +
              'Homo sapiens|3100Mb';

      // Splitting on 'Mb' leaves an empty last element, hence t.length-1 below.
      var t = a.split('Mb');

      for (var u=0; u<t.length-1; u++) {
          var r = t[u].split('|');   // r[0] = species name, r[1] = genome size in Mb
          document.write(r[0] + ' (' + r[1] + ' Mb) = ');
          document.write(f(r[1]) + ' meters<br>');
      }

      // Convert a genome size in Mb to a linear DNA length in meters.
      function f(Mb){return (0.34 * 1000000 * Mb)/1000000000;}
      </script>

      Output:
      Ambystoma mexicanum (32396 Mb) = 11.01464 meters
      Pinus lambertiana (27603 Mb) = 9.38502 meters
      Sequoia sempervirens (26537 Mb) = 9.02258 meters
      Minicystis rosea (16 Mb) = 0.00544 meters
      Sorangium cellulosum So0157-2 (14.78 Mb) = 0.0050252 meters
      Escherichia coli (4.9 Mb) = 0.0016660000000000002 meters
      Encephalitozoon intestinalis (2.3 Mb) = 0.0007819999999999999 meters
      Ostreococcus tauri (12.6 Mb) = 0.004284 meters
      Homo sapiens (3100 Mb) = 1.054 meters
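      Some values in the output above carry floating-point residue (for example 0.0016660000000000002 meters), and the smaller genomes are easier to read in millimeters or micrometers. The sketch below is one possible way to tidy the display. It is not part of the original listing: the helper name formatLength, the unit thresholds, and the choice of four significant digits are assumptions made for illustration.

```javascript
// Sketch only: formatLength and its defaults are illustrative assumptions.

// Same conversion as f(Mb) in the listing above: Mb -> meters.
function genomeLengthMeters(mbp) {
  return (0.34 * 1e6 * mbp) / 1e9;
}

// Express a length in the largest unit (m, mm, µm) in which it is at least 1,
// rounded to `digits` significant digits to hide floating-point residue.
function formatLength(meters, digits = 4) {
  const units = [
    { symbol: 'm', factor: 1 },
    { symbol: 'mm', factor: 1e-3 },
    { symbol: 'µm', factor: 1e-6 },
  ];
  const unit = units.find(u => meters >= u.factor) || units[units.length - 1];
  const rounded = Number((meters / unit.factor).toPrecision(digits));
  return rounded + ' ' + unit.symbol;
}

// Genome sizes (Mb) taken from the listing above.
const genomes = [
  ['Homo sapiens', 3100],
  ['Escherichia coli', 4.9],
  ['Encephalitozoon intestinalis', 2.3],
];

for (const [name, mbp] of genomes) {
  console.log(name + ' (' + mbp + ' Mb) = ' + formatLength(genomeLengthMeters(mbp)));
}
```

      Rounding is applied only at display time, so the underlying values stay available at full precision for any later computation.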

      2.3.3 Computations on the Average Genome Size