New functional genes usually (though not always) arise through the duplication of existing genes in the course of evolution. New genes can also be generated by combining domains or partial gene sequences. Horizontal gene transfer (as happened when bacteria became mitochondria) also helped to enlarge eukaryotic genomes. In contrast, pseudogenes, which are nontranslatable copies of genes, show frameshifts, nonsense mutations, deletions, and insertions (see Section 4.1.4). Pseudogenes no longer have a function today. They can be divided into two groups: the first arose from gene duplication and the second from retroposons. In the second case, the genes were transcribed and processed and, following reverse transcription into DNA, were inserted at a new location in the genome. These retropseudogenes usually have no introns but frequently carry poly(A) tails, and, unlike pseudogenes that arose by duplication, they are not located in the vicinity of the gene from which they arose. Surprisingly, nature can afford to reproduce this junk DNA in every generation, even though replication is an energy‐consuming process. Perhaps these DNA sections, which appear useless today, will become functional in a later evolutionary phase as molecular replacement parts.
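Two of the pseudogene hallmarks named above, nonsense mutations and frameshifts, can be checked mechanically on a candidate coding sequence. The following is a minimal Python sketch, assuming the sequence begins at the start codon of the parental reading frame. The function names and example sequences are hypothetical and serve only as illustration; real pseudogene annotation relies on alignment to the parent gene rather than on such a simple scan.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_premature_stop(cds):
    """Return the 0-based codon index of the first in-frame stop codon
    that occurs before the final codon, or None if there is none."""
    cds = cds.upper()
    n_codons = len(cds) // 3
    # The final codon is skipped because an intact gene ends with a stop.
    for i in range(n_codons - 1):
        if cds[3 * i:3 * i + 3] in STOP_CODONS:
            return i
    return None


def length_suggests_frameshift(cds):
    """True if the length is not a multiple of three, as expected after
    an insertion or deletion of one or two nucleotides."""
    return len(cds) % 3 != 0


# Hypothetical toy sequences (not taken from any real gene).
intact = "ATGGCTGAATGGTAA"      # ATG GCT GAA TGG TAA
nonsense = "ATGGCTTAATGGTAA"    # GAA -> TAA creates a premature stop
deletion = "ATGGCGAATGGTAA"     # one nucleotide deleted

print(first_premature_stop(intact))
print(first_premature_stop(nonsense))
print(length_suggests_frameshift(deletion))
```

A length check alone cannot detect compensating insertions and deletions, so it should be read as a first hint only.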
When the duplicated DNA sequence lies beside the original gene, it is termed a tandem repeat. Tandem repeats are the starting point for further DNA amplification, induced by uneven crossing‐over. Repetitive DNA is quantitatively important and can be divided into middle repetitive DNA (transposons and retroelements) and highly repetitive DNA. The latter class includes short nucleotide sequences that are present in great numbers in the chromosomes, arranged in tandem. Highly repetitive DNA is further divided into telomere, satellite, minisatellite, and microsatellite DNA.
When eukaryotic DNA is separated by cesium chloride gradient centrifugation, two bands are often observed, the smaller of which contains satellite DNA. This satellite DNA is especially rich in repetitive sequences and is preferentially localized in the region of the centromeres. In insects and other arthropods, satellite DNA is very homogeneous, meaning that its sequence elements are highly conserved. In vertebrates, the repeated sequence units are significantly longer (over 200 bp) and more variable, and occur in up to 1000 repetitions; subelements such as GA5TGA can often be found in these units. Because of uneven crossing‐over, satellite DNA is about 10 times more variable than genes that have only a low copy number. The division and organization of the repetitive DNA elements in the centromere region are chromosome and type specific. It is assumed that the repetitive DNA in the centromere region is responsible for the recognition of homologous chromosomes and for their side‐by‐side alignment during meiosis.
In the actual satellite DNA of both plants and animals, elements are found that are repeated 5–50 times, each being 15–100 bp long. The sequence elements can be traced back to an original sequence that was varied through point mutations. This repetitive DNA, with arrays of about 500–5000 nucleotides in length, is significantly shorter than satellite DNA and is termed minisatellite DNA or variable number tandem repeats (VNTRs). It exhibits large length variability at every locus and, as a result of uneven crossing‐over (which changes the number of repeats and thus the length of the array), a very high mutation rate that can reach 5% per gamete. Minisatellite DNA is therefore regarded as a hot spot of meiotic recombination. It is especially suitable for the identification of individuals and has also been used to establish paternity and to assess homozygosity in a population. Many VNTR loci each have dozens of alleles, which are codominantly inherited. This characteristic was used in DNA fingerprinting. The probability that two unrelated individuals have the same DNA fingerprint is less than 1 in 10 million. At present, DNA fingerprinting is based on short tandem repeat (STR) and single nucleotide polymorphism (SNP) analyses.
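The very low match probability quoted above follows from multiplying genotype frequencies across several highly polymorphic, codominantly inherited loci. The following Python sketch illustrates the arithmetic under two assumptions that are not stated in the text: Hardy–Weinberg proportions at each locus and independence between loci. The allele frequencies are invented for illustration and do not describe any real population.

```python
from math import prod


def genotype_frequency(p, q=None):
    """Hardy-Weinberg genotype frequency (assumed model):
    p*p for a homozygote, 2*p*q for a heterozygote."""
    if q is None:
        return p * p
    return 2 * p * q


def random_match_probability(profile):
    """Multiply genotype frequencies over all loci, assuming the loci
    are inherited independently of one another."""
    return prod(genotype_frequency(*alleles) for alleles in profile)


# Hypothetical four-locus profile: each tuple holds the population
# frequencies of the allele(s) carried at one locus
# (one value = homozygote, two values = heterozygote).
profile = [(0.05, 0.10), (0.08, 0.02), (0.10,), (0.04, 0.06)]

rmp = random_match_probability(profile)
print(f"random match probability: {rmp:.3e}")
```

With these invented frequencies, four loci already give a value on the order of 10^-9, which shows why loci with many rare alleles are so informative.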
In addition, even shorter repeats occur in animal and plant genomes. These consist of a basic unit of two (sometimes as many as five) nucleotides, such as (GC)n or (CA)n, which is repeated up to 100 times. Of these elements, termed microsatellites or STRs (short tandem repeats), about 30 000 loci are found in humans; they are of great importance for the identification of tissues and individuals, paternity and population studies, and genome mapping. STR analysis is the method of choice for investigating sexual offenses or murders in forensic medicine and criminal investigation. The alleles can be amplified by the polymerase chain reaction (PCR) (see Chapter 13). Microsatellite PCR is currently the method of choice for many forensic, biotechnological, and biological investigations because it requires only the smallest amounts of DNA. The variability of microsatellite DNA is strongly increased by uneven crossing‐over during meiosis and by slippage of the DNA polymerase, so that the short sequence elements can be mutated, duplicated, and deleted. As an alternative to STR analyses, SNP analyses have become available for a number of organisms; these often provide a more detailed picture of the genetic background.
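The definition of a microsatellite given above (a unit of two to five nucleotides repeated many times in tandem) translates directly into a pattern search. The following is a minimal Python sketch using a regular expression with a backreference. The function name, the default threshold of five repeats, and the test sequence are assumptions chosen for illustration and are not taken from any standard tool.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=5, min_repeats=5):
    """Return (start, unit, repeat_count) for each tandem array found.

    The unit is matched lazily so that the shortest repeating unit is
    preferred, e.g. (CA)8 is reported rather than (CACA)4.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        if len(set(unit)) == 1:
            continue  # skip runs of a single nucleotide such as AAAAAAAAAA
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


# Hypothetical test sequence: a (CA)8 array flanked by unique sequence.
example = "GGT" + "CA" * 8 + "TTG"
print(find_microsatellites(example))
```

Only perfect repeats are detected here; real STR loci often contain interruptions, and the allele at a locus is in practice determined from the length of the PCR product.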
Animal and plant genomes also contain DNA sections of about 500 bases, the so‐called scattered or short interspersed elements (SINEs), as well as long interspersed elements (LINEs) of 1000 to 5000 nucleotides. Both appear in high copy numbers, although not as tandem repeats (Figure 4.2). The DNA elements Alu (which contains a recognition site for the restriction enzyme AluI), Kpn, and poly(CA) are also counted among the SINEs. These elements make up about 20% of the human genome. It is presumed that these elements, which are also called mobile genetic elements or retrotransposons, arise through reverse transcription. From an evolutionary point of view, transposons (with long terminal repeats [LTRs]