key: cord-0036538-7oej1196
authors: Saitou, Naruya
title: Neutral Evolution
date: 2013-08-22
journal: Introduction to Evolutionary Genomics
DOI: 10.1007/978-1-4471-5304-7_4
sha: 9aff611354ea9bf1a869229dd463c3998a75e554
doc_id: 36538
cord_uid: 7oej1196

Neutral evolution is the default process of genome change. This is because our world is finite, and randomness is important when we consider the history of a finite world. The random nature of DNA propagation is discussed using the branching process, coalescent process, Markov process, and diffusion process. Expected evolutionary patterns under neutrality are then discussed in terms of fixation probability, rate of evolution, and amount of DNA variation kept in a population. We then discuss various features of neutral evolution, starting from evolutionary rates, synonymous and nonsynonymous substitutions, junk DNA, and pseudogenes.

It is now established that the majority of mutations fixed during evolution are selectively neutral, as amply demonstrated by Kimura (1983; [1]) and by Nei (1987; [2]). Reports of many genome sequencing projects in the twenty-first century routinely mention neutral evolution, e.g., the mouse genome paper in 2002 ([3]) and the chicken genome paper in 2004 ([4]). We thus discuss neutral evolution as one of the basic processes of genome evolution in this chapter.

Neutral evolution is characterized by the egalitarian nature of the propagation of selectively neutral mutants. For example, let us consider a bacterial plaque that is clonally formed. All cells in one plaque are homogeneous, that is, they have identical genome sequences, if there are no mutations during the formation of that plaque. Because the genome sequences are identical, there will be no change in the genetic structure of this plaque. Let us assume that three cells at time 0 in Fig. 3.2 are in this clonal plaque. Their descendant cells at time 4 also have the same genome sequences, though the number of offspring cells at that time varies from 0 to 4.
This variation is attributed to nongenetic factors, such as heterogeneous distribution of nutrients. However, the most significant and fundamental factor is randomness, as we will see in Sect. 4.

Mutation is the ultimate source of diversity of organisms. If a mutation occurring in some gene modifies gene function, there is a possibility of heterogeneity in the number of offspring. This is the start of natural selection, which will be discussed in Chap. 5. However, some mutations may not change gene function, and although the mutants differ somewhat from the parental DNA sequences, mutants and parental (wild) types are equal in terms of offspring propagation. Here we meet the egalitarian characteristic of selectively neutral mutants. If all members of an evolutionary unit, such as DNA molecules, cells, individuals, or populations, are equal, the frequency change of these types is dominated by random events. It is therefore logical that randomness is the most important factor in neutral evolution.

Randomness also comes in when abiotic phenomena are involved in organismal evolution. Earthquakes, volcanic eruptions, continental drift, meteorite hits, and many other geological and astronomical events are not the outcome of biotic evolution, and they can be considered stochastic from the organismal point of view.

Before the proposal of the neutral theory of evolution in 1968 by Kimura ([5]), randomness was not considered a basic process of evolution. Systematic pressure, particularly natural selection, was believed to play the major role in evolution. This view is applicable if the population size, or the number of individuals in one population, is effectively infinite. However, the earth is finite, and the number of individuals is always finite. Even this whole universe is finite. This finiteness is the basis of the random nature of neutral evolution, as we will see in later sections of this chapter.
Genetic information resides in nucleotide sequences, and one gene is often treated as a unit of evolution in many molecular evolutionary studies. A cell is the basic building block of all organisms except viruses. It is thus natural to consider the cell as a unit of evolution. One cell is equivalent to one individual in single-cell organisms. In multicellular organisms, by definition, one individual is composed of many cells, and a single cell is no longer a unit of evolution. However, if we consider only germ-line cells and ignore somatic cells, we can still discuss cell lineages as the mainstream of multicellular organisms, as in the case of single-cell organisms. Alternatively, clonal cells of one single-cell organism can be considered to be one individual. Cellular slime mold cells may form a single body with many cells, or each cell may stay independent, depending on environmental conditions [6]. We should therefore be careful in defining a cell or an individual.

Organisms usually live together, and multiple individuals form one "population." We humans reproduce sexually, and it seems obvious for us to consider one mating group. In classic population genetics theory, this reproduction unit is called a Mendelian population, after Gregor Johann Mendel, the father of genetics. From the individual point of view, the largest Mendelian population is its species. Asexually reproducing organisms do not necessarily form a population. Multiple individuals observed in proximity, which are often recognized as one population, may be just an outcome of the past life history of the organism, and each individual may reproduce independently. Gene exchanges also occur in asexually reproducing organisms, including bacteria. Therefore, by extending the species concept, bacterial cells with a close phylogenetic relationship are called a species. A population or species is also defined for viruses, where each virus particle is regarded as one "individual."
However, we have to be careful in defining individuals and populations. One tree, such as a cherry tree, is usually considered to be one individual, for it starts from one seed. Unlike most animals, trees and many other plant species can use part of their body to start a new "individual." This asexual reproduction prompted the plant population biologist John L. Harper to create the terms genet (genetic individual) and ramet (physiological individual) [7]. We should thus be careful about the number of "individuals," especially for asexually reproducing organisms.

We discuss the four major processes used to describe mathematically the random characteristics of DNA transmission. The first two, the branching process and the coalescent process, consider the genealogical relationship of gene copies, while the latter two, the Markov process and the diffusion process, treat temporal changes of allele frequencies.

For organisms to evolve and diverge, we need changes, or mutations. The supply of mutations to the continuous flow of self-replication of genetic materials (DNA or RNA) is fundamental for organismal evolution. This process is most faithfully described by the phylogenetic relationship of genes. Because every organism is a product of eons of evolution, we are unable to grasp the full characteristics of living beings without understanding the evolutionary history of genes and organisms. It is thus clear that reconstruction of the phylogeny of genes is essential not only for the study of evolution but also for biology in general. In other words, gene genealogy is the basic descriptor of evolution.

It should be emphasized that the genealogical relationship of genes is independent of the mutation process when mutations are selectively neutral. A gene genealogy is the direct product of DNA replication and always exists, while mutations may or may not happen within a certain time period in some specific DNA region.
Therefore, even if many nucleotide sequences happen to be identical, there must be a genealogical relationship among those sequences. However, it is impossible to reconstruct the genealogical relationship without mutational events. In this respect, the search for mutational events in genes and their products is also important for reconstructing phylogenetic trees. Advances in molecular biotechnology have made it possible to routinely produce gene genealogies from many nucleotide sequences.

Figure 4.1 shows a schematic gene genealogy for 10 genes. There are two types of genes that have a small difference in their nucleotide sequences, depicted by open and full circles. Both types are located at the same position in one particular chromosome of this organism. This position is called a "locus" (the plural form is "loci"), after a Latin word meaning place, and one type of nucleotide sequence is called an "allele," from a Greek word αλλο meaning different. The open circle allele, called allele A, is the ancestral type, and the full circle allele, called allele M, emerged by a mutation shown as a star mark. The numbers of gene copies are 8 and 2 for alleles A and M. We thus define the allele frequencies of these two alleles as 0.8 (=8/10) and 0.2 (=2/10). Allele frequency is sometimes called gene frequency. It should be noted that these frequencies are exact values if there are only 10 genes in the population in question. If these 10 genes were sampled from a population with many more genes, the two values are sample allele frequencies.

Because all these 10 genes are homologous at the same locus, they have a common ancestral gene. Put differently, only descendants of that common ancestral gene are considered in the gene genealogy of Fig. 4.1. There are, however, many genes which did not contribute to the 10 genes at the present time. If we consider these genes that once existed, the population history may look like Fig.
4. In this figure, the gene genealogy starting from the full circle gene at generation 1 is embedded among other genes that coexisted at each generation but became extinct. If we consider the whole population, it is clear that the allele frequency changes temporally, and many genes shown as open circles did not contribute to the current generation. How can this allele frequency change occur? Natural selection does influence this change (see Chap. 5), but the more fundamental process is random genetic drift. This occurs because a finite number of genes are more or less randomly sampled from the parental generation to produce the offspring generation. This simple stochastic process is the source of random fluctuation of allele frequencies through generations.

Random genetic drift can be described as follows. Let us focus on one particular diploid population with N[t] individuals at generation t. We consider a certain autosomal locus A, and the total number of genes at that locus at generation t is 2N[t]. There are many alleles at locus A, but let us consider one particular allele A_i with n_i gene copies. By definition, the allele frequency p_i of allele A_i is n_i/2N[t]. When one sperm or egg is formed via meiosis, one gene copy from locus A is included in that gamete. If males and females are assumed to have more or less the same allele frequency, the probability that the gamete carries allele A_i is p_i. This procedure is a Bernoulli trial, and the offspring generation at time t+1 will be formed with 2N[t+1] such trials. The probability Prob[k] that allele A_i has k gene copies at generation t+1 is therefore given by the binomial distribution

$$ \mathrm{Prob}[k] = {}_{2N[t+1]}C_{k} \, p_i^{k} (1 - p_i)^{2N[t+1] - k} $$

where xCy is the number of possible combinations to choose y out of x. If we repeat this binomial sampling for many generations, random genetic drift will occur. When the number of individuals in the population, or population size, is quite large, this fluctuation is small because of the "law of large numbers" in probability theory, yet the effect of random genetic drift never disappears under finite population size.
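The repeated binomial sampling described above can be illustrated with a short simulation. The following Python code is only a minimal sketch under stated assumptions: the population size is held constant (so 2N[t] is the same in every generation), there are no mutation and no selection, and the function names `next_generation_count` and `simulate_drift` are hypothetical names chosen for illustration, not part of the original text.

```python
import random


def next_generation_count(copies, total_genes, rng):
    """Draw the allele count of the next generation.

    Each of the total_genes gene copies of the offspring generation is an
    independent Bernoulli trial with success probability equal to the
    current allele frequency, so the count is binomially distributed.
    """
    p = copies / total_genes
    return sum(1 for _ in range(total_genes) if rng.random() < p)


def simulate_drift(initial_copies, total_genes, generations, seed=0):
    """Return the allele frequency trajectory under pure random drift.

    Assumption (for illustration only): constant population size,
    no mutation, no selection. The simulation stops early once the
    allele is lost (frequency 0) or fixed (frequency 1).
    """
    rng = random.Random(seed)
    copies = initial_copies
    trajectory = [copies / total_genes]
    for _ in range(generations):
        copies = next_generation_count(copies, total_genes, rng)
        trajectory.append(copies / total_genes)
        if copies == 0 or copies == total_genes:
            break
    return trajectory


if __name__ == "__main__":
    # Allele M of Fig. 4.1: 2 copies among 10 genes.
    print(simulate_drift(initial_copies=2, total_genes=10, generations=1000))
```

With only 10 genes the trajectory typically ends in loss or fixation within a few tens of generations; increasing `total_genes` makes the fluctuation per generation smaller, in line with the law of large numbers mentioned above.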
Random genetic drift was extensively studied by Sewall Wright and was sometimes called the Wright effect.

Francis Galton, a half cousin of Charles Darwin, was interested in the extinction of surnames and tried to compute the probability of surname extinction. He himself could not reach an appropriate answer, so he asked some mathematicians. Eventually he was satisfied with a solution given by H. W. Watson, who used a generating function, and they published a joint paper in 1874 [9]. Because of this history, the mathematical model they considered is sometimes called the "Galton-Watson process," but usually it is called the "branching process" (see [10] for a detailed description of this process). It may be noted that surnames have been studied in human genetics (e.g., [11]) and in anthropology (e.g., [12]), for their transmission often coincides with Y chromosome transmission. Fisher (1930; [13]) applied this process to obtain the probability that a mutant is ultimately fixed or becomes extinct. Later, in the 1940s, when physicists in the USA developed the atomic bomb, the branching process was used to analyze changes in neutron number (see [14]).

The distribution of the transmission probability of gene copies from parents to offspring is the basis of the branching process. The number of individuals in the population is usually not considered, for this process is mainly applied to the shallow genealogy of mutant gene copies within a large population. In a sense, the branching process is a finite small world in an infinite world.

The Poisson distribution is the default probability distribution for gene copy transmission under random mating. Let us explain why the Poisson distribution arises. We assume a simple reproduction process in which one haploid individual can produce one offspring at each of n time units during its life span, and the probability, p, of reproduction is the same at each time unit (see Fig. 4.4).
The probability Prob[k] of having k offspring during the n time units is given by the following binomial distribution:

$$ \mathrm{Prob}[k] = {}_{n}C_{k} \, p^{k} (1 - p)^{n - k} $$

When n is large and p is small, with the mean number of offspring m = np kept fixed, this binomial distribution is approximated by the Poisson distribution:

$$ \mathrm{Prob}[k] = \frac{m^{k}}{k!} e^{-m} $$

Fisher [13] showed that the mutant is destined to become extinct for m≤1. When m = 1, one may expect that this is a stable situation and that the mutant will continue to survive in the population. However, the population size is assumed to be infinite in the usual branching process, and this causes the mutant gene copy with m = 1 to become extinct. We live in a finite environment, and the branching process under infinite population size is not appropriate when we consider long-term evolution. When m>1, the mutant is advantageous, and the probability of survival becomes positive, as we will see in Chap. 5. Readers interested in the application of the branching process to the fates of mutant genes should refer to Crow and Kimura (1970; [15]).

Although the Poisson distribution is usually assumed for a random mating population, the real probability distribution of gene copy number may be different. In human studies, pedigree data are used to estimate the gene transmission probability. A Kalahari San population (!Kung bushman) was reported to have a bimodal distribution of gene transmission, where the variance is larger than the mean (Howell, 1979; [16]). Interestingly, a Philippine Negrito population was shown to have an approximately Poisson distribution with mean 1.05 (Saitou et al. 1988; [17]).

Mutant gene transmission follows the arrow of time in the branching process. In other words, it is a forward process. However, as we saw, most gene lineages become extinct, and it is not easy to track the lineage which will eventually propagate in the population. Now let us consider a genealogy only for sampled genes. It is natural to look for their ancestral genes, finally going back to the single common ancestral gene. This views a gene genealogy as a backward process.
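Returning briefly to the forward-in-time branching process, the fate of a single mutant gene copy can be sketched numerically. The Python code below is a minimal sketch, assuming that every gene copy leaves a Poisson-distributed number of offspring copies with mean m and that copies reproduce independently. The function names, the generation limit, and the `cap` on lineage size are hypothetical choices made for illustration. The extinction probability is computed as the smallest root of q = exp(m(q − 1)), the standard generating-function condition for a Poisson branching process, by fixed-point iteration.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def lineage_goes_extinct(mean, rng, max_generations=200, cap=1000):
    """Follow the descendants of one mutant copy forward in time.

    Returns True if the lineage dies out. A lineage that exceeds `cap`
    copies is treated as having escaped extinction (an illustrative
    shortcut, not an exact criterion).
    """
    copies = 1
    for _ in range(max_generations):
        if copies == 0:
            return True
        if copies > cap:
            return False
        copies = sum(poisson_sample(mean, rng) for _ in range(copies))
    return copies == 0


def extinction_probability(mean, iterations=10000):
    """Smallest root of q = exp(mean * (q - 1)), iterating from q = 0."""
    q = 0.0
    for _ in range(iterations):
        q = math.exp(mean * (q - 1.0))
    return q


if __name__ == "__main__":
    rng = random.Random(1)
    for m in (0.5, 1.0, 2.0):
        runs = [lineage_goes_extinct(m, rng) for _ in range(500)]
        print(m, sum(runs) / len(runs), extinction_probability(m))
```

For m ≤ 1 the computed extinction probability approaches 1 (slowly for m = 1), while for m > 1 it is clearly below 1, in agreement with the statement attributed to Fisher above.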
When two gene lineages are joined at their common ancestor, this event is called "coalescence" after Kingman (1982; [18]). It should be noted that Hudson [19] and Tajima [20] independently invented essentially the same theory in 1983.

Let us consider Fig. 4.1 again. The two leftmost gene copies coalesce first, followed by coalescence of the two mutant genes shown as full circles. At this moment, there are eight lineages left, and one of them experienced a mutation, shown as a star. After six more coalescent events, at around 2N generations ago, there are only two lineages. It then took another ~2N generations to reach the final, 9th coalescence. If there is no population structure in this organism, the so-called "panmictic" situation, and if there is no change in population size (N), the time to reach the last common ancestral gene, or coalescent time, is expected to be 4N generations ago, according to the coalescent theory of an autosomal locus for diploid organisms.

The simplest coalescent process is pure neutral evolution. Even if mutations accumulate, they do not affect the survival of their offspring lineages. Because of this nature, gene genealogy and mutation accumulation can be considered separately. If natural selection, either negative (purifying) or positive, comes in for some mutant lineages, this independence between the generation of the gene genealogy and mutation accumulation no longer holds.

Another important assumption of the simplest coalescent process is the constant population size, N. In a diploid organism, the number of gene copies for an autosomal locus is 2N, while the number of gene copies for a locus of a haploid organism is N. The former situation is assumed explicitly or implicitly in much of the literature. However, the original lifestyle of organisms is haploid, and many organisms today are haploids. Therefore, we consider the situation in haploid organisms first.
It should be noted that a constant population size is more or less expected if we consider long-term evolution. Otherwise, the species will become extinct or will grow exponentially. Though we, Homo sapiens, are in fact experiencing a population explosion, this is a rather rare situation among species. In short-term evolution, the population size is expected to fluctuate for any organism. Therefore, the assumption of constant population size is not realistic and is made only for mathematical simplicity. We have to be careful about this sort of oversimplified assumption inherent in many evolutionary theories.

There are some more simplifications in the original coalescent theory: discrete generations and random mating. Random mating means that all gene copies are equal in terms of gene transmission to the next generation, and that there is no subpopulation structure within the population of N individuals in question. These assumptions were also used for the Wright-Fisher model.

Let us first consider the coalescence of only two gene copies. What is the probability, Prob[2→1, 1], for 2 genes to coalesce in one generation? If we pick one of these two gene copies arbitrarily, this gene, say, G1, should have its parental gene, PG1, in the previous generation. The other gene, G2, also has its parental gene, PG2. Because all genes are equal in terms of gene transmission probability under our assumption, any of the N genes of the previous generation, including PG1, can be PG2. We should remember Fig. 4.5, where multiple offspring may be produced from one individual during one generation. Therefore, having one offspring G1 does not affect the probability of having another offspring, for these reproductions are independent. It is then obvious that

$$ \mathrm{Prob}[2 \to 1, 1] = \frac{1}{N} $$

The probability of the complementary event, i.e., no coalescence, can be written as Prob[2→2, 1], and

$$ \mathrm{Prob}[2 \to 2, 1] = 1 - \frac{1}{N} $$

We now move to a slightly more complicated situation. What is the probability, Prob[2→1, t], for 2 genes to coalesce exactly after t generations?
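The coalescent event must occur only after (t−1) generations of no coalescence. Thus,

$$ \mathrm{Prob}[2 \to 1, t] = \left(1 - \frac{1}{N}\right)^{t-1} \frac{1}{N} $$

When N is large,

$$ \mathrm{Prob}[2 \to 1, t] \approx \frac{1}{N} e^{-t/N} $$

We can obtain the mean, Mean[2→1, t], and the variance, Var[2→1, t], of the time, t, for coalescence, using this geometric distribution:

$$ \mathrm{Mean}[2 \to 1, t] = \sum_{t=1}^{\infty} t \left(1 - \frac{1}{N}\right)^{t-1} \frac{1}{N} $$

After some transformations,

$$ \mathrm{Mean}[2 \to 1, t] = N $$

The variance of this distribution is

$$ \mathrm{Var}[2 \to 1, t] = \sum_{t=1}^{\infty} t^{2} \left(1 - \frac{1}{N}\right)^{t-1} \frac{1}{N} - \left(\mathrm{Mean}[2 \to 1, t]\right)^{2} $$

It can be shown that

$$ \mathrm{Var}[2 \to 1, t] = N(N - 1) $$

When N>>1, Var[2→1, t] ~ N^2. Therefore, the standard deviation of t is ~N generations, the same as its mean. When a diploid autosomal locus is assumed, the mean and variance are 2N and (2N)^2, respectively.

Let us now consider the coalescent process for n genes sampled from a population of N individuals. We assume n << N. The first step is the probability for two of the n gene copies to coalesce during t generations. The probability for three gene copies to coalesce in one generation is (1/N)^2. If N is large, (1/N)^2 ~ 0, and we can ignore coalescence of more than 2 genes in one generation and focus on coalescence of a single pair of genes. Because there are nC2 [= n(n−1)/2] possible combinations to choose two out of n genes,

$$ \mathrm{Prob}[n \to n-1, 1] \approx \frac{{}_{n}C_{2}}{N} = \frac{n(n-1)}{2N} $$

We can thus generalize Eq. 4.7 to consider the probability that 2 genes among the n genes sampled coalesce exactly after t generations as

$$ \mathrm{Prob}[n \to n-1, t] = \left(1 - \frac{{}_{n}C_{2}}{N}\right)^{t-1} \frac{{}_{n}C_{2}}{N} $$

The mean of t under this distribution is

$$ \mathrm{Mean}[n \to n-1, t] = \frac{N}{{}_{n}C_{2}} = \frac{2N}{n(n-1)} $$

We can then obtain the mean or expected time of coalescence from the current generation of n genes to the single common ancestral gene by summing the means above:

$$ \mathrm{Mean}[n \to 1, t] = \sum_{i=2}^{n} \frac{2N}{i(i-1)} = 2N\left(1 - \frac{1}{n}\right) $$

If n is large,

$$ \mathrm{Mean}[n \to 1, t] \approx 2N $$

When diploid autosomal genes are considered, this approximate mean becomes 4N, and the variance of the coalescent time, when n is large, is given by Tajima [20]:

$$ \mathrm{Var}[n \to 1, t] \approx 16N^{2} \sum_{i=2}^{\infty} \frac{1}{i^{2}(i-1)^{2}} = 16N^{2}\left(\frac{\pi^{2}}{3} - 3\right) \approx 4.64N^{2} $$

If n is not much different from N, that is, if almost exhaustive sampling was conducted, the possibility of coalescence of three or more gene copies into one gene copy within one generation is no longer negligible, and Eq. 4.13 and later equations do not hold any more. We need to consider "exact" coalescence. The following explanation is after Fu (2006; [21]).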
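Before turning to the exact coalescent, the approximate result above can be checked numerically. The following Python code is a minimal sketch, assuming n << N, constant population size, and at most one pairwise coalescence per generation. The argument `gene_copies` stands for N in a haploid population or 2N at a diploid autosomal locus, and the function names are hypothetical. The waiting time while i lineages remain is drawn from a geometric distribution with success probability iC2/gene_copies, and the waiting times are summed from i = n down to i = 2.

```python
import random


def expected_coalescent_time(n, gene_copies):
    """Sum of the mean waiting times gene_copies / C(i, 2) for i = n, ..., 2.

    Algebraically this equals 2 * gene_copies * (1 - 1/n).
    """
    return sum(2.0 * gene_copies / (i * (i - 1)) for i in range(n, 1, -1))


def simulate_coalescent_time(n, gene_copies, rng):
    """Simulate the time (in generations) back to the common ancestor.

    While i lineages remain, some pair coalesces in a given generation
    with probability C(i, 2) / gene_copies (valid when n << gene_copies).
    """
    total = 0
    for i in range(n, 1, -1):
        p = i * (i - 1) / (2.0 * gene_copies)
        waiting = 1
        while rng.random() >= p:  # no coalescence in this generation
            waiting += 1
        total += waiting
    return total


if __name__ == "__main__":
    rng = random.Random(42)
    n, gene_copies = 5, 100
    times = [simulate_coalescent_time(n, gene_copies, rng) for _ in range(2000)]
    print("simulated mean:", sum(times) / len(times))
    print("expected mean: ", expected_coalescent_time(n, gene_copies))
```

The simulated mean should lie close to 2N(1 − 1/n) for a haploid population. The spread of individual replicates is large, as expected from the variance being of the order of N^2.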
If we consider a randomly mating population with constant size N, each gene copy in the present population was sampled from the N gene copies of the previous generation with replacement. Therefore, if we choose one particular gene copy, say, copy ID 1, from the present population, the probability of its transmission from a specific gene copy of the previous generation is 1/N. Then the probability that gene copy ID 2 from the present population does not share the same parental copy with copy ID 1 is 1 − [1/N]. We then go to the next situation, in which gene copy ID 3 from the present population shares its parental gene copy with neither ID 1 nor ID 2. Its probability becomes 1 − [2/N]. Applying a similar argument to IDs 4 to n (n≤N), the probability, Prob[n→n, 1], that no two gene copies of the present generation share a parental gene copy in the previous generation becomes

$$ \mathrm{Prob}[n \to n, 1] = \prod_{i=1}^{n-1} \left(1 - \frac{i}{N}\right) $$

Therefore, the probability corresponding to Eq. 4.14 under the exact coalescent, in which n gene copies at the present generation will coalesce to m (< n) [...]

[...] Dn for the majority of protein coding genes. It should be noted that the rate of synonymous substitutions may not be identical to the mutation rate, for biases of codon usage exist (Ikemura 1985; [39]). We will discuss the consequences of this sort of purifying selection on synonymous substitutions in Chap. 5.

Susumu Ohno characterized mammalian genomes with the phrase "So much 'junk' DNA in our genome" as early as 1972 [40]. Junk DNA means functionless DNA. In fact, only 1.5 % of the human genome is used for protein coding [41], and the rest is mostly junk: interspersed repeats (LINEs and SINEs), microsatellites, other intergenic regions, and introns (see Chap. 10). It is true that a small fraction of noncoding genomic regions are highly conserved [42, 43], and they are expected to have some function, for example as enhancers. Even some SINEs are known to have obtained an important function during mammalian evolution [44, 45].
It is still true that the majority of noncoding genomic regions are functionless and just junk DNA. Recently there have been some reports of transcription in many noncoding regions [46, 47]. However, these results were obtained with problematic ChIP-chip techniques and were found to be artifacts [48] when checked with ChIP-seq techniques.

Because the f value of Eq. 4.38 is 1 for junk DNA and for synonymous sites, their evolutionary rates are expected to be similar, if we ignore heterogeneity of mutation rates within one genome. In fact, the number (~0.15) of nucleotide substitutions per site in intergenic regions between the mouse and rat genomes was shown to be quite similar to that of synonymous substitutions ([37]). If we ignore the small portion of functional DNA that is highly conserved among diverse organisms, the majority (more than 90%; see Babarinde and Saitou 2013 [56]) of mammalian, or of all vertebrate, genomes is junk DNA. Therefore, the genome-wide divergence of two species is a good approximation of the consequence of pure neutral evolution.

Pseudogenes are DNA sequences which are homologous to functional genes but are themselves no longer functional. For example, if there are frameshift mutations and/or stop codons in a DNA sequence highly homologous to a known functional gene, it is called a "pseudogene," for a functional protein is not expected to be formed. Pseudogenes are therefore often products of gene duplications. Because of their nonfunctional nature, pseudogenes should be genuine members of junk DNA. Figure 4.19 shows one of the initial analyses of pseudogene evolution by Li, Gojobori, and Nei (1980; [49]).

There are four types of gene duplication (see Chap. 2). Among them, RNA-mediated duplication produces intronless sequences via reverse transcription of mRNAs. These cDNAs will be integrated into a DNA region unrelated to their place of origin, which is where the series of gene regulatory sequences exist. Therefore, such cDNA inserts are almost always 'dead on arrival'.
We can see a clear enhancement of evolutionary rate for intronless, or processed, pseudogenes in the case of the mouse p53 gene. The estimated numbers of nucleotide substitutions between M. musculus and M. leggada are 0.0157 and 0.0651 for functional genes and pseudogenes, respectively (data from [...]).

Nonfunctionalization can happen without gene duplication. Vitamins are molecules that exist in small quantities but are essential for organisms, especially humans, to survive. By definition, vitamins are not produced by the organism itself, and they should be taken in as a part of food. Their very existence is enigmatic, for these molecules come from other organisms which produce them. If vitamins are so important, why are they not produced by a certain species such as human? The neutral theory of evolution easily resolves this paradox. If vitamins are abundant in everyday foods, even mutants with no ability to produce a certain vitamin are selectively neutral compared to wild types with the ability to produce that vitamin through the existing enzymatic pathway.

Vitamin C, or ascorbic acid, is a good example. If appropriate intake of vitamin C is stopped for a long time, humans develop scurvy. King and Jukes (1969; [50]) already predicted that the lack of ascorbic acid production could be explained by assuming neutral evolution. Not only humans but also all primates other than prosimians, as well as elephants, guinea pigs, and fruit bats, lack the ability to produce ascorbic acid [51]. Medaka, a teleost fish, also does not produce ascorbic acid [52]. In fact, nonfunctionalization of the L-gulono-γ-lactone oxidase (enzyme number E.C.1.1.3.8) gene was confirmed by Nishikimi and his collaborators [53].

A more drastic situation of pseudogene formation without gene duplication is found in parasitic bacterial genomes. Mycobacterium leprae, the causative bacterium of leprosy, was found to have many pseudogenes in its genome ([54]).
This is because this bacterium hides deep in the host body and receives many nutrients from the host.

A gene function is often quite complex, and it is not easy to determine whether a "pseudogene" is really nonfunctional. Even if a protein is not produced, the mRNA or even the DNA sequence itself may still have some function. Therefore, when we discuss the evolution of pseudogenes, it may be too simplistic to assume that f, the fraction of neutral mutations, is 1 for a pseudogene. A "pseudogene" with some function is not surprising, for pseudogenes were named so only on the basis of sequence comparison.

So far, we have discussed the evolution of nucleotide or amino acid sequences and saw that the fixation of selectively neutral mutations is the major process of evolution. It is thus natural to expect that evolution at the macroscopic, or so-called phenotypic, level also proceeds in a mostly neutral fashion. Unfortunately, this logically derived conjecture seems not to be shared by many evolutionary biologists. Ever since Charles Darwin, many biologists have been enchanted by seemingly powerful positive selection. They are biologists who study the macroscopic morphology of organisms, animal behavior, developmental processes, and so on. As we will see in Chap. 5, we should be careful when discussing adaptation without clear demonstration at the molecular level.

It may still be optimistic to expect a rapid expansion of our knowledge of the genetic basis of developmental and behavioral traits in the near future. However, modern biology is proceeding in this direction, and I personally hope that the superficial dichotomy between molecules (genotypes) and phenotypes will disappear sooner or later. Evolutionary genomics is at the foundation of this edifice of modern biology. It should be added that Nei's (2013) recent book "Mutation-driven evolution" ([58]) covers many interesting topics related to this chapter.
[1] The neutral theory of molecular evolution
[2] Molecular evolutionary genetics
[3] Initial sequencing and comparative analysis of the mouse genome
[4] Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution
[5] Evolutionary rate at the molecular level
[6] The social amoebae: The biology of cellular slime molds
[7] Asexual reproduction: A further consideration
[8] Mathematical population genetics
[9] On the probability of the extinction of families
[10] Branching processes: Variation, growth, and extinction of populations
[11] The estimation of inbreeding from isonymy
[12] An attempt to estimate the migration pattern in Japan by surname data
[13] The distribution of gene ratios for rare mutations
[14] Introduction to probability theory and its applications
[15] An introduction to population genetics theory
[16] Demography of the Dobe !Kung
[17] On the effect of the fluctuating population size on the age of a mutant gene
[18] On the genealogy of large populations
[19] Testing the constant rate neutral allele model with protein sequence data
[20] Evolutionary relationship of DNA sequences in finite populations
[21] Exact coalescent for the Wright-Fisher model
[22] Gene genealogies, variation, and evolution: A primer in coalescent theory
[23] Coalescent theory: An introduction
[24] Solution of a process of random genetic drift with a continuous model
[25] Diffusion models in population genetics
[26] Protein polymorphism as a phase of molecular evolution
[27] The number of alleles that can be maintained in a finite population
[28] The number of heterozygous nucleotide sites maintained in a finite population due to steady flux of mutations
[29] Genetic variability maintained in a finite population due to mutational production of neutral and nearly neutral isoalleles
[30] Detection of large-scale variation in the human genome
[31] Large-scale copy number polymorphism in the human genome
[32] Evolutionary divergence and convergence in proteins
[33] Mutation and evolution at the molecular level
[34] Evidence for higher rates of nucleotide substitution in rodents than in man
[35] Rates of nucleotide substitution are evidently higher in rodents than in man
[36] Evolutionary and biomedical insights from the rhesus macaque genome
[37] Contribution of Asian mouse subspecies Mus musculus molossinus to genomic constitution of strain C57BL/6J, as defined by BAC end sequence-SNP analysis
[38] The lens protein alpha A-crystallin of the blind mole rat, Spalax ehrenbergi: Evolutionary change and functional constraints
[39] Codon usage and tRNA content in unicellular and multicellular organisms
[40] So much "junk" DNA in our genome
[41] Finishing the euchromatic sequence of the human genome
[42] Ultraconserved elements in the human genome
[43] Identification and characterization of lineage-specific highly conserved noncoding sequences in mammalian genomes
[44] A distal enhancer and an ultraconserved exon are derived from a novel retroposon
[45] Possible involvement of SINEs in mammalian-specific brain formation
[46] Dark matter in the genome: Evidence of widespread transcription detected by microarray tiling experiments
[47] Identification and analysis of functional elements in 1 % of the human genome by the ENCODE pilot project
[48] Most "dark matter" transcripts are associated with known genes
[49] Pseudogenes as paradigm of the neutral evolution
[50] Non-Darwinian evolution
[51] Biochemistry
[52] Transgenic expression of L-gulono-gamma-lactone oxidase in medaka (Oryzias latipes), a teleost fish that lacks this enzyme necessary for L-ascorbic acid biosynthesis
[53] Cloning and chromosomal mapping of the human nonfunctional gene for L-gulono-gamma-lactone oxidase, the enzyme for L-ascorbic acid biosynthesis missing in man
[54] Massive gene decay in the leprosy bacillus
[55] Genomu Shinkagaku Nyumon (written in Japanese, meaning 'Introduction to evolutionary genomics')
[56] Heterogeneous tempo and mode of conserved noncoding sequence evolution among four mammalian orders
[57] The presence/absence polymorphism and evolution of p53 pseudogene within the genus Mus
[58] Mutation-driven evolution
[59] Allelic genealogy and human evolution