Molecular Population Genetics of Xdh and the Evolution of Base Composition in Drosophila
David J. Begun (a, b) and Penn Whitley (a)
(a) Section of Integrative Biology, University of Texas, Austin, Texas 78712
(b) Section of Evolution and Ecology, University of California, Davis, California 95616
Corresponding author: David J. Begun, University of California, Davis, CA 95616. E-mail: djbegun@ucdavis.edu
Communicating editor: W. STEPHAN
| ABSTRACT |
|---|
Few loci have been measured for DNA polymorphism and divergence in several species. Here we report such data from the protein-coding region of xanthine dehydrogenase (Xdh) in 22 species of Drosophila. Many of our samples were from closely related species, allowing us to confidently assign substitutions to individual lineages. Surprisingly, Xdh appears to be fixing more A/T mutations than G/C mutations in most lineages, leading to evolution of higher A/T content in the recent past. We found no compelling evidence for selection on protein variation, though some aspects of the data support the notion that a significant fraction of amino acid polymorphisms are slightly deleterious. Finally, we found no convincing evidence that levels of silent heterozygosity are associated with rates of protein evolution.
The nucleotide substitution process may be affected by many biological and statistical phenomena. The complexity of the process is probably a major reason why our understanding of it is only rudimentary. Discussion of the possible effect of effective population size (NE) on evolutionary rates has been ongoing for many decades.
Molecular population genetic data from Drosophila melanogaster and D. simulans have revealed several interesting differences between these lineages, many of which have been interpreted in terms of differences in NE.
However, the fact that differences in NE may be consistent with patterns of nucleotide variation in melanogaster and simulans provides only weak supporting evidence for the population-size hypothesis because the species may differ in many ways besides population size. This leaves room for an agnostic position on the cause of the genomic differences between these two species and motivates the study presented here: an analysis of silent and replacement variation at a single locus in population samples from several Drosophila species. We specifically selected groups of closely related species for most of our analyses. There were two primary motivations for this sampling strategy. First, low sequence divergence between species allows us to infer individual substitutions occurring on single evolutionary lineages. This reduces our dependence on uncertain models of the substitution process and allows us to test hypotheses by counting polymorphic and fixed mutations instead of estimating parameters of molecular evolution over longer time periods. Second, our sampling strategy provides an opportunity to investigate the connection between polymorphism and divergence for a homologous region of DNA in many species. We selected xanthine dehydrogenase (Xdh, corresponding to the ry locus of D. melanogaster) for this study. One factor motivating this choice was the allozyme (e.g., Figure 9.3 of [...]
| MATERIALS AND METHODS |
|---|
Samples:
Flies/DNA used for isolating Xdh alleles, and the number of alleles isolated from each species, are given in Table 1.
PCR, cloning, and sequencing:
Several PCR primers or degenerate PCR primers were designed from regions of the Xdh protein that were highly conserved among D. melanogaster, D. pseudoobscura, and Bombyx mori. These regions spanned residues 142-150 or 206-215 of the D. melanogaster Xdh protein for forward primers and residues 737-749 of the melanogaster protein for reverse primers. Several combinations of primer pairs were used to amplify Xdh fragments from different species. For species in which PCR was only marginally effective, a single allele was isolated from a gel-purified PCR product, cloned, and sequenced. These data were subsequently used to design species-specific primers. DNA isolated from single wild-caught flies was used in PCR for species provided by W. Etges. For lines established by us or provided by the Species Center, DNA was isolated from single flies taken from isofemale lines. PCR was carried out on DNA from a multifly prep provided by J. Powell [...]
Approximately 1500 bases were sequenced for each allele. This encompasses ~38% of the 1335-amino-acid-long melanogaster Xdh protein. The error rate of the polymerase is given as ~3 × 10⁻⁷ (Boehringer-Mannheim product literature), so we expect very few errors in the ~1.8 × 10⁵ bp (120 alleles × ~1500 bp/allele) of reported sequence. DnaSP (v. 3.51; ROZAS and ROZAS 1999) [...]
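The arithmetic behind the "very few errors" expectation can be sketched as follows. This is a minimal illustration that takes the quoted polymerase error rate at face value as a per-base probability; it ignores any accumulation of errors over PCR cycles, which is a simplifying assumption of the sketch and not something stated in the text.

```python
# Values quoted in the text; treating the rate as a flat per-base
# probability is an assumption made only for this illustration.
ERROR_RATE_PER_BASE = 3e-7
N_ALLELES = 120
BASES_PER_ALLELE = 1500


def expected_errors(n_alleles, bases_per_allele, error_rate):
    """Return (total bases sequenced, expected number of polymerase errors)."""
    total = n_alleles * bases_per_allele
    return total, total * error_rate


total_bases, n_errors = expected_errors(
    N_ALLELES, BASES_PER_ALLELE, ERROR_RATE_PER_BASE
)
print(f"{total_bases} bp sequenced, {n_errors:.3f} errors expected")
```

Under these assumptions the expected number of polymerase-induced errors in the whole data set is well below one.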
Polarizing mutations:
In most cases, phylogenetic relationships based on previously published data (summarized in [...]
| RESULTS |
|---|
Heterozygosity:
Table 2 shows estimates of silent and replacement heterozygosity at the Xdh locus for each of 22 species of Drosophila. Divergence among closely related species for silent and replacement sites is shown in Table 3. Very few sites are polymorphic in both members of a pair of sister taxa (with the exception of pseudoobscura and persimilis), suggesting that polymorphism data from each species can be considered independent. Silent heterozygosity varies from a high of 0.064 to a low of 0.008; replacement heterozygosity varies from a high of 0.0080 to a low of 0.0014. Average heterozygosity is not significantly heterogeneous across species groups for silent sites (Kruskal-Wallis test, P = 0.09; species for which n < 4 are omitted) or replacement sites (Kruskal-Wallis test, P = 0.60; species for which n < 4 are omitted). Under a neutral model of evolution with a homogeneous mutation rate, replacement and silent heterozygosity should be positively correlated across species. Our estimates of silent and replacement heterozygosity for 17 species were positively correlated (Spearman's ρ = 0.56), and the correlation was significantly different from zero (P = 0.02; Fig. 1). The ratio of replacement to silent heterozygosity was not significantly heterogeneous across the repleta, obscura, virilis, and willistoni species groups (Table 2 and Table 4; Kruskal-Wallis test, P = 0.35).
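For readers who want to reproduce this kind of rank correlation, a minimal sketch of Spearman's ρ in plain Python follows. The heterozygosity values are invented for illustration and are not the Table 2 estimates; the function names are ours. The sketch returns only the coefficient, so a significance test would still require a table or a statistics package.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-species heterozygosities (not the paper's data).
silent = [0.010, 0.020, 0.030, 0.040, 0.050]
replacement = [0.002, 0.001, 0.004, 0.003, 0.005]
print(round(spearman_rho(silent, replacement), 3))
```

Working on ranks makes the statistic insensitive to the very different scales of silent and replacement heterozygosity.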
Frequency spectrum:
Table 5 shows summaries of the frequency spectrum of polymorphism as estimated by Tajima's D (TAJIMA 1989) [...]
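As a reminder of what Tajima's D summarizes, the following minimal sketch computes the statistic from a set of aligned sequences using the standard formulas of TAJIMA (1989). The toy alignments are invented, and the function treats every column as a site of the same class; applying it separately to silent and replacement sites, as in the analyses here, would require partitioning the alignment first, which the sketch does not do.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Tajima's D for equal-length aligned sequences; None if no variation."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least 4 sequences are needed")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # Number of segregating sites, S.
    seg = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg == 0:
        return None

    # Mean number of pairwise differences, pi.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    return (pi - seg / a1) / math.sqrt(e1 * seg + e2 * seg * (seg - 1))


# Toy alignments: all variants rare vs. all variants at intermediate frequency.
rare_variants = ["AAAA", "TAAA", "ATAA", "AATA"]
common_variants = ["AAA", "AAA", "TTT", "TTT"]
print(tajimas_d(rare_variants), tajimas_d(common_variants))
```

An excess of rare variants gives a negative D, and an excess of intermediate-frequency variants gives a positive D, which is the sense in which D describes skew in the frequency spectrum.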
Substitutions:
Substitutions between pairs of closely related species can be assigned to individual lineages under parsimony given an outgroup and a single mutation in the history of the site under consideration. Inference of the ancestral state for a pair of species should be reliable at the low level of sequence divergence observed for many of our species pairs (Table 3). Numbers of silent and replacement substitutions occurring on individual lineages are given in Table 6. The contingency table of 16 species by silent vs. replacement fixations is only marginally significantly heterogeneous (G-test, P = 0.043). The ratio of replacement to silent fixations is not significantly heterogeneous across the repleta, obscura, virilis, and willistoni species groups (G-test, P = 0.83). Similarly, the ratio of replacement to silent fixations is not significantly different from the ratio of replacement to silent polymorphisms for any species group.
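The parsimony rule described above can be sketched in a few lines. This is an illustrative sketch only: the function names and the toy sequences are hypothetical, each species is represented by a single fixed state per site, and sites at which all three taxa differ are simply flagged as ambiguous.

```python
def assign_lineage(base_a, base_b, base_out):
    """Return 'A', 'B', 'ambiguous', or None for one aligned site.

    base_a and base_b are the states fixed in two sister species and
    base_out is the outgroup state. Assumes at most one mutation at the
    site, so the sister state that matches the outgroup is ancestral.
    """
    if base_a == base_b:
        return None          # no fixed difference at this site
    if base_out == base_a:
        return "B"           # A kept the ancestral state; B changed
    if base_out == base_b:
        return "A"           # B kept the ancestral state; A changed
    return "ambiguous"       # three states: parsimony cannot polarize


def count_lineage_changes(seq_a, seq_b, seq_out):
    """Tally inferred substitutions per lineage across aligned sequences."""
    if not (len(seq_a) == len(seq_b) == len(seq_out)):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"A": 0, "B": 0, "ambiguous": 0}
    for site in zip(seq_a, seq_b, seq_out):
        lineage = assign_lineage(*site)
        if lineage is not None:
            counts[lineage] += 1
    return counts


# Hypothetical aligned fragments, not taken from the Xdh data.
print(count_lineage_changes("TTGCCAA", "ATGCTAC", "ATGCCGG"))
```

Counting substitutions this way avoids fitting a substitution model, which is why it is only appropriate when divergence is low enough that multiple hits at a site are unlikely.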
Correlation between heterozygosity and protein divergence:
If silent mutations are neutral, silent site heterozygosity can be used as an estimator of NE. Under this premise we could investigate whether effective population size is correlated with the ratio of replacement to silent fixations along a lineage. We proceed under the assumption that silent heterozygosity is highly positively correlated with NE, though we acknowledge that selection may result in a complex relationship between silent heterozygosity and NE [...]
[...] vs. the ratio of replacement to silent fixations; each point represents the heterozygosity and the ratio of replacement to silent fixations for a single species. We find no evidence for a correlation between these two variables (Spearman's ρ = -0.12, P = 0.66). However, the fact that several lineages have fixed a small number of mutations (Table 6) would inflate the sampling variance of the ratio of replacement to silent fixations, thereby reducing our power to detect any underlying relationship between this ratio and other variables. We consider eight independent pairwise comparisons of silent and replacement divergence to reduce the severity of this problem (Table 3). For each species pair we compared average silent heterozygosity to the ratio of replacement to silent divergence. There is no significant correlation (Spearman's ρ = 0.17, P = 0.66). However, data from the willistoni/equinoxialis pair appear to be quite different from the other groups in that the level of polymorphism is unusually high. Data from the seven remaining species pairs reveal a strong, nearly significant positive correlation (Spearman's ρ = 0.75, P = 0.066) between average silent heterozygosity and protein divergence (Fig. 3).
Evolution of codon bias and base composition:
Preferred and unpreferred mutants are hypothesized to be slightly beneficial alleles and slightly deleterious alleles, respectively. Silent polymorphisms and fixations were categorized as preferred or unpreferred and polarized using parsimony.
Under an equilibrium model of codon bias evolution, one expects equal numbers of unpreferred and preferred fixations. Our fixation data, pooled across lineages (Table 7), clearly deviate from this expectation (binomial probability, P < 0.001). This result is not attributable to a small number of lineages that deviate strongly from the equilibrium prediction, but rather results from a general trend toward unpreferred fixations in most lineages: 14 of 18 lineages have fixed more unpreferred than preferred mutations.
Similarly, there is a strong trend in the direction of excess unpreferred polymorphisms. Seventeen of 18 species have greater numbers of unpreferred than preferred polymorphisms. Under the neutral equilibrium model, the proportion of unpreferred to preferred polymorphisms should be equal to the proportion of unpreferred to preferred fixations. Comparison of unpreferred and preferred polymorphisms and fixations in simulans and melanogaster showed that simulans has a significant excess of unpreferred polymorphisms, while the polymorphic and fixed mutants in melanogaster were compatible with the strictly neutral model.
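The comparison of preferred and unpreferred mutations among polymorphisms and fixations amounts to a test of independence on a 2 x 2 table. A minimal sketch of a G-test for such a table follows. The counts are invented for illustration, no continuity or Williams correction is applied, and the P value uses the chi-square distribution with one degree of freedom; whether the original analyses applied a correction is not stated here, so this should be read as one plausible form of the test.

```python
import math


def g_test_2x2(table):
    """G-test of independence for a 2 x 2 table of counts.

    table is ((a, b), (c, d)). Returns (G, P), with P taken from the
    chi-square distribution with 1 degree of freedom.
    """
    (a, b), (c, d) = table
    total = a + b + c + d
    row_sums = (a + b, c + d)
    col_sums = (a + c, b + d)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # a zero cell contributes nothing to G
                expected = row_sums[i] * col_sums[j] / total
                g += 2 * observed * math.log(observed / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding error
    # Survival function of chi-square with 1 d.f.: erfc(sqrt(G / 2)).
    return g, math.erfc(math.sqrt(g / 2))


# Hypothetical counts: rows are polymorphic and fixed mutations,
# columns are unpreferred and preferred mutations.
g_stat, p_value = g_test_2x2(((30, 10), (10, 30)))
print(round(g_stat, 2), p_value)
```

Under the neutral equilibrium expectation the two rows have the same proportions, G is close to zero, and P is close to one.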
A caveat regarding these conclusions comes from separate analysis of data from the repleta group, which are consistent with the aforementioned results primarily because of the very large number of unpreferred fixations in the eremophila lineage (Table 7). If these data are omitted, the remaining repleta group data appear to differ from data from other species groups. Polymorphic and fixed, preferred and unpreferred mutations are significantly heterogeneous (P < 0.001). Furthermore, there is no excess of unpreferred fixations (25 preferred vs. 37 unpreferred, binomial probability, P = 0.08). Of the 8 non-eremophila repleta lineages, only 4 (Table 6) have fixed more unpreferred than preferred mutations [i.e., all 4 lineages (of 18 total lineages) that have not fixed more unpreferred mutations are from the repleta group].
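The binomial probabilities quoted for these counts can be checked with a short calculation. The sketch below computes the upper tail of a binomial distribution with equal probabilities of preferred and unpreferred outcomes; treating the tests as one-tailed is our assumption, since the sidedness is not stated in the text. Under that assumption, 37 unpreferred fixations out of 62 give a probability of roughly 0.08.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 14 of 18 lineages fixed more unpreferred than preferred mutations.
p_lineages = binomial_upper_tail(14, 18)

# Non-eremophila repleta lineages: 37 unpreferred vs. 25 preferred fixations.
p_repleta = binomial_upper_tail(37, 62)

print(f"lineage sign test: {p_lineages:.4f}")
print(f"repleta fixations: {p_repleta:.3f}")
```

A two-tailed version would simply double these values for a symmetric null hypothesis.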
All preferred melanogaster codons end in G or C.
A smaller NE in melanogaster than in simulans was proposed as a possible explanation for the much higher proportion of unpreferred fixations in the former [...] (Spearman's ρ = 0.29, P = 0.25). In fact, the two willistoni group species have very high levels of silent heterozygosity, yet show low levels of codon bias at Xdh.
| DISCUSSION |
|---|
Silent substitutions at Xdh are biased toward unpreferred mutations. The two best-studied Drosophila species, melanogaster and simulans, have each fixed more unpreferred than preferred mutations.
We can entertain at least three kinds of explanations for evolution of A/T content. First, lineages may evolve in response to a change of A/T mutation bias in an ancestor. Second, lineages may evolve in response to selection favoring A/T (this requires independent selection favoring A/T in different lineages). Finally, A/T accumulation may reflect fixation of slightly deleterious (unpreferred) mutations by genetic drift. This final hypothesis may seem unlikely for our Xdh data because it invokes reduction of fitness in several Drosophila lineages that are biologically and historically distinct. Nevertheless, global factors (e.g., temperature) reducing Drosophila population sizes at some point during the last several million years could have promoted fixation of very slightly deleterious alleles, as suggested by [...]
Although there are broad generalizations to be gleaned from the data, it also seems clear that there may be strong lineage effects on patterns of base composition evolution. For example, eremophila appears to have experienced an atypically large excess of unpreferred fixations relative to the other repleta lineages. This historical inference is consistent with the fact that eremophila Xdh is the least biased of the repleta group samples. Comparison of the repleta group to other species groups also supports the importance of lineage effects. Excluding eremophila, the repleta group shows no significant accumulation of unpreferred fixations. This is in contrast to pooled data from the other species groups, which show a highly significant excess of unpreferred fixations. Patterns of codon bias are consistent with this inference in that codon bias of repleta species (mean ENC = 39.8) is greater than that of other groups (mean obscura ENC = 45.3; mean virilis ENC = 44.2; mean willistoni ENC = 55.5). Major lineage effects can also be inferred simply by noting that A/T content at third positions among species for this region of Xdh ranges from 16.5 to 54.9%. Given that there was a single common ancestor with a particular A/T content, the wide range of current A/T contents is indicative of a heterogeneous substitution process.
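The third-position A/T content referred to above is a simple summary of a coding sequence. A minimal sketch of the calculation follows; the function name and the example sequence are hypothetical, and the sequence is assumed to be in frame with no gaps or ambiguity codes.

```python
def at3_content(coding_sequence):
    """Fraction of codon third positions occupied by A or T."""
    sequence = coding_sequence.upper()
    if len(sequence) == 0 or len(sequence) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    third_positions = sequence[2::3]  # every third base, starting at index 2
    return sum(base in "AT" for base in third_positions) / len(third_positions)


# Hypothetical in-frame fragment: ATG GCT AAA TTC.
print(at3_content("ATGGCTAAATTC"))
```

Because most third-position changes are silent, this quantity tracks the outcome of the silent substitution process more directly than overall base composition does.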
Previously collected data on polymorphism and divergence at silent sites suggested the presence of excess unpreferred (i.e., putatively borderline deleterious) polymorphisms in simulans, but not in melanogaster.
Under a neutral model with homogeneous mutation rates, the fraction of new silent and replacement mutants that are neutral is expected to be the same across populations of different effective sizes. As predicted under this model, we observed a positive correlation between silent and replacement heterozygosity. However, other aspects of the data suggest the possibility of different dynamics of protein vs. silent variants. First, compared to silent polymorphisms, replacement polymorphisms consistently show greater skew toward rare alleles. Second, there is a marginally significant negative correlation between silent heterozygosity and Tajima's D for replacement polymorphisms (Fig. 6; Spearman's ρ = -0.51, P = 0.04). In other words, populations that harbor more silent variation tend to have amino acid polymorphisms that are more highly skewed toward rare alleles. This is the pattern one might expect if silent polymorphism were more highly correlated with NE than was replacement polymorphism and if selection against slightly deleterious amino acid mutations were more effective in larger populations.
The analyses presented here are, to our knowledge, the first attempt to investigate whether nucleotide substitution rates differ along lineages leading to more polymorphic vs. less polymorphic species (but see [...]
[...] could limit our power to detect a correlation. Finally, Xdh amino acid variants could be under strong directional selection only in some lineages and/or at some times. It seems that the attempt to rule out certain models of molecular evolution using associations of heterozygosity and replacement substitution rates will be difficult.
| FOOTNOTES |
|---|
Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession nos. AF543072, AF543189.
| ACKNOWLEDGMENTS |
|---|
We thank the individuals listed in Table 1 for providing flies or DNA. J. Gillespie, C. Langley, and two anonymous reviewers provided useful comments. This work was supported by the National Institutes of Health, the National Science Foundation, and a Sloan Young Investigator Award in Molecular Evolution.
Manuscript received January 21, 2002; Accepted for publication September 19, 2002.
| APPENDIX |
|---|
|
| LITERATURE CITED |
|---|
AKASHI, H., 1995 Inferring weak selection from patterns of polymorphism and divergence at silent sites in Drosophila. Genetics 139:1067-1076.
AKASHI, H., 1996 Molecular evolution between Drosophila melanogaster and D. simulans: reduced codon bias, faster rates of amino acid substitution, and larger proteins in D. melanogaster. Genetics 144:1297-1307.
AKASHI, H., 1999 Inferring the fitness effects of DNA mutations from polymorphism and divergence data: statistical power to detect directional selection under stationarity and free recombination. Genetics 151:221-238.
AKASHI, H. and S. W. SCHAEFFER, 1997 Natural selection and the frequency distributions of "silent" DNA polymorphism in Drosophila. Genetics 146:295-307.
ANDOLFATTO, P., 2001 Contrasting patterns of X-linked and autosomal nucleotide variation in Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 18:279-290.
AQUADRO, C. F., K. M. LADO, and W. A. NOON, 1988 The rosy region of Drosophila melanogaster and Drosophila simulans. I. Contrasting levels of naturally occurring DNA restriction map variation and divergence. Genetics 119:875-878.
BEGUN, D. J., 1996 Population genetics of silent and replacement variation in Drosophila simulans and D. melanogaster: X/autosome differences? Mol. Biol. Evol. 13:1405-1407.
BEGUN, D. J., 2001 The frequency distribution of nucleotide variation in Drosophila simulans. Mol. Biol. Evol. 18:1343-1352.
BERGMAN, C. M. and M. KREITMAN, 2001 Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences. Genome Res. 11:1335-1345.
CHERRY, J. L., 1998 Should we expect substitution rate to depend on population size? Genetics 150:911-919.
CHOUDHARY, M. and R. S. SINGH, 1987 A comprehensive study of genic variation in Drosophila melanogaster. III. Variations in genetic structure and their causes between Drosophila melanogaster and its sibling species Drosophila simulans. Genetics 117:697-710.
FISHER, R. A., 1958 The Genetical Theory of Natural Selection. Dover Publications, New York.
GILLESPIE, J. H., 1999 The role of population size in molecular evolution. Theor. Popul. Biol. 55:145-156.
GILLESPIE, J. H., 2000 Genetic drift in an infinite population: the pseudohitchhiking model. Genetics 155:909-919.
GILLESPIE, J. H., 2001 Is the population size of a species relevant to its evolution? Evolution 55:2161-2169.
GLEASON, J. M. and J. R. POWELL, 1997 Interspecific and intraspecific comparisons of the period locus in the Drosophila willistoni sibling species. Mol. Biol. Evol. 14:741-753.
HEY, J. and R. M. KLIMAN, 1993 Population genetics and phylogenetics of DNA sequence variation at multiple loci within the Drosophila melanogaster species complex. Mol. Biol. Evol. 10:804-822.
KIMURA, M., 1983 The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, UK.
KLIMAN, R. M., P. ANDOLFATTO, J. A. COYNE, F. DEPAULIS, and M. KREITMAN et al., 2000 The population genetics of the origin and divergence of the Drosophila simulans complex species. Genetics 156:1913-1931.
KREITMAN, M. and M. ANTEZANA, 2000 Population and evolutionary genetics of codon usage in Drosophila, pp. 82-101 in Evolutionary Genetics: From Molecules to Morphology, edited by R. SINGH and C. KRIMBAS. Cambridge University Press, Oxford.
MCVEAN, G. A. T. and B. CHARLESWORTH, 1999 A population genetic model for the evolution of synonymous codon usage: patterns and predictions. Genet. Res. 74:145-158.
MCVEAN, G. A. T. and J. VIEIRA, 1999 The evolution of codon preference in Drosophila: a maximum-likelihood approach to parameter estimation and hypothesis testing. J. Mol. Evol. 49:63-75.
MCVEAN, G. A. T. and J. VIEIRA, 2001 Inferring parameters of mutation, selection and demography from patterns of synonymous site evolution in Drosophila. Genetics 157:245-257.
MORIYAMA, E. and J. R. POWELL, 1996 Intraspecific nuclear DNA variation in Drosophila. Mol. Biol. Evol. 13:261-277.
OHTA, T., 1992 The nearly neutral theory of molecular evolution. Annu. Rev. Ecol. Syst. 23:263-286.
POWELL, J. R., 1997 Progress and Prospects in Evolutionary Biology: The Drosophila Model. Oxford University Press, New York.
RILEY, M. A., S. R. KAPLAN, and M. VEUILLE, 1992 Nucleotide polymorphism at the xanthine dehydrogenase locus in Drosophila pseudoobscura. Mol. Biol. Evol. 9:56-69.
RODRIGUEZ-TRELLES, F., R. TARRIO, and F. J. AYALA, 1999 Switch in codon bias and increased rates of amino acid substitution in the Drosophila saltans species group. Genetics 153:339-350.
RODRIGUEZ-TRELLES, F., R. TARRIO, and F. J. AYALA, 2000 Fluctuating mutation bias and the evolution of base composition in Drosophila. J. Mol. Evol. 50:1-10.
ROZAS, J. and R. ROZAS, 1999 DnaSP 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174-175.
SKIBINSKI, D. O. F. and R. D. WARD, 1982 Correlations between heterozygosity and evolutionary rate of proteins. Nature 298:490-492.
TAJIMA, F., 1989 Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595.
TAKANO, T. S., 1998 Rate variation of DNA sequence evolution in the Drosophila lineages. Genetics 149:959-970.
TAKANO-SHIMIZU, T., 2001 Local changes in GC/AT substitution biases and in crossover frequencies on Drosophila chromosomes. Mol. Biol. Evol. 18:606-619.
WRIGHT, F., 1990 The effective number of codons used in a gene. Gene 87:23-39.
WRIGHT, S., 1931 Evolution in Mendelian populations. Genetics 16:97-159.
WRIGHT, S., 1932 The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proceedings of the Sixth International Congress in Genetics, Vol 1. pp. 356-366.