Multigeneration Maximum-Likelihood Analysis Applied to Mutation-Accumulation Experiments in Caenorhabditis elegans
Peter D. Keightley (a) and Thomas M. Bataillon (b)
(a) Institute of Cell, Animal and Population Biology, University of Edinburgh, Edinburgh EH9 3JT, Scotland
(b) Institut National de la Recherche Agronomique-Station de Génétique et Amélioration des Plantes, Montpellier, Domaine de Melgueil, F 34130 Mauguio, France
Corresponding author: Peter D. Keightley, Institute of Cell, Animal and Population Biology, University of Edinburgh, W. Mains Rd., Edinburgh EH9 3JT, Scotland. E-mail: p.keightley@ed.ac.uk
Communicating editor: A. G. CLARK
| ABSTRACT |
|---|
We develop a maximum-likelihood (ML) approach to estimate genomic mutation rates (U) and average homozygous mutation effects (s) from mutation-accumulation (MA) experiments in which phenotypic assays are carried out in several generations. We use simulations to compare the procedure's performance with the method of moments traditionally used to analyze MA data. Similar precision is obtained if mutation effects are small relative to the environmental standard deviation, but ML can give estimates of mutation parameters that have lower sampling variances than those obtained by the method of moments if mutations with large effects have accumulated. The inclusion of data from intermediate generations may improve the precision. We analyze life-history trait data from two Caenorhabditis elegans MA experiments. Under a model with equal mutation effects, the two experiments provide similar estimates for U of ~0.005 per haploid genome, averaged over traits. Estimates of s are more divergent and average -0.51 and -0.13 in the two studies. Detailed analysis shows that changes in the mean and variance of genetic values of MA lines in both C. elegans experiments are dominated by mutations with large effects, but the analysis does not rule out the presence of a large class of deleterious mutations with very small effects.
EXPERIMENTAL estimates of rates at which mutations occur in the genome and properties of distributions of mutation effects for fitness and other life-history traits are important for several questions in population and evolutionary biology, but have proved to be extremely difficult to obtain.
The traditional way to analyze MA experimental data is the Bateman-Mukai (BM) method of moments (BATEMAN 1959; MUKAI 1964). Under the assumption of equal mutation effects, the estimate of U is

$$U = \frac{\Delta M^2}{2 V_m} \qquad (1)$$

where ΔM is the rate of change of the mean trait value per generation and Vm is the mutational variance. This is estimated as one-half of the rate of increase in MA among-line variance per generation, VL. The corresponding estimate of s is

$$s = \frac{2 V_m}{\Delta M} \qquad (2)$$
If mutations have variable effects, (1) underestimates U and (2) overestimates s. The BM method uses only information obtained from the means and variances of the MA and control lines, but the data may contain additional information that could be used for estimating the mutation parameters. A more recent maximum-likelihood (ML) procedure attempts to use this additional information.
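As a quick illustration of the BM method of moments, the sketch below computes the two estimates from a per-generation change in mean and a per-generation increase in among-line variance. The function name `bateman_mukai` and the input values are hypothetical; the inputs are chosen so that they correspond exactly to U = 0.005 and s = -0.5 under the equal-effects model, in which the mean changes by Us and the among-line variance increases by Us² per generation.

```python
def bateman_mukai(delta_m, delta_vl):
    """Bateman-Mukai estimates under equal mutation effects.

    delta_m  : per-generation change in the mean trait value
    delta_vl : per-generation increase in among-line variance
    """
    v_m = delta_vl / 2.0                  # mutational variance: half the rate of increase
    u_hat = delta_m ** 2 / (2.0 * v_m)    # Equation 1
    s_hat = 2.0 * v_m / delta_m           # Equation 2
    return u_hat, s_hat


# Hypothetical inputs implied by U = 0.005 and s = -0.5 (not data from the article).
u_true, s_true = 0.005, -0.5
u_hat, s_hat = bateman_mukai(u_true * s_true, u_true * s_true ** 2)
print(u_hat, s_hat)
```

With noise-free moments the estimator returns the generating values, which is the sense in which the BM method and the equal-effects ML model share the same assumptions.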
A drawback of the ML method is that it is quite complex, and it has to date been implemented only for cases with a single MA generation plus a control line. In this article we extend the ML approach to analyze data from experiments with an arbitrary number of generations, and thereby make use of all the available information, including covariances between phenotypic values for the same lines at different generations. At present, our multigeneration ML method is restricted to the case of equal mutation effects, i.e., the same model as assumed by the BM method. However, the method allows the comparison of results for different experiments under the same model. We investigate the properties of the method by Monte Carlo simulation and compare its precision to the BM approach. Finally, we use the multigeneration ML procedure to analyze data on life-history traits from two recently published MA experiments with the wild-type N2 strain of Caenorhabditis elegans (KEIGHTLEY and CABALLERO 1997; VASSILIEVA and LYNCH 1999).
| MATERIALS AND METHODS |
|---|
Likelihood framework for several generations:
In this section we derive the likelihood function that is appropriate when data from several generations are jointly used to estimate genome-wide mutation parameters, by extending a previously developed method.
Let Zk,tj denote the phenotypic value of line k assayed after tj generations of mutation accumulation. We assume that the number of mutations fixed in the homozygous state in the line is Poisson distributed with mean λj = Utj. The mutations are assumed to have a constant additive effect denoted by s. We assume that environmental effects are normally distributed with variance Ve. The phenotypic value is, therefore,

$$Z_{k,t_j} = M + s x + e \qquad (3)$$

where M is the ancestral mean, x is a Poisson deviate with parameter λj, and e is a Gaussian deviate with mean zero and variance Ve. The likelihood associated with a single line k observed at generations, say, t1 and t1 + t2 is

$$L_k = \sum_{i=0}^{\infty} p_{t_1}(i)\, f(Z_{k,t_1} - s i) \sum_{j=0}^{\infty} p_{t_2}(j)\, f(Z_{k,t_1+t_2} - s(i + j)) \qquad (4)$$

where ptx(i) denotes the (Poisson) probability that the line has accumulated i new homozygous mutation(s) during the course of tx generations and f is the Gaussian probability density function with mean M and variance Ve. The overall likelihood is then obtained as $L = \prod_{k=1}^{N} L_k$, where N is the number of lines. Control line data can be included in the analysis by including appropriate terms in (4) with U set to zero.
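The two-generation likelihood of Equation 4 can be sketched directly as a pair of nested sums. This is a minimal illustration, not the authors' implementation: the helper names, the truncation of the infinite sums at `max_mut` mutations, and the numerical values are all assumptions made for the example.

```python
import math


def poisson_pmf(i, lam):
    """Poisson probability of i events with mean lam."""
    if lam == 0.0:
        return 1.0 if i == 0 else 0.0
    return math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))


def normal_pdf(z, mean, var):
    """Gaussian density with the given mean and variance, evaluated at z."""
    return math.exp(-(z - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def line_likelihood(z1, z2, t1, t2, U, s, M, Ve, max_mut=30):
    """Equation 4 for one line assayed at generations t1 and t1 + t2.

    The infinite sums are truncated at max_mut mutations per interval,
    which is an assumption of this sketch.
    """
    total = 0.0
    for i in range(max_mut + 1):          # mutations fixed during the first t1 generations
        inner = 0.0
        for j in range(max_mut + 1):      # further mutations fixed during the next t2 generations
            inner += poisson_pmf(j, U * t2) * normal_pdf(z2 - s * (i + j), M, Ve)
        total += poisson_pmf(i, U * t1) * normal_pdf(z1 - s * i, M, Ve) * inner
    return total


# Hypothetical line: ancestral mean 1.0, assayed after 25 and 50 generations.
example = line_likelihood(0.92, 0.78, 25, 25, U=0.01, s=-0.1, M=1.0, Ve=0.01)
print(example)
```

Setting U to zero reduces the expression to a product of two Gaussian densities, which is how control-line terms enter the overall likelihood.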
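This likelihood equation can be generalized to incorporate an arbitrary number of points in time at which the lines are assayed. Suppose that the set of MA lines is assayed at generations t1, t1 + t2, ..., t1 + t2 + ... + tT; then the likelihood associated with a line k is

$$L_k = \sum_{i_1=0}^{\infty} p_{t_1}(i_1)\, f(Z_{k,t_1} - s i_1) \sum_{i_2=0}^{\infty} p_{t_2}(i_2)\, f(Z_{k,t_1+t_2} - s(i_1 + i_2)) \cdots \sum_{i_T=0}^{\infty} p_{t_T}(i_T)\, f(Z_{k,t_1+\cdots+t_T} - s(i_1 + \cdots + i_T)) \qquad (5)$$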
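For an arbitrary number of assay points, the nested sums of Equation 5 can be evaluated without writing T explicit loops. The sketch below reorganizes them as a recursion over the cumulative number of mutations carried by a line; this rearrangement is an implementation choice of the example rather than something described in the text, and the function names, the truncation at `max_mut`, and the numbers are hypothetical.

```python
import math


def poisson_pmf(i, lam):
    """Poisson probability of i events with mean lam."""
    if lam == 0.0:
        return 1.0 if i == 0 else 0.0
    return math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))


def normal_pdf(z, mean, var):
    """Gaussian density with the given mean and variance, evaluated at z."""
    return math.exp(-(z - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def multigeneration_likelihood(z, t, U, s, M, Ve, max_mut=40):
    """Likelihood of one line assayed at T time points (Equation 5).

    z[j] is the phenotype at the j-th assay and t[j] the number of
    generations elapsed since the previous assay (or since generation 0).
    state[n] holds the joint weight of the data seen so far and a
    cumulative count of n mutations; counts above max_mut are ignored.
    """
    state = [1.0] + [0.0] * max_mut       # no mutations at generation 0
    for z_j, t_j in zip(z, t):
        spread = [0.0] * (max_mut + 1)
        for n_prev, weight in enumerate(state):
            if weight == 0.0:
                continue
            for added in range(max_mut + 1 - n_prev):
                spread[n_prev + added] += weight * poisson_pmf(added, U * t_j)
        # weight each cumulative count by the density of the observed value
        state = [w * normal_pdf(z_j - s * n, M, Ve) for n, w in enumerate(spread)]
    return sum(state)


# Hypothetical line assayed after 10, 20 and 50 generations.
lik = multigeneration_likelihood([0.98, 0.91, 0.80], [10, 10, 30],
                                 U=0.01, s=-0.1, M=1.0, Ve=0.01)
print(lik)
```

The cost grows linearly with the number of assay points and quadratically with `max_mut`, whereas the literal nested sums grow exponentially with the number of assay points.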
If fixed effects are to be fitted, such as assay or block effects for each point in time (t1, t1 + t2, ...), likelihood Equation 4 is modified as:

$$L_k = \sum_{i=0}^{\infty} p_{t_1}(i)\, f(Z_{k,t_1} - s i - b_{t_1}) \sum_{j=0}^{\infty} p_{t_2}(j)\, f(Z_{k,t_1+t_2} - s(i + j) - b_{t_1+t_2}) \qquad (6)$$

where bt1, bt1+t2 are fixed effects to be estimated jointly with U and s. Note that recorder effects could also be included by replacing the bt1, bt1+t2 terms in (6) with terms such as r1, r2, ... for each recorder. If several replicates are assayed to estimate the genetic value of a MA line, the likelihood equations can be modified easily to incorporate this detail.
Simulation protocol:
To compare the precision of the BM estimator and the ML multigeneration estimators, we carried out Monte Carlo simulations of MA experiments. To a good approximation, the size of a MA experiment can be characterized by its "heritability" at the line level, h2L = VL/(VEL + VL), where VEL is the environmental variance of line means. For the BM method, ΔM and Vm were estimated by regression of phenotypic means and variances on generation number.
In addition, we investigated the performance of the BM and ML procedures with data in which mutation effects are exponentially distributed; thus data are analyzed under the "wrong" model of equal effects. For each U, s combination simulated under the constant mutation effects model, we performed simulations with the same U and mean mutation effect s̄ = s, assuming an exponential distribution, and with VEL adjusted to achieve the same heritability at the line level.
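A Monte Carlo simulation of this kind can be sketched as follows, assuming the phenotype model of Equation 3 with either constant or exponentially distributed effects. The function names, the Poisson sampler (Knuth's multiplication method, adequate only for small means), the seed, and every parameter value are assumptions of this example and are not taken from the article.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw a Poisson deviate by Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_line(intervals, U, s, M, Ve, rng, exponential=False):
    """Phenotypes of one MA line assayed at the end of each interval.

    Mutations accumulate across intervals, so successive assays of the
    same line are correlated. With exponential=True the effects are
    exponentially distributed with mean s instead of constant.
    """
    genetic, values = 0.0, []
    for t in intervals:
        for _ in range(poisson_draw(U * t, rng)):
            # expovariate takes a rate, so 1/|s| gives a mean effect of |s|
            size = rng.expovariate(1.0 / abs(s)) if exponential else abs(s)
            genetic += math.copysign(size, s)
        values.append(M + genetic + rng.gauss(0.0, math.sqrt(Ve)))
    return values


rng = random.Random(2000)                 # arbitrary seed for reproducibility
lines = [simulate_line([50, 50], U=0.05, s=-0.1, M=1.0, Ve=0.01, rng=rng)
         for _ in range(5000)]
final_mean = sum(line[-1] for line in lines) / len(lines)
print(final_mean)                         # expected to be near M + U*s*100 = 0.5
```

Line means and among-line variances from such replicates can then be fed to both estimators to compare their sampling variances.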
Likelihood maximization:
ML maximization was carried out using the simplex algorithm (NELDER and MEAD 1965).
C. elegans data sets:
and decrease Û. Estimates of the Box-Cox exponent for productivity and longevity are nonsignificantly different from 1.
Analysis under a variable mutation-effects model:
A multigeneration ML procedure along the lines described above to estimate parameters of models with variable mutation effects was found to be computationally intractable at the present time. To test for evidence of variability among mutation effects in the C. elegans data, therefore, we analyzed the last generation of MA line data plus control data (line means) with a single-generation ML procedure. The fit of gamma distributions of mutation effects with shape parameter (β) ranging from equal mutation effects (β → ∞) to more leptokurtic distributions (β < 1) was compared.
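The role of the gamma shape parameter can be illustrated numerically. The sketch assumes the parameterization in which the mean effect is β/α, so that α acts as a rate; the article does not state this explicitly, so treat it as an assumption, along with the function name and the values used. For a gamma distribution the coefficient of variation is 1/√β and the excess kurtosis is 6/β, so both vanish as β grows and the distribution collapses onto a single effect size.

```python
import math


def gamma_summary(mean_effect, beta):
    """Rate, coefficient of variation and excess kurtosis of a gamma
    distribution with shape beta and the requested mean."""
    alpha = beta / mean_effect            # assumed parameterization: mean = beta / alpha
    cv = 1.0 / math.sqrt(beta)            # standard deviation divided by the mean
    excess_kurtosis = 6.0 / beta
    return alpha, cv, excess_kurtosis


# beta = 1 is the exponential distribution; large beta approaches equal effects.
for beta in (0.5, 1.0, 4.0, 100.0):
    alpha, cv, kurt = gamma_summary(0.1, beta)
    print(f"beta={beta:6.1f}  alpha={alpha:8.2f}  CV={cv:5.3f}  excess kurtosis={kurt:6.2f}")
```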
| RESULTS |
|---|
ML analysis of simulated MA experiments with equal mutation effects:
Simulation results are summarized in Table 1 for values of U and corresponding values of s such that VL remains constant and
= 5 (see MATERIALS AND METHODS). Over the range of parameter values simulated, BM and ML give mean estimates close to the simulated values. As expected, if the effect of each mutation is small compared to the environmental standard deviation of line means (σeL, 0.0566 in this case), ML and BM estimators perform similarly. However, if the effects of mutations are relatively large (s > σeL), the ML estimator makes more efficient use of the information available and can have a much smaller variance than the BM estimator (see variance of the estimators empirically determined through Monte Carlo simulations in Table 1). The difference in precision becomes smaller if the experiment is noisier (e.g.,
= 2; data not shown). Both ML and BM estimators can become unstable and give infinite variance among estimates if VL/VEL falls below ~1.
The variances of the estimates shown in Table 1 also suggest that an increase in precision can be obtained with the ML estimator (but much less so with the BM procedure) by including extra generations in the analysis. Again this suggests that ML makes more efficient use of the information available. The effect can increase dramatically as the number of generations assayed and/or the magnitude of mutation effect increases.
ML analysis of simulated MA experiments with variable mutation effects:
Simulation results are reported in Table 2. If data simulated under an exponential distribution of mutation effects model are analyzed under the assumption of constant effects, both methods give estimates of U (s) that are biased downward (upward) by a factor B. Empirically we find that B ≈ 1 + C², where C is the coefficient of variation of mutation effects. This was already known for the BM estimator.
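For the BM estimator the size of this bias follows from the first two moments of the effect distribution, which the short calculation below works through. It relies on the standard result that the second moment of a distribution equals its squared mean times one plus its squared coefficient of variation; the function name and parameter values are hypothetical. With exponentially distributed effects the coefficient of variation is 1, so the expected BM estimate of U is halved and that of s is doubled.

```python
def bm_expectation(U, mean_effect, cv):
    """Expected BM estimates when effects vary with coefficient of variation cv."""
    second_moment = mean_effect ** 2 * (1.0 + cv ** 2)   # E[a^2] for effect a
    delta_m = U * mean_effect            # per-generation change in the mean
    delta_vl = U * second_moment         # per-generation increase in among-line variance
    u_bm = delta_m ** 2 / delta_vl       # Equation 1 with Vm = delta_vl / 2
    s_bm = delta_vl / delta_m            # Equation 2 with Vm = delta_vl / 2
    return u_bm, s_bm


u_bm, s_bm = bm_expectation(U=0.01, mean_effect=-0.1, cv=1.0)   # exponential effects
print(u_bm, s_bm)    # U is underestimated and |s| overestimated by the same factor
```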
Multigeneration ML analysis of C. elegans MA experiments:
Data for productivity, r, and longevity of the ...
ML and BM estimates of genome-wide mutation rates and average mutation effects for three traits in VASSILIEVA and LYNCH's (1999) experiment are shown in Table 3. Estimates of s are scaled relative to the control population mean. Standard errors (SEs) of estimates are much smaller under ML than BM, a result we also obtained in the Monte Carlo simulation experiments, although the improvement in precision is larger than we expected on the basis of the simulations. Defining precision as the squared coefficient of variation of an estimate, ML is 24 times and 69 times more precise than BM, on average, for U and s, respectively.
An effect of freezing worms could influence the ... = -0.55.
Models with variable mutation effects:
The line mean data from the last MA generation along with the control data were analyzed by ML under the assumption that mutation effects are gamma distributed with scale and shape parameters α and β, respectively. In the analysis of the ... from such analyses are shown in Table 6. To simplify the interpretation of the results, the analysis was carried out for several β models including the equal effects model (β → ∞).
For productivity, somewhat surprisingly, in both experiments log likelihood decreases as the kurtosis of the assumed gamma distribution increases, i.e., the best-fitting gamma distribution is the limiting case of equal mutation effects with β → ∞. Distributions much more leptokurtic than a gamma distribution with shape parameter 1 (i.e., an exponential distribution) are inconsistent with the data on the basis of likelihood-ratio tests. For longevity and r, log likelihood for different β models changes nonsignificantly, so these data contain little information that can allow different distributions of mutation effects to be distinguished.
| DISCUSSION |
|---|
Simulation experiments:
Over the range of parameter values studied, if the data conform to the model assumed, the ML and BM procedures give mean parameter estimates close to the simulated parameter values. If mutation effects are relatively small there is little benefit from using ML over the BM method of moments. However, if an appreciable fraction of the genetic variance is contributed by mutations with relatively large effects, ML can produce estimates with substantially lower variances than BM. Presumably, the distribution of MA line values will depart from a normal distribution, and replicates within lines may consistently deviate, so there is information to be extracted from the line value distribution in addition to the first and second moments. Furthermore, individual lines will show "jumps" between generations, and again the ML procedure will use this information. Hence there is a benefit from including additional intermediate generations. The opposite conclusion was drawn by ...
We also explored the robustness of the constant mutation effects model by simulating data sets in which effects of mutation are exponentially distributed. As with the case of data simulated with equal mutation effects, the BM and ML procedures perform similarly if mutation effects are small relative to the environmental standard deviation. Both methods also show similar levels of bias in this situation. If average effects of mutations are large, ML tends to be less biased and shows a higher level of precision than BM, as measured either in terms of the among-estimate variance or mean square error. As with the case of equal mutation effects, the difference in precision between ML and BM increases as the number of intermediate generations included in the analysis increases, and the effect is more apparent for the mean mutation effect than for U.
Improvement of precision under ML in analysis of C. elegans data:
In both C. elegans MA experiments, ML estimates have considerably smaller sampling variances than corresponding BM estimates. The simulation studies suggest that an improvement in precision is to be expected in general (Table 1 and Table 2), due to a more efficient use of information, but the improvement turned out to be larger than we expected on the basis of the simulations. There are three factors that may account for this:
- In the experiments, there were lines that deviated by several standard deviations from the control means and probably carried mutations with large effects. Data of this sort lead to the greatest improvement of ML over BM.
- The fitting of assay effects, which are large and significant (Table 5), removes much of the noise that clouds the results from the regression analysis. This is probably the most important factor.
- The model of equal mutation effects appears to give a good fit to the data, at least in explaining the major effect mutations (Table 6), and the improvement in precision is expected to be greatest in this case.
C. elegans MA experiments:
The negative estimates of s for productivity are in line with expectation and are in accord with the negative estimates for the mean mutation effect on r, a highly correlated trait. For longevity, ... ΔM) and estimates for U and s of 0.064 and -0.048, respectively. However, due to a discrepancy caused by a single data point (generation 49 for the MA lines), our regression estimate of ΔM is about two-thirds of VASSILIEVA and LYNCH's (1999), and our BM estimates of U and s are consequently about 2 times smaller and 1.5 times greater, respectively (Table 3). The data file provided to us contains the most meaningful measure of longevity, and furthermore the level of mutational decay for longevity observed up to generation 49 has not been seen in later generations (M. LYNCH and L. VASSILIEVA, personal communication). ML estimates of U and s for longevity are 0.0040 and -0.26, respectively. In terms of mutational target sizes, the conclusion from the two MA experiments taken together is r
productivity > longevity.
Overall, the ML estimates for the two C. elegans MA experiments agree with one another reasonably well (Table 3 and Table 4). Taking an average over traits, estimates of U per haploid genome are 0.0041 ...
The ML estimates of the mutation effect parameter are surprisingly high, particularly for productivity and r in the case of VASSILIEVA and LYNCH's (1999) data set, and may be influenced by extreme lines with low mean fitness. Visual inspection of the data suggested this to be the case: a minority of lines had consistently low fitness across several generations. A scatter plot of the rank of the line means for productivity at the last two generations (30 and 49) gives an indication of the extent of contribution of such extreme lines (Figure 2). Over most of the plot, points seem to be distributed at random, suggesting little covariance between rank across generations, but there is a deficit of points at the left-hand and lower edges along with the suggestion of an excess of lines that rank low at both generations. We further investigated the contribution of the low-ranking lines to the U and s estimates by excluding subsets of extreme lines with mean phenotype <50% of the control population mean.
Nature of mutational variability for life-history traits in C. elegans:
Taking an average over traits, the ML analysis of VASSILIEVA and LYNCH's (1999) data provides an estimate for U more than five times smaller than the corresponding average BM estimate. By the criterion of comparing Drosophila melanogaster and C. elegans on the basis of the sizes of their genomes (measured by the number of base pairs), and taking into account the lower number of germ line cell divisions in C. elegans than D. melanogaster, ...
The estimates for U we have obtained for C. elegans are extremely small. However, most of the analysis here has assumed that mutations have equal effects and produces an estimate of an "effective" number of mutations similar to the effective number of loci influencing a quantitative trait that can be estimated from line crosses.
| ACKNOWLEDGMENTS |
|---|
We thank Mike Lynch and Larissa Vassilieva for kindly providing the data from their mutation-accumulation experiment. We are grateful to Deborah Charlesworth, Esther Davies, Bruno Goffinet, Bill Hill, Armando Caballero, Mark Kirkpatrick, Brian Charlesworth, Ruth Shaw, Dan Schoen, Mike Lynch, and an anonymous referee for helpful comments on the manuscript. T.B. acknowledges support from the European Science Foundation (Plant Adaptation Program) and the French Institut National de la Recherche Agronomique. P.K. acknowledges support from the Royal Society of London.
Manuscript received February 22, 1999; Accepted for publication November 23, 1999.
| LITERATURE CITED |
|---|
ASHBURNER, M., S. MISRA, J. ROOTE, S. E. LEWIS, and R. BLAZEJ et al., 1999 An exploration of the sequence of a 2.9-Mb region of the genome of Drosophila melanogaster: the Adh region. Genetics 153:179-219.
BATEMAN, A. J., 1959 The viability of near-normal irradiated chromosomes. Int. J. Radiat. Biol. 1:170-180.
CHARLESWORTH, B., 1994 Evolution in Age Structured Populations, Ed. 2. Cambridge University Press, Cambridge, United Kingdom.
CROW, J. F., and M. J. SIMMONS, 1983 The mutation load in Drosophila, pp. 1-35 in The Genetics and Biology of Drosophila, Vol. 3C, edited by M. ASHBURNER, H. L. CARSON and J. N. THOMPSON. Academic Press, London.
DAVIES, E. K., A. D. PETERS, and P. D. KEIGHTLEY, 1999 High frequency of cryptic deleterious mutations in Caenorhabditis elegans. Science 285:1748-1751.
DENG, H.-W., and Y.-X. FU, 1998 On the three methods for estimating deleterious genomic mutation parameters. Genet. Res. 71:223-236.
DENG, H.-W., J. LI, and J.-L. LI, 1999 On the experimental design and data analysis of mutation accumulation experiments. Genet. Res. 73:147-164.
EIDE, D., and P. ANDERSON, 1985 The gene structures of spontaneous mutations affecting a Caenorhabditis elegans myosin heavy-chain gene. Genetics 109:67-79.
FALCONER, D. S., and T. F. C. MACKAY, 1996 Introduction to Quantitative Genetics, Ed. 4. Longman Scientific and Technical, Harlow, Essex, United Kingdom.
FRY, J. D., P. D. KEIGHTLEY, S. L. HEINSOHN, and S. V. NUZHDIN, 1999 New estimates of the rates and effects of mildly deleterious mutation in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 96:574-579.
GARCIA-DORADO, A., 1997 The rate and effects distribution of viability mutation in Drosophila: minimum distance estimation. Evolution 51:1130-1139.
GARCIA-DORADO, A., C. LOPEZ-FANJUL, and A. CABALLERO, 1999 Properties of spontaneous mutations affecting quantitative traits. Genet. Res. (in press).
GENSTAT 5 COMMITTEE, 1993 Genstat 5 Release 3 Reference Manual. Clarendon Press, Oxford.
JOHNSON, T. E., and E. W. HUTCHINSON, 1993 Absence of strong heterosis for life span and other life history traits in Caenorhabditis elegans. Genetics 134:465-474.
KEIGHTLEY, P. D., 1994 The distribution of mutation effects on viability in Drosophila melanogaster. Genetics 138:1315-1322.
KEIGHTLEY, P. D., 1996 Nature of deleterious mutation load in Drosophila. Genetics 144:1993-1999.
KEIGHTLEY, P. D., 1998 Inference of genome-wide mutation rates and distributions of mutation effects for fitness traits: a simulation study. Genetics 150:1283-1293.
KEIGHTLEY, P. D., and A. CABALLERO, 1997 Genomic mutation rates for lifetime reproductive output and lifespan in Caenorhabditis elegans. Proc. Natl. Acad. Sci. USA 94:3823-3827.
KEIGHTLEY, P. D., and A. EYRE-WALKER, 1999 Terumi Mukai and the riddle of deleterious mutation rates. Genetics 153:515-523.
KIBOTA, T. T., and M. LYNCH, 1996 Estimate of the genomic mutation rate deleterious to overall fitness in E. coli. Nature 381:694-696.
LANDE, R., 1995 Mutation and conservation. Conserv. Biol. 9:782-791.
LYNCH, M., and W. G. HILL, 1986 Phenotypic evolution and neutral mutation. Evolution 40:915-935.
LYNCH, M., and B. WALSH, 1998 Genetics and Analysis of Quantitative Traits. Sinauer Associates, Sunderland, MA.
LYNCH, M., J. CONERY, and R. BURGER, 1995 Mutation accumulation and the extinction of small populations. Am. Nat. 146:489-518.
LYNCH, M., J. BLANCHARD, D. HOULE, T. KIBOTA, and S. SCHULTZ et al., 1999 Perspective: spontaneous deleterious mutation. Evolution 53:645-663.
MUKAI, T., 1964 The genetic structure of natural populations of Drosophila melanogaster. I. Spontaneous mutation rate of polygenes controlling viability. Genetics 50:1-19.
MUKAI, T., S. I. CHIGUSA, L. E. METTLER, and J. F. CROW, 1972 Mutation rate and dominance of genes affecting viability in Drosophila melanogaster. Genetics 72:335-355.
NELDER, J. A., and R. MEAD, 1965 A simplex method for function minimization. Comput. J. 7:308-313.
OHNISHI, O., 1977 Spontaneous and ethyl methanesulfonate-induced mutations controlling viability in Drosophila melanogaster. II. Homozygous effect of polygenic mutations. Genetics 87:529-545.
SIMMEN, M. W., S. LEITGEB, V. H. CLARK, S. J. M. JONES, and A. BIRD, 1998 Gene number in an invertebrate chordate, Ciona intestinalis. Proc. Natl. Acad. Sci. USA 95:4437-4440.
SIMMONS, M. J., and J. F. CROW, 1977 Mutations affecting fitness in Drosophila populations. Annu. Rev. Genet. 11:49-78.
SOKAL, R. R., and F. J. ROHLF, 1995 Biometry, Ed. 3. W. H. Freeman, New York.
VASSILIEVA, L., and M. LYNCH, 1999 The rate of spontaneous mutation for life-history traits in Caenorhabditis elegans. Genetics 151:119-129.
WEIR, B. S., 1996 Genetic Data Analysis II. Sinauer, Sunderland, MA.