Genetics, Vol. 153, 825-835, October 1999, Copyright © 1999

The Genetic Analysis of Age-Dependent Traits: Modeling the Character Process

Scott D. Pletcher (a, b) and Charles J. Geyer (b)
(a) Department of Ecology, Evolution and Behavior, University of Minnesota, Saint Paul, Minnesota 55108
(b) School of Statistics, University of Minnesota, Saint Paul, Minnesota 55108

Corresponding author: Scott D. Pletcher, Max Planck Institute for Demographic Research, Doberaner Str. 114, D-18057 Rostock, Germany. E-mail: pletcher{at}demogr.mpg.de

Communicating editor: A. G. CLARK


*  ABSTRACT

The extension of classical quantitative genetics to deal with function-valued characters (also called infinite-dimensional characters) such as growth curves, mortality curves, and reaction norms, was begun by Kirkpatrick and co-workers. In this theory, the analogs of variance components for single traits are covariance functions for function-valued traits. In the approach presented here, we employ a variety of parametric models for covariance functions that have a number of desirable properties: the functions (1) are positive definite, (2) can be estimated using procedures like those currently used for single traits, (3) have a small number of parameters, and (4) allow simple hypotheses to be easily tested. The methods are illustrated using data from a large experiment that examined the effects of spontaneous mutations on age-specific mortality rates in Drosophila melanogaster. Our methods are shown to work better than a standard multivariate analysis, which assumes the character value at each age is a distinct character. Advantages over existing methods that model covariance functions as a series of orthogonal polynomials are discussed.


SINCE the introduction of quantitative genetics theory and methods to the study of evolution, a tremendous body of literature has developed, documenting patterns of quantitative genetic variation within and between species for a wide variety of continuous characters (BARTON and TURELLI 1989; FALCONER 1989; LYNCH and WALSH 1998). Evolutionary biologists use this information to predict how a population might respond to natural or artificial selection and to provide insight into the contributions of the various evolutionary processes to the levels of genetic variation seen in natural populations (LANDE 1979, 1982; HOULE 1992). Empirical estimates of genetic variances in single traits and genetic covariances between traits have contributed greatly to our knowledge of the evolution of biological characters.

Classical quantitative genetics theory covers the analysis of a single quantitative trait, such as bristle number in Drosophila, or at most a few traits. However, many interesting characters are inherently too complex to be described by classical theory. Most often this is because it is difficult to describe the character of interest by a single value. Examples can be found in the field of life history evolution, where traits change over the lifetime of an individual. In fact, in many cases it is the change of the character with age that is the primary interest (HUGHES and CHARLESWORTH 1994; PROMISLOW et al. 1996; PLETCHER et al. 1998).

Function-valued traits are characters that change as a function of some independent and continuous variable. More specifically, a function-valued trait is a function x(t). In all of the work that has been done on function-valued traits, including ours, both the independent variable t and the dependent variable x(t) are single valued. These traits have also been called infinite-dimensional traits (KIRKPATRICK and HECKMAN 1989) because the character can take on a value at an infinite number of ages. In principle, there is no reason why our methods or those of other workers in this area cannot be extended to allow t or x(t) or both to be multivariate. For the case of univariate t and x(t), we think "function valued" is the more descriptive term. It avoids confusion with characters that are described by a multidimensional t or x(t). For specificity, we always refer to the independent variable t as time or age, although there is no reason why it cannot be any continuous variable.

In cases where the functional nature of the trait is of interest, classical methods are often employed by treating arbitrary, discrete age intervals as unique characters in a multivariate analysis (HUGHES and CHARLESWORTH 1994; PROMISLOW et al. 1996; TATAR et al. 1996; PLETCHER et al. 1998). This approach is problematic. As the number of ages of interest increases, the ability to produce precise estimates of statistical parameters is rapidly lost (SHAW 1987, 1991). In addition, when measurements are taken at irregular intervals, one might reasonably expect the trait to be more similar between ages separated by a short time than between more disparate ages. A standard variance component analysis ignores this type of information.

Recognizing the limits of the classical approach, KIRKPATRICK and HECKMAN (1989) formulated a quantitative genetic model for function-valued traits, which has since served as the foundation for numerous theoretical and experimental investigations in this area. On the theoretical side, age-specific selection on a character and its interactions with genetic constraints have received considerable attention (KIRKPATRICK et al. 1990; KIRKPATRICK and LOFSVOLD 1992). The evolution of reaction norms over continuous environments has also been studied (GOMULKIEWICZ and KIRKPATRICK 1992). On the experimental side, estimates of genetic variation for age-dependent growth patterns in birds (GEBHARDT-HENRICH and MARKS 1993; BJORKLUND 1997), mice (KIRKPATRICK et al. 1990; MEYER and HILL 1997), and livestock (KIRKPATRICK 1997) have been published. Moreover, the recent interest in age-specific components of genetic variation for other life-history characters (ENGSTROM et al. 1989; HOULE et al. 1994; HUGHES and CHARLESWORTH 1994; PROMISLOW et al. 1996; PLETCHER et al. 1998) suggests that interest in function-valued traits is growing.

A quantitative genetics theory for function-valued traits is a straightforward extension to standard methodology. Classical quantitative genetics partitions an observable trait as

$$x = \mu + g + e \tag{1}$$

where µ is the mean (fixed effect) and g and e are the genetic and environmental components (random effects). Assuming no gene-environment interaction, g and e are independent, hence

$$\operatorname{Var}(x) = \operatorname{Var}(g) + \operatorname{Var}(e).$$

If $x_i$, etc. denote the effects for individual i, the simplest assumptions are that $e_i$ and $e_j$ are uncorrelated if $i \neq j$ and that $\operatorname{Cov}(g_i, g_j)$ is proportional to the coefficient of relationship of i and j (FALCONER 1989, pp. 111 ff., especially p. 156). Making a matrix A of the coefficients of relationship (the so-called numerator relationship matrix) allows us to write the matrix equation

$$\operatorname{Var}(\mathbf{x}) = \sigma^2_g \mathbf{A} + \sigma^2_e \mathbf{I},$$

where I is the identity matrix and $\sigma^2_g$ and $\sigma^2_e$ are two parameters to be estimated, the genetic and environmental variances.
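As an illustrative sketch (not part of the original analysis), the single-trait covariance structure can be assembled numerically, assuming the matrix equation has the form Var(x) = σ²g A + σ²e I. The three-individual pedigree and the two variance values below are hypothetical.

```python
import numpy as np

# Hypothetical toy pedigree: individuals 0 and 1 are full sibs and
# individual 2 is their paternal half sib. Entries are coefficients
# of relationship (1 with self, 1/2 full sibs, 1/4 half sibs).
A = np.array([
    [1.00, 0.50, 0.25],
    [0.50, 1.00, 0.25],
    [0.25, 0.25, 1.00],
])

sigma2_g = 0.6  # assumed genetic variance (illustrative value)
sigma2_e = 0.4  # assumed environmental variance (illustrative value)

# Covariance matrix of the trait vector across individuals.
V = sigma2_g * A + sigma2_e * np.eye(A.shape[0])
print(V)
```

Only two parameters enter V, whatever the number of individuals; the pedigree supplies the rest of the structure.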

More complex genetic models partition the genetic effect into additive, dominance, and other effects (LYNCH and WALSH 1998). All the theory and examples in this article consider only additive models. Extension of our methods to include dominance and other effects is theoretically straightforward (though no doubt some practical difficulties will arise).

When more than one trait is modeled, we have covariances among traits as well as among individuals (SHAW 1987, 1991). If $x_{ij}$, etc., now denote the effects for individual i and trait j, the simplest assumptions are now

$$\operatorname{Cov}(e_{ij}, e_{kl}) = \delta_{ik}\,\epsilon_{jl} \tag{2a}$$

where $\delta_{ik}$ are the elements of the identity matrix ($\delta_{ik} = 1$ if i = k, and $\delta_{ik} = 0$ otherwise), and

$$\operatorname{Cov}(g_{ij}, g_{kl}) = r_{ik}\,\gamma_{jl} \tag{2b}$$

where the $r_{ik}$ are the coefficients of relationship (elements of the A matrix) and the $\gamma_{jl}$ and $\epsilon_{jl}$ are parameters to be estimated. Making matrices G and E with elements $\gamma_{jl}$ and $\epsilon_{jl}$ allows us to write the matrix equation

$$\operatorname{Var}(\mathbf{x}) = \mathbf{A} \otimes \mathbf{G} + \mathbf{I} \otimes \mathbf{E} \tag{3}$$

where $\otimes$ denotes the Kronecker product of matrices (SEARLE et al. 1992, pp. 443 ff.) and x is a vector containing all data on all individuals in the order $x_{11}, x_{12}, \ldots, x_{21}, x_{22}, \ldots$ The matrices G and E are symmetric m × m matrices if there are m traits, and each has m(m + 1)/2 independent parameters. Statistical inference about the G matrix and the constraints it imposes on the dynamics of phenotypic evolution is the primary interest in these analyses (LANDE 1979, 1982).
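The Kronecker structure of Equation 3 can be checked numerically. The sketch below assumes the individual-major ordering of x given in the text, so that the covariance takes the form A ⊗ G + I ⊗ E; all matrix entries are hypothetical.

```python
import numpy as np

# Hypothetical inputs: two full sibs and m = 2 traits.
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])   # coefficients of relationship
G = np.array([[0.5, 0.2],
              [0.2, 0.3]])   # genetic covariances among traits
E = np.array([[0.4, 0.1],
              [0.1, 0.6]])   # environmental covariances among traits

n, m = A.shape[0], G.shape[0]

# Covariance of x = (x11, x12, x21, x22): individuals vary slowest.
V = np.kron(A, G) + np.kron(np.eye(n), E)

# Element-wise check against the assumptions behind Equation 3:
# Cov(x_ij, x_kl) = r_ik * gamma_jl + delta_ik * epsilon_jl
i, j, k, l = 0, 1, 1, 0
direct = A[i, k] * G[j, l] + (1.0 if i == k else 0.0) * E[j, l]
from_kron = V[i * m + j, k * m + l]

# Free parameters in each symmetric m x m matrix.
n_params = m * (m + 1) // 2
print(direct, from_kron, n_params)
```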

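Function-valued traits add an additional level of complication. Now for individual i the trait is a function $x_i(t)$ of the continuous variable t. Equation 2a and Equation 2b are replaced by

$$\operatorname{Cov}\{e_i(s), e_k(t)\} = \delta_{ik}\,E(s, t) \tag{4a}$$

$$\operatorname{Cov}\{g_i(s), g_k(t)\} = r_{ik}\,G(s, t) \tag{4b}$$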

The primary interest in analyses of function-valued traits is statistical inference about the "G function," G(s, t), also called the additive genetic covariance function. The "E function," E(s, t), also called the environmental covariance function, is of lesser interest.

In practice, data are only observed at a finite set of times $t_1, \ldots, t_m$, rather than a continuum, so we have only a finite set of data on each individual, which we can consider as a multivariate trait vector $x_i(t_1), \ldots, x_i(t_m)$. Although in theory the trait has a continuous G function, in practice the covariance structure is described by a "G matrix." The elements of the G matrix are genetic covariances between the trait measured at different ages. The key idea here is that the elements of the G matrix do not consist of unique parameters for all variances and covariances. Instead, all elements of this matrix are obtained from a single G function. Thus, the finite dimensional G matrix for the character process model has elements defined by $\gamma_{jl} = G(t_j, t_l)$. A similar argument applies for the "E matrix." Given the new parameterization of the G and E matrices, Equation 3 again describes the variance of the observed phenotype considered as a multivariate trait vector $x_i(t_j)$.
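A minimal sketch of this construction follows. It borrows the functional form quoted in the caption of Figure 1, G(t1, t2) = 0.5 exp{-0.7 (t1 - t2)²}; the irregular observation times are hypothetical.

```python
import numpy as np

def g_function(s, t, theta0=0.5, theta1=0.7):
    """Covariance function of the form used for Figure 1 (case I)."""
    return theta0 * np.exp(-theta1 * (s - t) ** 2)

# Hypothetical, irregularly spaced observation ages.
times = np.array([1.0, 2.0, 4.0, 7.0, 11.0])

# G matrix: gamma_jl = G(t_j, t_l), evaluated on all pairs of ages.
G = g_function(times[:, None], times[None, :])

m = len(times)
n_unstructured = m * (m + 1) // 2   # parameters in an unstructured G matrix
n_function = 2                      # theta0 and theta1
print(G.round(3))
print(n_unstructured, "vs", n_function)
```

Adding further observation ages enlarges the G matrix but leaves the number of parameters in the G function unchanged.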

Is that all there is to function-valued traits? It appears as though we have simply redefined the problem. Although in principle there is a G function G(s, t), in practice there is only a G matrix $G(t_j, t_l)$. Is anything new introduced by talking about function-valued traits? The answer is "yes," because classical multivariate methods run into intractable difficulties when there are many traits. Even five traits are trouble (SHAW et al. 1995; SHAW and GEYER 1997). Function-valued traits are often observed at many times (or many values of t if t is not time), too many for classical multivariate quantitative genetics to cope with.

Some new idea has to be added to manage the parameter explosion: m(m + 1)/2 parameters to estimate in the genetic covariance matrix alone if data are observed at m times. In the theory of function-valued characters, the number of parameters in the finite dimensional G matrix is equal to the number of parameters in the G function, which is independent of the number of ages examined, and the task is to model and estimate the G function. There are two possible approaches: parametric and nonparametric. This article explores the use of parametric models for the G function. Kirkpatrick and co-workers and followers use an approach that is nonparametric in spirit, although for most experimental designs it is missing some important features that one expects in a nonparametric statistical method.

In the following sections we provide a brief review of the seminal work in this area, while focusing on the differences between previous work and our own. We present representative examples from an extensive series of simulations in which we compared our approach with those suggested previously. We then illustrate the various techniques using real data on mortality rates in female Drosophila. Last, we summarize some of the benefits of our character process model over previous methods and suggest promising avenues for future theoretical development.


*  GENERAL CONSIDERATIONS

The probabilistic framework for modeling a function-valued trait is based on the theory of stochastic processes. A stochastic process can be defined as a set of random variables X(t), $t \in T$, where T is a subset of the real line termed the time parameter set (HOEL et al. 1972). A specific realization of a process (i.e., the values of the random variables at each t) is called a sample path of that process. We are interested in processes with finite variance, i.e., for which $E\{X(t)^2\} < \infty$, the so-called second-order processes. In such cases, we can define a mean function of the process by

$$\mu(t) = E\{X(t)\} \tag{5}$$

and a covariance function of the process by

$$\operatorname{Cov}\{X(s), X(t)\} = E\{[X(s) - \mu(s)][X(t) - \mu(t)]\} \tag{6}$$

Equation 5 is the function describing how the expected value of the character changes with age, and (6) describes the covariance between the character at two separate ages. The covariance function must be nonnegative definite, that is, for any finite set of times $(t_1, \ldots, t_N)$ and any real numbers $(b_1, \ldots, b_N)$,

$$\sum_{i=1}^{N} \sum_{j=1}^{N} b_i b_j \operatorname{Cov}\{X(t_i), X(t_j)\} \geq 0 \tag{7}$$
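The nonnegative definiteness condition, that every quadratic form built from the covariances at a finite set of times is nonnegative, is equivalent to the covariance matrix at those times having no negative eigenvalues. The sketch below checks this numerically. The function name `is_nonnegative_definite` and both candidate functions are illustrative choices, not taken from the article.

```python
import numpy as np

def is_nonnegative_definite(C, tol=1e-10):
    """True if the symmetric matrix C has no eigenvalue below -tol."""
    C = np.asarray(C, dtype=float)
    if not np.allclose(C, C.T):
        return False
    return bool(np.linalg.eigvalsh(C).min() >= -tol)

# A valid candidate: covariance declining smoothly with age separation.
times = np.arange(1.0, 11.0)
d = times[:, None] - times[None, :]
C_good = 0.5 * np.exp(-0.7 * d ** 2)

# An invalid candidate: a "boxcar" that equals 1 for ages at most
# 1.5 apart and 0 otherwise, evaluated at ages 1, 2, 3.
t3 = np.array([1.0, 2.0, 3.0])
d3 = t3[:, None] - t3[None, :]
C_box = np.where(np.abs(d3) <= 1.5, 1.0, 0.0)

# The quadratic form with b = (1, -1, 1) is negative for the boxcar.
b = np.array([1.0, -1.0, 1.0])
quad_form = b @ C_box @ b

print(is_nonnegative_definite(C_good), is_nonnegative_definite(C_box), quad_form)
```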

Most quantitative genetics theory is based on the assumption that the character of interest or some transformation of it is normally distributed (LYNCH and WALSH 1998). This assumption can be extended to a character process by utilizing the theory of Gaussian processes (HOEL et al. 1972; KIRKPATRICK and HECKMAN 1989). A stochastic process X(t), $t \in T$, is called a Gaussian process if the vector $(X(t_1), X(t_2), \ldots, X(t_m))$ has a multivariate normal distribution for every choice of times $t_1, \ldots, t_m$ (HOEL et al. 1972). As with any Gaussian random variable, the distribution of a Gaussian process is completely determined by its mean and covariance function.

Using the language of Gaussian processes, we can now complete our description of quantitative genetics for function-valued traits. We assume the observed phenotypic character process X(t) is a Gaussian process and can be decomposed analogous to (1) as

$$X(t) = \mu(t) + g(t) + e(t) \tag{8}$$

where µ(t) is a nonrandom function, the mean function of X(t), and g(t) and e(t) are mean-zero Gaussian processes that are independent of each other and have covariance functions G(s, t) and E(s, t), respectively. By the independence of g(t) and e(t), the covariance function of X(t) is given by P(s, t) as

$$P(s, t) = G(s, t) + E(s, t) \tag{9}$$

Each individual has a different realization of the character processes X(t), g(t), and e(t). The covariance of the processes for different individuals we have already derived as (4a) and (4b).
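The decomposition can be illustrated by simulation. The sketch below draws realizations of g(t) and e(t) at a finite set of ages for unrelated individuals and confirms that the empirical phenotypic covariance approaches G + E. The mean function, both covariance functions, and the sample size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

times = np.arange(1.0, 6.0)
d = times[:, None] - times[None, :]

mu = 0.1 * times                    # hypothetical mean function mu(t)
G = 0.5 * np.exp(-0.3 * d ** 2)     # hypothetical G function on the grid
E = 0.2 * np.eye(len(times))        # hypothetical E function (independent noise)

# Each row is one individual's realization of the process at the
# observed ages; g and e are drawn independently of each other.
n = 20000
g = rng.multivariate_normal(np.zeros(len(times)), G, size=n)
e = rng.multivariate_normal(np.zeros(len(times)), E, size=n)
X = mu + g + e

# The empirical phenotypic covariance should be close to P = G + E.
P_hat = np.cov(X, rowvar=False)
max_err = np.abs(P_hat - (G + E)).max()
print(max_err)
```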

Thus the character process approach, also called function-valued quantitative genetics, can be simply but briefly described as replacing the Gaussian random variables or random vectors of classical quantitative genetics by Gaussian stochastic processes and proceeding mutatis mutandis. What we have described so far includes all approaches to function-valued quantitative genetics: that of Kirkpatrick and co-workers, that of MEYER and HILL (1997), and ours. The differences are in how the G and E functions are modeled and in how the models are fitted to data.


*  NONPARAMETRICS AND ORTHOGONAL POLYNOMIALS

In the approaches of Kirkpatrick and co-workers and of Meyer and Hill, the G and E functions are modeled by a linear combination of orthogonal Legendre polynomials

$$G(s, t) = \sum_{i=0}^{m} \sum_{j=0}^{m} k_{ij}\,\phi_i(s)\,\phi_j(t) \tag{10}$$

where G is the covariance function, m determines the number of polynomial terms used in the model, $k_{ij}$ are unknown parameters to be estimated (the coefficients of the linear combination), and $\phi_i$ is the ith Legendre polynomial (KIRKPATRICK and HECKMAN 1989; KIRKPATRICK et al. 1990). A similar model is used for the E function.
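A sketch of how such a polynomial covariance function can be evaluated is given below. It assumes that ages are rescaled to [-1, 1] and that normalized Legendre polynomials are used, as is common in this literature; the helper names and both coefficient matrices are hypothetical. The second coefficient matrix shows that a symmetric matrix of coefficients with positive diagonal does not by itself guarantee a nonnegative definite covariance matrix.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_basis(times, n_poly):
    """Normalized Legendre polynomials evaluated at ages rescaled to [-1, 1]."""
    t = np.asarray(times, dtype=float)
    x = -1.0 + 2.0 * (t - t.min()) / (t.max() - t.min())
    P = legendre.legvander(x, n_poly - 1)            # columns P_0 .. P_{n_poly-1}
    norm = np.sqrt((2 * np.arange(n_poly) + 1) / 2.0)
    return P * norm

def op_covariance(times, K):
    """G matrix implied by the coefficient matrix K: Phi K Phi^T."""
    Phi = legendre_basis(times, K.shape[0])
    return Phi @ K @ Phi.T

times = np.arange(1.0, 11.0)

K_ok = np.array([[0.5, 0.1],
                 [0.1, 0.2]])    # positive definite coefficients
K_bad = np.array([[0.5, 0.4],
                  [0.4, 0.2]])   # symmetric but indefinite coefficients

min_eig_ok = np.linalg.eigvalsh(op_covariance(times, K_ok)).min()
min_eig_bad = np.linalg.eigvalsh(op_covariance(times, K_bad)).min()
print(min_eig_ok, min_eig_bad)
```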

Kirkpatrick and co-workers used fitting procedures that are no longer recommended, being superseded by the methods of MEYER and HILL (1997), who used restricted maximum likelihood (REML). Meyer and Hill estimated the parameters of the model (i.e., the $k_{ij}$ in Equation 10) for each model with a fixed set of Legendre polynomials, which corresponds to fixing m in (10). They then used likelihood-ratio tests to determine a value of m that adequately fits the data.

We have no argument with model fitting by maximum likelihood (ML) or REML, but we propose a different way of modeling G and E functions. Covariance functions modeled with Legendre polynomials (or other orthogonal polynomials) have a number of potential drawbacks.

  1. They are not automatically positive semidefinite. Although constrained ML or REML can be used to impose this condition, this greatly complicates hypothesis testing and other statistical procedures.

  2. Legendre polynomials have no theoretical justification other than being one among many sets of orthogonal basis functions.

  3. Polynomials do not fit covariance functions well. Polynomials of high degree are extremely "wiggly" and do not have asymptotes. Sensible covariance functions are extremely smooth and typically satisfy

    $G(s, t) \to 0$ as $|s - t| \to \infty$ (an asymptote).

  4. For the majority of genetic studies, trying to be nonparametric about the covariance function of an unobservable stochastic process may be optimistic. In time-series analysis and spatial statistics, where the stochastic process is observed directly, the most successful methods use parametric models [e.g., autoregressive integrated moving average (ARIMA) modeling of time series and variogram estimation in spatial statistics]. Experience in spatial statistics shows that the behavior of the covariance function at points closely related in time determines most of the behavior of the process, and it is difficult to distinguish different behaviors in the tails of the covariance function (CRESSIE 1993, section 3.2.1). It is even more difficult if the stochastic process is unobserved like the genetic and environmental processes in quantitative genetics. For realistic experimental designs, there is not enough information in the data for good nonparametric estimation.

  5. Polynomial models for covariance functions often have a large number of parameters, most of which have no simple interpretation. Specific age-dependent hypotheses are not easily tested.

We avoid these problems by using parametric models for the G and E functions. We discuss a large family of parametric models, each with a small number of interpretable parameters, that satisfy theoretical requirements and that as a group exhibit a wide variety of behaviors. We (like MEYER and HILL 1997) use ML to estimate parameters. C code implementing these procedures is available from the first author.


*  PARAMETRIC CHARACTER PROCESS MODELS

Useful parametric models for covariance functions are limited by several theoretical requirements. First, covariance functions must be positive semidefinite, i.e., satisfy Equation 7. Second, biological processes are expected to be reasonably smooth, requiring their covariance functions to be smooth as well (HOEL et al. 1972). If a Gaussian stochastic process is to be considered smooth, it will have differentiable sample paths, and so must its covariance function. In general the covariance function has twice as many derivatives as the process itself (HOEL et al. 1972). Thus, because we expect biological processes to be relatively smooth, we choose covariance function models that are highly differentiable. Third, it is desirable for the covariance function to have parameters with biologically meaningful interpretations so that interesting hypotheses can be easily tested.

With these considerations in mind, we first concentrate on a simple model of a character process that nevertheless may adequately represent many age-dependent traits. We assume each process X(t) is second-order stationary, which means

$$E\{X(t)\} = \mu \quad \text{and} \quad \operatorname{Cov}\{X(s), X(t)\} = r(|s - t|)$$

for some constant µ and some function r of the time separation alone (HOEL et al. 1972). This stationarity assumption is necessary for several fundamental results, but it is relaxed later. Second-order stationarity requires that the mean value of the trait must not change with age and that the covariance between the value of the character at two different ages depends only on the time distance between the age classes.

For stationary models, the choice of a covariance function is greatly simplified by Bochner's theorem (HOEL et al. 1972), which asserts that a continuous, positive semidefinite stationary covariance function is necessarily proportional to the characteristic function of some probability distribution. Thus, immediately we have a long menu of potential covariance functions from which to choose, as any real-valued characteristic function of a probability distribution is allowed. A number of satisfactory functions are presented in Table 1. In many cases the characteristic function of one probability distribution is proportional to the probability density function of another. In such cases we refer to the covariance function by the name of the distribution with the proportional density. In cases where there is no such distribution, the covariance function is specifically referred to as the characteristic function of its parent distribution. The available functions exhibit a wide variety of behaviors, and some can be negative in sign.
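To make the menu concrete, the sketch below evaluates three real-valued characteristic functions as stationary covariance functions and checks each for nonnegative definiteness on a grid of ages. The parameterizations are assumptions chosen for illustration and may differ from those in Table 1: a normal form (characteristic function of a normal distribution), a Cauchy form (characteristic function of a Laplace distribution, proportional to a Cauchy density), and the characteristic function of a uniform distribution, which takes negative values.

```python
import numpy as np

def normal_cov(d, theta0, theta1):
    """Proportional to a normal density in the age separation d."""
    return theta0 * np.exp(-theta1 * d ** 2)

def cauchy_cov(d, theta0, theta1):
    """Proportional to a Cauchy density in the age separation d."""
    return theta0 / (1.0 + theta1 * d ** 2)

def uniform_cf_cov(d, theta0, theta1):
    """Characteristic function of a uniform distribution: sin(theta1 d)/(theta1 d)."""
    # np.sinc(x) computes sin(pi x) / (pi x) and handles x = 0.
    return theta0 * np.sinc(theta1 * d / np.pi)

times = np.arange(1.0, 11.0)
d = times[:, None] - times[None, :]

min_eigs = {
    f.__name__: np.linalg.eigvalsh(f(d, 0.5, 0.7)).min()
    for f in (normal_cov, cauchy_cov, uniform_cf_cov)
}
has_negative_covariance = bool((uniform_cf_cov(d, 0.5, 0.7) < 0).any())
print(min_eigs, has_negative_covariance)
```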


 

 
Table 1. Covariance functions for the character process model

Although the assumptions of stationarity are rather strict, we can use the results for stationary processes to formulate models that account for age-dependent changes in the mean value of the character and that allow for more general covariance functions. The simplest way to achieve first-order stationarity (i.e., a constant mean over time) is to model the mean separately as in (8), where g(t) and e(t) have mean zero for all t, hence are first-order stationary. The nonstochastic function µ(t), analogous to fixed effects in classical quantitative genetics, models the mean behavior. An alternative to modeling the mean function directly is to use methods analogous to those used to remove trends in time series (BOX et al. 1994), such as differencing the series (replacing the value at time t by $X_{t+1} - X_t$), and more generally using "integrated" models, such as ARIMA.

A relaxation of second-order stationarity (the condition that requires the covariance of the process between ages $t_1$ and $t_2$ to be only a function of $|t_1 - t_2|$) that still gives relatively simple models is second-order correlation (rather than covariance) stationarity. This relaxation allows variance to change with age. If $\rho_X(s - t)$ is the correlation function of a second-order stationary process and v(t) is an arbitrary function, then

$$\operatorname{Cov}\{X(s), X(t)\} = v(s)\,v(t)\,\rho_X(s - t) \tag{11}$$

is a valid covariance function. Thus we can choose $\rho_X(t)$ to be any of the functions in Table 1 with the additional restriction that $\theta_0 = 1$ [so that the correlation of X(t) with itself is 1] and choose v(t) completely arbitrarily and still obtain a reasonable model. Although the model has stationary correlation, the variance

$$\operatorname{Var}\{X(t)\} = v(t)^2$$

is not stationary and can be specified as we please. Hypotheses concerning the pattern of change in age-specific variances (genetic and otherwise) for a given character can be examined using this model.
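A minimal sketch of a correlation-stationary model follows, assuming the covariance takes the product form v(s) v(t) ρ(s - t) with ρ(0) = 1. The linear form of v(t), the normal correlation function, and all parameter values are hypothetical.

```python
import numpy as np

def character_process_cov(times, v, rho):
    """Covariance matrix with entries v(t_j) * v(t_l) * rho(t_j - t_l)."""
    t = np.asarray(times, dtype=float)
    vt = v(t)
    return np.outer(vt, vt) * rho(t[:, None] - t[None, :])

def v(t):
    """Hypothetical linear function; the variance at age t is v(t)**2."""
    return 0.4 + 0.1 * t

def rho(d):
    """Normal correlation function with rho(0) = 1."""
    return np.exp(-0.2 * d ** 2)

times = np.arange(1.0, 11.0)
G = character_process_cov(times, v, rho)

# Age-specific variances change with age ...
variances = np.diag(G)
# ... while the implied correlations depend only on the age separation.
corr = G / np.sqrt(np.outer(variances, variances))
min_eig = np.linalg.eigvalsh(G).min()
print(variances.round(3), min_eig)
```

Because the matrix is a stationary correlation matrix rescaled on both sides by the same diagonal, it remains nonnegative definite for any choice of v(t).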

The parameters of the model are estimated straightforwardly using ML or REML. The reason, as mentioned in the Introduction, is that the character process is only observed at a finite set of times; hence the observations form a multivariate normal random vector with mean and covariance that are specified by the models for the mean function and G and E covariance functions. In principle the estimation procedure is no different from classical quantitative genetics of multivariate traits. Only the model specification is new. In practice, however, the ideas of the character process model use reasonable assumptions to reduce the dimension of the parameter space and make an age-dependent quantitative analysis of the trait possible.
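The likelihood being maximized can be sketched as follows. This is not the authors' C implementation. It assumes a constant mean, a normal-form G function with parameters theta0 and theta1, and an environmental covariance with independent errors of variance sigma2_e; these modeling choices and the function name `neg_log_likelihood` are hypothetical.

```python
import numpy as np

def neg_log_likelihood(params, x, A, times):
    """Negative Gaussian log-likelihood of the stacked data vector x.

    x is ordered individual by individual (all ages for individual 1,
    then all ages for individual 2, ...), so Var(x) = A (x) G + I (x) E.
    """
    mu, theta0, theta1, sigma2_e = params
    times = np.asarray(times, dtype=float)
    n, m = A.shape[0], len(times)

    d = times[:, None] - times[None, :]
    G = theta0 * np.exp(-theta1 * d ** 2)     # assumed G function
    E = sigma2_e * np.eye(m)                  # assumed E function
    V = np.kron(A, G) + np.kron(np.eye(n), E)

    resid = np.asarray(x, dtype=float) - mu
    sign, logdet = np.linalg.slogdet(V)
    if sign <= 0:
        return np.inf                         # V not positive definite
    quad = resid @ np.linalg.solve(V, resid)
    return 0.5 * (n * m * np.log(2 * np.pi) + logdet + quad)

# Smallest possible case: one individual observed at one age.
nll = neg_log_likelihood((1.0, 0.5, 0.7, 0.25),
                         x=np.array([1.3]),
                         A=np.array([[1.0]]),
                         times=np.array([2.0]))
print(nll)
```

A function of this kind could be handed to a general-purpose optimizer such as `scipy.optimize.minimize`, with bounds keeping the variance parameters positive; that step is omitted here.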


*  EXAMPLES

Simulation study:
We investigated the behavior of the character process and orthogonal polynomial (OP) models through extensive simulations. Three representative examples are provided in this section. For each example, a single data set was generated assuming a standard half-sib design (LYNCH and WALSH 1998) in which 20 sires were each mated to three dams and three progeny were measured from each dam. We assumed the character of interest was measured at 10 regularly spaced ages denoted 1, ... , 10. It is important to note that such a balanced design is not required for applying these methods. Unequal family structure, as well as irregularly spaced measurements, are perfectly acceptable, although different designs will contain different amounts of genetic information (SHAW 1987). Details of the simulation procedure are available from the first author.
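The design can be sketched in code as follows. This is not the authors' simulation procedure, whose details are not given here. It assumes unrelated, non-inbred parents (so full sibs have relationship 1/2 and paternal half sibs 1/4), the case I covariance function from the caption of Figure 1, a zero mean, and independent environmental errors with a hypothetical variance.

```python
import numpy as np

n_sires, n_dams, n_prog = 20, 3, 3
fam = n_dams * n_prog                       # progeny per sire

# Relationship matrix within one sire family:
# 1/4 for all pairs (shared sire) + 1/4 more within dam + 1/2 more with self.
B = (0.25 * np.ones((fam, fam))
     + 0.25 * np.kron(np.eye(n_dams), np.ones((n_prog, n_prog)))
     + 0.50 * np.eye(fam))
A = np.kron(np.eye(n_sires), B)             # sire families are unrelated

times = np.arange(1.0, 11.0)
d = times[:, None] - times[None, :]
G = 0.5 * np.exp(-0.7 * d ** 2)             # case I genetic covariance function
sigma2_e = 0.5                              # hypothetical environmental variance

rng = np.random.default_rng(0)
n, m = A.shape[0], len(times)
L_A = np.linalg.cholesky(A)
L_G = np.linalg.cholesky(G)

# Rows are individuals, columns are ages. With Z of independent standard
# normals, L_A Z L_G^T has covariance r_ik * G(t_j, t_l) between entries.
g = L_A @ rng.standard_normal((n, m)) @ L_G.T
e = np.sqrt(sigma2_e) * rng.standard_normal((n, m))
X = g + e
print(A.shape, X.shape)
```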

Because genetic covariance functions are unobserved, we have no way of knowing what a typical one might look like. Therefore, these examples are rather arbitrary and serve mainly to illustrate the relationship between the character process and OP models. We present three relatively simple cases: case I, genetic variance is constant across all ages, and genetic covariance declines very quickly between adjacent ages; case II, genetic variance is constant across all ages, and genetic correlation declines very slowly; case III, the genetic covariance function is composed of four OPs (giving a covariance function of degree three).

Figures 1, 2, and 3 present the actual covariance functions for each of the three cases along with contour plots describing the fit of different models to the simulated data. The contour plots display the absolute difference between the fitted surface and the actual surface, with darker regions indicating regions of poor fit and lighter regions indicating regions of better fit. Contour shading is constant over all figures, allowing comparisons between them.



 
Figure 1. (A) Actual genetic covariance surface for simulated data from case I: constant genetic variance and rapidly declining covariance. The form of the covariance function is $G(t_1, t_2) = 0.5\,e^{-0.7(t_1 - t_2)^2}$. (B) Lack of fit of an estimated genetic covariance surface for a model consisting of five orthogonal polynomials. Lack of fit is defined as the absolute difference between the estimated surface and the actual surface. Darker regions indicate greater lack of fit.



 
Figure 2. (A) Actual genetic covariance surface for simulated data from case II: constant genetic variance and slowly declining covariance. The form of the covariance function is $G(t_1, t_2) = 0.5\,e^{-0.01(t_1 - t_2)^2}$. (B) Lack of fit of an estimated genetic covariance surface for a model consisting of three orthogonal polynomials. Lack of fit is defined as the absolute difference between the estimated surface and the actual surface. Darker regions indicate greater lack of fit.



 
Figure 3. (A) Actual genetic covariance surface for simulated data from case III: genetic covariance function based on four orthogonal polynomials. (B) Lack of fit of an estimated genetic covariance surface using a character process model with a linear variance and normal correlation function. (C) Lack of fit of an estimated genetic covariance surface for a model consisting of four orthogonal polynomials (the same form used to generate the data). For both B and C, lack of fit is defined as the absolute difference between the estimated surface and the actual surface. Darker regions indicate greater lack of fit.

When character values at different ages are genetically uncorrelated (or nearly so), OP models provide a poor estimate of the covariance function (Fig 1). The five-polynomial model was determined to provide an adequate fit to the data via likelihood-ratio tests (a six-polynomial model did not fit significantly better), and although the fit is quite poor, genetic variances are estimated more accurately than covariances (Fig 1B). In our experience this is to be expected when covariances decline asymptotically toward zero within the range of the data. The wiggly nature of the polynomial model has difficulty reproducing such a structure. The OP model does a much better job of describing the covariance structure when genetic correlations are high between all ages in the data (Fig 2). In this case, the three-OP model was determined as the best fit, and it does a reasonable job of estimating the covariance structure. The fits of the character process models are not presented for these two examples. They are expected to fit well (and do) because they were used to generate the data.
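The contrast between cases I and II can be illustrated without simulated data. The sketch below replaces the ML fit used in the article with a simpler stand-in: a least-squares projection of the true covariance surface onto the span of the Legendre polynomials, which is the closest surface (in summed squared error) that a polynomial model of that size can represent. Lack of fit is then the absolute difference from the true surface, as in the figures. The function name `op_projection` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def op_projection(G, times, n_poly):
    """Least-squares approximation of G by Phi K Phi^T with n_poly polynomials."""
    t = np.asarray(times, dtype=float)
    x = -1.0 + 2.0 * (t - t.min()) / (t.max() - t.min())
    Phi = legendre.legvander(x, n_poly - 1)
    Phi_pinv = np.linalg.pinv(Phi)
    K = Phi_pinv @ G @ Phi_pinv.T             # best-fitting coefficient matrix
    return Phi @ K @ Phi.T

times = np.arange(1.0, 11.0)
d = times[:, None] - times[None, :]
G_I = 0.5 * np.exp(-0.7 * d ** 2)             # case I: rapidly declining covariance
G_II = 0.5 * np.exp(-0.01 * d ** 2)           # case II: slowly declining covariance

# Absolute difference between approximating and actual surfaces.
lof_I = np.abs(op_projection(G_I, times, 5) - G_I)
lof_II = np.abs(op_projection(G_II, times, 3) - G_II)

print(np.linalg.norm(lof_I), np.linalg.norm(lof_II))
```

Even this best-case approximation leaves a larger overall lack of fit for case I with five polynomials than for case II with three, in line with the pattern described above.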

Fig 3 presents a genetic covariance function generated directly from a four-OP model. In this case, it was the character process model (a linear variance model with normal correlation) that had trouble capturing the structure of the genetic covariances. Nevertheless, the fit of the character process model is not terrible, and essentially smooths over the undulations in the actual function. Surprisingly, the OP model has some difficulty reproducing the covariance structure. This is likely due to the number of parameters in the model (10) and the size of the simulated experiment. Even when the form of the underlying covariance function is known precisely, most experiments will not provide enough information to accurately estimate even a moderate number of parameters.

In summary, OP models do not accurately describe the structure of the genetic covariance function when the genetic correlation is expected to decline significantly with age. We argued (see above) that it is these types of covariance functions that one might expect from natural stochastic processes. For relatively simple covariance structures, however, the OP models accurately estimate the surfaces (Fig 2). Flexibility from the range of allowable character process models allows a reasonable approximation to the actual covariance structure even when it is very irregular (Fig 3). Moreover, Figures 1, 2, and 3 suggest that a significant strength of the character process model is its separation of variance functions from correlation functions. In all the examples, the majority of lack of fit is in the covariance (not variance) structure, suggesting the overall fit of the model is determined primarily by estimates of age-specific variances.

Age-specific mortality rates in Drosophila:
In this example, our goal is to estimate the genetic covariance structure for age-specific mortality rates in lines of Drosophila melanogaster allowed to accumulate spontaneous mutations for 19 generations (PLETCHER et al. 1998). The data are mortality rate estimates (5-day intervals) for 29 mutation-accumulation lines. For each accumulation line there are four mortality observations at each age, and mortality rates are presented for six different ages. A logarithmic transformation was used to normalize the data (PROMISLOW et al. 1996; PLETCHER et al. 1998). In this example, log-mortality rates are examined through age 30 days posteclosion. Data from the oldest ages were excluded because estimates of genetic variances and covariances among these ages were extremely imprecise when estimated using standard methods, and often this hindered our ability to compare estimation methods. Estimates of the mutational covariance structure based on the complete data set are presented in a companion article (PLETCHER et al. 1999).

The data set was analyzed using three approaches. First, the genetic covariance structure was estimated completely nonparametrically (i.e., using standard multivariate techniques) by specifying a separate parameter for each age-specific variance and each covariance. Our sample size was far too small to estimate all 21 parameters in the 6 × 6 covariance matrix simultaneously, so we were forced to construct the matrix piecewise, examining ages two at a time. Pairwise covariances were obtained using ML implemented in the program QUERCUS (SHAW 1987; SHAW and SHAW 1992). Second, a genetic covariance function composed of four Legendre polynomials (giving a polynomial of degree three) was estimated using ML procedures similar to those of MEYER and HILL (1997). Third, we used the character process approach to estimate a genetic covariance function based on a quadratic variance function and normal correlation function (see Table 1).
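As a rough illustration of the third approach, the sketch below assembles a covariance matrix of the form v(s) v(t) rho(s - t) from a quadratic variance function and a normal correlation function. The functional forms used here (v(t)^2 = th0 + th1 t + th2 t^2 and rho(d) = exp(-th_c d^2)), the age values, and every parameter value are assumptions chosen for illustration; they are not taken from Table 1 or Table 3. The helper `n_free_cov_params` simply counts the distinct entries of a symmetric matrix, which gives the 21 parameters mentioned for the 6 × 6 case.

```python
import math

def quadratic_variance(t, th0, th1, th2):
    """Assumed variance function v(t)^2 = th0 + th1*t + th2*t^2."""
    return th0 + th1 * t + th2 * t * t

def normal_correlation(d, th_c):
    """Assumed 'normal' correlation function rho(d) = exp(-th_c * d^2)."""
    return math.exp(-th_c * d * d)

def character_process_cov(ages, var_params, th_c):
    """Covariance matrix with entries v(s) * v(t) * rho(s - t)."""
    variances = [quadratic_variance(t, *var_params) for t in ages]
    if min(variances) <= 0.0:
        raise ValueError("variance function must be positive at every age")
    sd = [math.sqrt(v) for v in variances]
    n = len(ages)
    return [[sd[i] * sd[j] * normal_correlation(ages[i] - ages[j], th_c)
             for j in range(n)] for i in range(n)]

def n_free_cov_params(k):
    """Number of distinct entries in an unstructured k x k covariance matrix."""
    return k * (k + 1) // 2

# Hypothetical ages (days) and parameter values, for illustration only.
ages = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
G = character_process_cov(ages, (1.0, -0.05, 0.002), 0.01)

print(n_free_cov_params(6))  # unstructured 6 x 6 matrix
for row in G:
    print(" ".join(f"{x:6.3f}" for x in row))
```

Under these assumed forms the whole 6 × 6 matrix is governed by four numbers (three for the variance function and one for the correlation function), which is the sense in which the parametric model is far more economical than the unstructured one.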

The estimated genetic covariance matrices for the various methods are presented in Table 2. Although all procedures appear to capture the dominant aspects of the covariance structure, several issues make the character process approach desirable. First, under standard multivariate methods, covariances and their asymptotic standard errors were estimated pairwise, so the standard errors are too small when the matrix is considered as a whole. Despite the small standard errors, there is insufficient statistical power to detect a significant change in covariance as ages become further separated in time (analysis not shown). Second, because data from each age are considered separately, systematic relationships among the characters are ignored. Third, the sample size prohibits estimating the entire 6 × 6 covariance matrix simultaneously, and as a result the "piecewise" matrix (Table 2) is not even positive definite.
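The last point can be made concrete with a small sketch. The code below tests positive definiteness by attempting a Cholesky factorization, and applies it to a hypothetical 3 × 3 matrix whose entries were chosen so that every pair of ages gives a valid 2 × 2 covariance matrix while the assembled matrix does not. The matrix is invented for illustration and is not the matrix in Table 2.

```python
import math

def is_positive_definite(a):
    """Return True if the symmetric matrix a admits a Cholesky factorization."""
    n = len(a)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:          # non-positive pivot: not positive definite
                    return False
                chol[i][j] = math.sqrt(s)
            else:
                chol[i][j] = s / chol[j][j]
    return True

# Hypothetical "piecewise" matrix: each pair of ages looks fine on its own.
piecewise = [[1.0, 0.9, -0.9],
             [0.9, 1.0, 0.9],
             [-0.9, 0.9, 1.0]]

pairs_ok = all(
    is_positive_definite([[piecewise[i][i], piecewise[i][j]],
                          [piecewise[j][i], piecewise[j][j]]])
    for i in range(3) for j in range(i + 1, 3)
)
print(pairs_ok)                         # every 2 x 2 block is valid
print(is_positive_definite(piecewise))  # the assembled matrix is not
```

Pairwise estimation constrains each 2 × 2 block separately, so nothing prevents the blocks from being mutually inconsistent once they are assembled.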


 

 
Table 2. Comparison of age-specific genetic variance matrices estimated by various methods

The genetic matrix produced by the four-polynomial model is quite similar to that produced by the standard methods. However, a primary concern remains the number of parameters in the model; we are estimating 10 parameters for the genetic matrix alone. As with the standard methods, the number of parameters demands a large sample size for accurate estimation, but unlike these methods, none of the parameters have a clear interpretation. Although we may have asymptotic variance estimates for the coefficients of the OP (as is the case when ML is used), it is difficult to establish simple tests of interesting hypotheses. For example, the rate of decline in covariance as ages become further separated in time is described by a complicated combination of the coefficients of the polynomial.
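To show where the 10 parameters come from, the following sketch evaluates a covariance function built from four normalized Legendre polynomials, G(s, t) = sum over i, j of K[i][j] phi_i(s) phi_j(t), with ages rescaled to the interval [-1, 1]. The coefficient matrix `K_hyp` and the ages are hypothetical values chosen for illustration, and the code is a generic construction rather than the fitting procedure used for Table 2.

```python
import math

def legendre_basis(x, n_poly):
    """Normalized Legendre polynomials phi_0 .. phi_{n_poly-1} at x in [-1, 1]."""
    p = [1.0, x]
    # Bonnet recurrence: (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
    for n in range(1, n_poly - 1):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return [math.sqrt((2 * j + 1) / 2.0) * p[j] for j in range(n_poly)]

def op_covariance(ages, coef):
    """Covariance matrix implied by a symmetric coefficient matrix coef."""
    n_poly = len(coef)
    lo, hi = min(ages), max(ages)
    scaled = [-1.0 + 2.0 * (t - lo) / (hi - lo) for t in ages]
    phi = [legendre_basis(x, n_poly) for x in scaled]
    return [[sum(coef[i][j] * pa[i] * pb[j]
                 for i in range(n_poly) for j in range(n_poly))
             for pb in phi] for pa in phi]

# Hypothetical symmetric 4 x 4 coefficient matrix (illustration only).
K_hyp = [[1.0, 0.2, 0.0, 0.0],
         [0.2, 0.5, 0.1, 0.0],
         [0.0, 0.1, 0.2, 0.05],
         [0.0, 0.0, 0.05, 0.1]]
ages = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]  # hypothetical ages in days

G_op = op_covariance(ages, K_hyp)
n_coef = len(K_hyp) * (len(K_hyp) + 1) // 2
print(n_coef)  # distinct coefficients in the symmetric 4 x 4 matrix
```

Every entry of the resulting matrix is a sum over all of the coefficients, so a quantity such as the rate at which covariance declines with the separation between ages does not correspond to any single coefficient.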

Many of the problems inherent in the standard and OP methods are alleviated under the character process model. The estimated genetic covariance functions are guaranteed to be positive definite, and data from all ages are analyzed simultaneously. Standard errors for the parameters of the model are obtained from the maximization procedure and error estimates on the individual age measures can be easily calculated. Most covariance functions have relatively few parameters, which are estimated with high precision. Finally, and perhaps most importantly, the parameters of the model have useful interpretations, which allow simple hypotheses to be easily tested.

To further investigate the behavior of the character process models, we fit several different covariance functions to the data. In all models, we estimated a nonparametric mean function (average mortality rates at each age were estimated simultaneously) to account for the increase in mortality rates with age. For both the genetic and environmental effects, we examined the fit of covariance functions composed of (in all combinations) three variance functions, the $v(t)^2$ from Equation 11 (constant, linear, and quadratic), and three correlation functions, the $\rho_X(s - t)$ from Equation 11 (normal, Cauchy, and characteristic function of a uniform). For all analyses the constant variance and Cauchy correlation functions were chosen for modeling the environmental covariance; more complicated covariance functions did not provide a significantly better fit (details not shown).

Parameter estimates for the genetic covariance functions are given in Table 3. The dynamics of age-specific genetic variance can be determined using likelihood-ratio tests. Given a specific correlation function, twice the difference in log likelihoods between a more general variance model (e.g., quadratic variance) and a more constrained model (e.g., linear variance) has a chi-square distribution with degrees of freedom equal to the number of additional parameters in the more general model. The P-values for the test that a quadratic variance function fits better than a linear one are 0.01 for the normal correlation function, 0.06 for the Cauchy, and 0.05 for the characteristic function of the uniform (the deviances being 6.3, 3.6, and 3.96, respectively, all asymptotically chi-square on 1 d.f.). A cubic variance function did not provide a significantly better fit to the data.
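The P-values quoted above can be checked directly from the deviances. The sketch below uses the fact that a chi-square variable on 1 d.f. is the square of a standard normal, so its upper-tail probability can be written with the complementary error function; the function name is ours and only the three deviances come from the text.

```python
import math

def lrt_pvalue_1df(deviance):
    """Upper-tail probability of a chi-square variable on 1 d.f."""
    # P(Z^2 > x) = 2 * (1 - Phi(sqrt(x))) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(deviance / 2.0))

# Deviances reported in the text (quadratic vs. linear variance function).
deviances = {"normal": 6.3, "Cauchy": 3.6, "uniform": 3.96}

for name, dev in deviances.items():
    print(f"{name:8s} deviance = {dev:4.2f}  P = {lrt_pvalue_1df(dev):.2f}")
```

Rounded to two decimals, these reproduce the values 0.01, 0.06, and 0.05 given in the text.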


 

 
Table 3. Character process model estimates for genetic covariance functions

Given a particular model for the variance function, there is little difference between the fits of the correlation functions. For example, the log-likelihood values for the normal, Cauchy, and uniform correlation functions with a quadratic variance function are -73.14, -74.71, and -74.03, respectively. Although a rigorous test of non-nested hypotheses such as these is rather complicated (see COX 1961, 1962), it is clear that there is little statistical power to detect subtle differences in the shapes of the underlying genetic correlation function.

Hypothesis tests concerning age-specific genetic variance for mortality are easily conducted. ML estimates are asymptotically normally distributed, and therefore their estimated standard errors can be used to construct confidence intervals and test statistics (SEARLE et al. 1992). Further, the significantly improved fit of the quadratic variance function over the constant and linear functions provides strong evidence for interesting changes in mutational properties across ages, although the low variance at ages 25–29 days may be driving this result. Such statements could not be made from the results of standard methods or from the fit of OPs.

The hypothesis that most mutations affect mortality equally at all ages can be tested by asking if the correlation in mortality rates between various ages is different from unity. Because, for all character process models, $\theta_c$ (see Table 1) is the rate of decrease in correlation with time, testing whether this value is significantly different from zero directly addresses this hypothesis. The parameter is significantly greater than zero in all models (P < 0.05), providing strong evidence that the majority of measured mutations exhibit some form of age specificity.
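A test of this kind can be sketched using the asymptotic normality of ML estimates noted earlier. The code below computes a one-sided Wald statistic and a normal-theory confidence interval from an estimate and its standard error. The numerical inputs are hypothetical placeholders, not the values in Table 3, and the sketch ignores any complications that arise when the null value lies on the boundary of the parameter space.

```python
import math

def wald_one_sided(estimate, std_error):
    """z statistic and one-sided P-value for H0: parameter = 0 vs. H1: parameter > 0."""
    z = estimate / std_error
    p = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail standard normal probability
    return z, p

def wald_interval(estimate, std_error, z_crit=1.959964):
    """Approximate 95% confidence interval based on asymptotic normality."""
    return estimate - z_crit * std_error, estimate + z_crit * std_error

# Hypothetical estimate and standard error for the correlation-decay parameter.
theta_c_hat, theta_c_se = 0.010, 0.004

z, p = wald_one_sided(theta_c_hat, theta_c_se)
low, high = wald_interval(theta_c_hat, theta_c_se)
print(f"z = {z:.2f}, one-sided P = {p:.4f}, 95% CI = ({low:.4f}, {high:.4f})")
```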

Despite the twofold increase in the number of parameters, a covariance function based on four OPs did not provide a significantly better fit than the best-fit function from the character process model. Using two popular criteria, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC; SCHWARZ 1978; STONE 1979), any of the character process models with a quadratic variance function would be chosen over the best OP model (data not shown).
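For reference, the two criteria can be written as AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of estimated parameters and n the number of observations. The sketch below applies them to the three log-likelihoods reported earlier for the character process models with a quadratic variance function. The values of k and n are hypothetical placeholders, and the OP comparison is not reproduced because its log-likelihood is not reported in the text.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: smaller is better."""
    return 2.0 * k - 2.0 * log_lik

def bic(log_lik, k, n):
    """Bayesian (Schwarz) information criterion: smaller is better."""
    return k * math.log(n) - 2.0 * log_lik

# Log-likelihoods reported in the text (quadratic variance function).
log_liks = {"normal": -73.14, "Cauchy": -74.71, "uniform": -74.03}

K_HYP = 4    # hypothetical parameter count, identical for all three models
N_HYP = 100  # hypothetical number of observations

for name, ll in log_liks.items():
    print(f"{name:8s} AIC = {aic(ll, K_HYP):7.2f}  BIC = {bic(ll, K_HYP, N_HYP):7.2f}")

best = min(log_liks, key=lambda m: aic(log_liks[m], K_HYP))
print(best)
```

Because the three models share the same number of parameters, the penalty terms cancel and the ranking depends only on the log-likelihoods, whatever values of k and n are assumed.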


*  DISCUSSION

The quantitative genetic analysis of function-valued traits, such as growth and mortality curves, starts with the fundamental recognition by KIRKPATRICK and HECKMAN (1989) that the genetic and environmental components of such traits should be modeled as Gaussian stochastic processes. It continues with the recognition by MEYER and HILL (1997) that ML or REML can be used to fit such models, just as it can be used for all other quantitative genetics models. Our contribution to the subject is a method of finding valid parametric models for covariance functions of these Gaussian processes from theory in spatial and time-series statistics, where this approach is widely used (CRESSIE 1993, section 2.5.1).

These parametric models for covariance functions have many virtues. They are assured to be positive definite, hence valid covariance functions. They can be chosen to be highly differentiable, implying the character process itself is smooth, which we expect from a biological process. They have a small number of parameters, and models can be chosen to address specific biological hypotheses. Moreover, the flexibility of the approach means reasonable fits are obtained even when the actual covariance function is highly irregular (Fig 3).

It is important to recognize that parametric models have certain limitations. Although we have argued that our covariance functions are reasonable models, verifying the assumptions of the models, particularly stationarity in correlation, is exceedingly difficult (MATHERON 1988). Stationarity will, however, often be a good approximation; and as George Box asserted, all models are wrong, but some are useful (BOX 1976). Kirkpatrick and colleagues often focus on characterizing the dominant eigenfunctions of the genetic covariance function, which are thought to summarize patterns of genetic variation (KIRKPATRICK et al. 1990). Although we have not pursued it here, it is likely that for a particular covariance function, the eigenfunctions are somewhat limited in their range of behaviors. One may argue, however, that the process of choosing a good model in effect searches a large space of possible eigenfunctions.

Implementing a nonparametric approach using Legendre polynomials (KIRKPATRICK and HECKMAN 1989) is problematic. The resulting covariance functions are not necessarily positive definite. Simple simulations show that polynomials of low degree do not closely approximate reasonable covariance functions unless character values at all measured ages are highly correlated (Fig 1 and Fig 2). Polynomials of high degree have many parameters, more than are necessary to fit the data.

Many of the problems with OPs were recognized by the original authors, and it has been suggested that more advanced "smoothing" techniques, such as cubic splines or wavelets, might be better behaved (KIRKPATRICK et al. 1994). This is a promising avenue for future research. Good parametric and nonparametric approaches complement one another. The strengths of the parametric approach are its great efficiency and its ease of interpretation. Unfortunately, if the assumed model is grossly incorrect, inferences can be misleading (SIMONOFF 1996). Good nonparametrics are less reliant on assumptions about the formal structure of the data. They do, however, require large sample sizes, much larger than those of many of the most ambitious quantitative genetic studies. If there is insufficient information in the data to support the accurate estimation of many parameters, one is essentially left with a bad parametric model.

An equally promising direction for the future might be the extension of our techniques to examine the relationship between multiple character processes. Two-character processes can be examined by estimating covariance functions for each character and a cross-covariance function between the two (KIRKPATRICK 1988). The approach is analogous to estimating the genetic covariance between two different characters, except in this case the covariance is estimated for the value of the two characters at every combination of the two ages. In this way age-dependent genetic constraints on the independent evolution of the two traits can be explored.


*  ACKNOWLEDGMENTS

Comments provided by J. Curtsinger, R. Shaw, G. Oehlert, R. Lande, M. Kirkpatrick, A. Clark, and an anonymous reviewer greatly improved the quality and clarity of the manuscript. M. Kirkpatrick generously provided creative discussion throughout the development of this work. This work was supported by National Institutes of Health grants AG-0871 and AG-11722 to J. Curtsinger and by the University of Minnesota Graduate School.

Manuscript received March 15, 1999; Accepted for publication June 22, 1999.


*  LITERATURE CITED

BARTON, N. H. and M. TURELLI, 1989  Evolutionary quantitative genetics: how little do we know? Annu. Rev. Genet. 23:337-370.

BJORKLUND, M., 1997  Variation in growth in the blue tit (Parus caeruleus). J. Evol. Biol. 10:139-155.

BOX, G. E. P., 1976  Science and statistics. J. Am. Stat. Assoc. 71:791-802.

BOX, G. E. P., G. JENKINS and G. C. REINSEL, 1994 Time Series Analysis: Forecasting and Control, Ed. 3. Prentice Hall, Englewood Cliffs, NJ.

COX, D. R., 1961  Tests of separate families of hypotheses. Proc. 4th Berkeley Symp. 1:105-123.

COX, D. R., 1962  Further results on tests of separate families of hypotheses. J. R. Stat. Soc. B 24:406-424.

CRESSIE, N. A., 1993 Statistics for Spatial Data. John Wiley and Sons, New York.

ENGSTROM, G., L. E. LILIJEDAHL, M. RASMUSON, and T. BJORKLUND, 1989  Expression of genetic and environmental variation during ageing: 1. Estimation of variance components for number of adult offspring in Drosophila melanogaster. Theor. Appl. Genet. 77:119-122.

FALCONER, D. S., 1989 Introduction to Quantitative Genetics, Ed. 3. Longman, New York.

FELLER, W., 1968 An Introduction to Probability Theory and its Applications, Vol. 1, Ed. 3. John Wiley and Sons, New York.

GEBHARDT-HENRICH, S. G. and H. L. MARKS, 1993  Heritabilities of growth curve parameters and age-specific expression of genetic variation under two different feeding regimes in Japanese quail (Coturnix coturnix japonica). Genet. Res. 62:42-55.

GOMULKIEWICZ, R. and M. KIRKPATRICK, 1992  Quantitative genetics and the evolution of reaction norms. Evolution 46:390-411.

HOEL, P. G., S. C. PORT and C. STONE, 1972 Introduction to Stochastic Processes. Houghton Mifflin, Boston.

HOULE, D., 1992  Comparing evolvability and variability of quantitative traits. Genetics 130:195-204.

HOULE, D., K. A. HUGHES, D. K. HOFFMASTER, J. IHARA, and S. ASSIMACOPOULOS et al., 1994  The effects of spontaneous mutation on quantitative traits. I. Variances and covariances of life history traits. Genetics 138:773-785.

HUGHES, K. A. and B. CHARLESWORTH, 1994  A genetic analysis of senescence in Drosophila. Nature 367:64-66.

KIRKPATRICK, M., 1988 The evolution of size in size-structured populations, pp. 13–28 in The Dynamics of Size-Structured Populations, edited by B. EBENMAN and L. PERSSON. Springer-Verlag, Heidelberg, Germany.

KIRKPATRICK, M., 1997  Genetic improvement of livestock growth using infinite-dimensional analysis. Anim. Biotech. 8:55-56.

KIRKPATRICK, M. and N. HECKMAN, 1989  A quantitative genetic model for growth, shape, reaction norms, and other infinite-dimensional characters. J. Math. Biol. 27:429-450.

KIRKPATRICK, M. and D. LOFSVOLD, 1992  Measuring selection and constraint in the evolution of growth. Evolution 46:954-971.

KIRKPATRICK, M., D. LOFSVOLD, and M. BULMER, 1990  Analysis of the inheritance, selection and evolution of growth trajectories. Genetics 124:979-993.

KIRKPATRICK, M., W. G. HILL, and R. THOMPSON, 1994  Estimating the covariance structure of traits during growth and ageing, illustrated with lactation in dairy cattle. Genet. Res. 64:57-69.

LANDE, R., 1979  Quantitative genetic analysis of multivariable evolution, applied to brain:body size allometry. Evolution 33:402-416.

LANDE, R., 1982  A quantitative genetic theory of life history evolution. Ecology 63:607-615.

LYNCH, M., and B. WALSH, 1998 Genetics and Analysis of Quantitative Traits. Sinauer Associates, Sunderland, MA.

MATHERON, G., 1988 Estimating and Choosing: An Essay on Probability in Practice. Springer-Verlag, New York.

MEYER, K. and W. G. HILL, 1997  Estimation of genetic and phenotypic covariance functions for longitudinal or `repeated' records by restricted maximum likelihood. Livest. Prod. Sci. 47:185-200.

PLETCHER, S. D., D. HOULE, and J. W. CURTSINGER, 1998  Age-specific properties of spontaneous mutations affecting mortality in Drosophila melanogaster. Genetics 148:287-303.

PLETCHER, S. D., D. HOULE, and J. W. CURTSINGER, 1999  The evolution of age-specific mortality rates in Drosophila melanogaster: genetic divergence among unselected lines. Genetics 153:813-823.

PROMISLOW, D. E. L., M. TATAR, A. A. KHAZAELI, and J. W. CURTSINGER, 1996  Age-specific patterns of genetic variation in Drosophila melanogaster. I. Mortality. Genetics 143:839-848.

SCHWARZ, G., 1978  Estimating the dimension of a model. Ann. Stat. 6:461-464.

SEARLE, S. R., G. CASELLA and C. E. MCCULLOCH, 1992 Variance Components. Wiley and Sons, New York.

SHAW, F. H. and C. J. GEYER, 1997  Estimation and testing in constrained covariance component models. Biometrika 84:95-102.

SHAW, R. G., 1987  Maximum-likelihood approaches applied to quantitative genetics of natural populations. Evolution 41:812-826.

SHAW, R. G., 1991  The comparison of quantitative genetic parameters between populations. Evolution 45:143-151.

SHAW, R. G., and F. H. SHAW, 1992 QUERCUS: programs for quantitative genetic analysis using maximum likelihood.

SHAW, R. G., G. A. J. PLATENKAMP, F. H. SHAW, and R. H. PODOLSKY, 1995  Quantitative genetics of response to competitors in Nemophila menziesii: a field experiment. Genetics 139:397-406.

SIMONOFF, J. S., 1996 Smoothing Methods in Statistics. Springer-Verlag, New York.

STONE, M., 1979  Comments on model selection criteria of Akaike and Schwarz. J. R. Stat. Soc. Ser. B 41:276-278.

TATAR, M., D. E. L. PROMISLOW, A. A. KHAZAELI, and J. W. CURTSINGER, 1996  Age-specific patterns of genetic variation in Drosophila melanogaster. II. Fecundity and its genetic correlation with mortality. Genetics 143:849-858.



