Originally published as Genetics Published Articles Ahead of Print on October 18, 2007.
Genetics, Vol. 177, 2135-2149, December 2007, Copyright © 2007
doi:10.1534/genetics.107.075697
Analytical Description of Mutational Effects in Competing Asexual Populations
Daniel Pinkel1
Comprehensive Cancer Center and Department of Laboratory Medicine, University of California, San Francisco, California 94143
1 Address for correspondence: University of California, Box 0808, San Francisco, CA 94143.
E-mail: pinkel{at}cc.ucsf.edu
ABSTRACT
The adaptation of a population to a new environment is a result of selection operating on a suite of stochastically occurring mutations. This article presents an analytical approach to understanding the population dynamics during adaptation, specifically addressing a system in which periods of growth are separated by selection in bottlenecks. The analysis derives simple expressions for the average properties of the evolving population, including a quantitative description of progressive narrowing of the range of selection coefficients of the predominant mutant cells and of the proportion of mutant cells as a function of time. A complete statistical description of the bottlenecks is also presented, leading to a description of the stochastic behavior of the population in terms of effective mutation times. The effective mutation times are related to the actual mutation times by calculable probability distributions, similar to the selection coefficients being highly restricted in their probable values. This analytical approach is used to model recently published experimental data from a bacterial coculture experiment, and the results are compared to those of a numerical model published in conjunction with the data. Finally, experimental designs that may improve measurements of fitness distributions are suggested.
THE adaptation of asexual populations to a new environment is the result of selection operating on the suite of stochastically occurring mutations, each of which may confer a different selective advantage. The mutations occur throughout time, so that multiple clones of mutant cells are present simultaneously. Mathematical modeling of the population dynamics typically employs numerical simulation to calculate a particular instance of the system, sampling probability distributions to include stochastic effects. The calculations may involve detailed tracking of each mutant clone through the history of the population. Running the model many times allows determination of its characteristic behavior as a function of the parameters describing mutation and selection. Estimates of the values of these parameters in a living system are obtained by comparisons of the statistical properties of the model with those of experimental data.
By contrast, this article presents an analytical description of the population dynamics. The analysis begins by establishing the identity of the average behavior of sequential finite cultures separated by bottlenecks with the growth of an exponentially expanding effectively infinite culture. Consideration of the infinite system provides analytical expressions for characteristic properties of the finite populations. The results include quantitative descriptions of growth of the proportion of mutant cells with time and of the accompanying narrowing of the frequency distribution of their selection coefficients. Next, the stochastic behavior of finite systems is considered, resulting in a comprehensive and convenient description of the selection in the bottlenecks. The stochastic description is then used to develop a model of coculture experiments such as the one recently published by Hegreness and Shoresh (HS) (HEGRENESS et al. 2006). Application of the results obtained for the average behavior is used to simplify the model and facilitate its comparison with experimental data. Finally, the results are discussed and alternate experimental designs that may allow better measurement of the fitness distribution of the mutant cells are suggested. Significant additional information concerning the analysis is presented in the supplemental materials at http://www.genetics.org/supplemental/, including an Excel workbook for calculation of the statistical distributions that are developed in this article.
THE EXPERIMENTAL SYSTEM
$\rho(s)$ and an overall frequency µ per generation per cell. The cells grew exponentially for 24 hr, and the culture was then sampled to seed the next passage with $\sim 2 \times 10^5$ cells. After this bottleneck, exponential growth resumed, followed by sampling to seed the next daughter culture after another 24 hr, etc. The full experiment lasted $\sim$40 days or $\sim$450 doublings for the starting (ancestral) cells. Figure 1a illustrates the growth of the CFP and YFP populations just before and after the kth bottleneck for a time early in the series before the mutant populations have become significant. Each growth phase lasts a time $\tau = 11.7$ population doublings and results in an $e^{r\tau} \approx 3300$-fold increase in cell number, where r = ln(2). At the bottleneck, a sampling of $e^{-r\tau} \approx 1/3300$ of the mutant and ancestral cells proceeded to the subsequent daughter culture. The rest of the cells were discarded.
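The passage arithmetic quoted above can be checked in a few lines. This is a minimal sketch: the variable names are illustrative and not from the article, and the only inputs are the quoted 11.7 doublings per growth phase, the 40 daily passages, and r = ln(2).

```python
import math

r = math.log(2)      # growth constant when time is measured in doublings
tau = 11.7           # doublings per 24-hr growth phase (value quoted in the text)
n_passages = 40      # number of daily passages (value quoted in the text)

expansion = math.exp(r * tau)        # fold increase per growth phase, equal to 2**tau
dilution = 1.0 / expansion           # fraction of cells carried through a bottleneck
total_doublings = n_passages * tau   # ancestral doublings over the whole experiment

print(f"expansion per passage: {expansion:.0f}-fold")
print(f"fraction transferred:  {dilution:.2e}")
print(f"total doublings:       {total_doublings:.0f}")
```

The computed expansion agrees with the rounded 3300-fold figure to within about 1%, and 40 passages correspond to roughly 470 doublings, consistent with the approximate value given for the full experiment.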
YFP/CFP fluorescence ratios in each of 72 series of such cultures were measured at each bottleneck, providing a measure of the relative numbers of the two cell types as a function of time. Because of the stochastic nature of the mutational process and the passage of mutants through the bottlenecks, one expects that under some conditions the ratio may change with time, differing in each series of cultures. Figure 1b shows examples of four of the possible time courses for the ratio, using green and red to distinguish the populations. The ancestral cells in both populations are represented as light colors, and the mutants (of any s) as dark colors. This distinction is made for illustrative purposes only since ancestral and mutant cells cannot be distinguished by their fluorescence. While the colors are illustrated as separated, in the actual experiment the two populations are thoroughly mixed.
In the first example, mutant cells are assumed to come to prominence initially in the "green" population. This time is indicated by a green T1, defined in our analysis as the time when the proportions of mutant and ancestral cells in the green population are equal. The overall growth rate of the green population detectably increases, and green cells begin to overgrow the "red" ones. Mutant cells with approximately the same s as those in the green population are assumed to come to prominence in the red population at a later time, indicated by the red T1. After this time, the green and red populations tend toward the same growth rate, and thus their ratio stabilizes. Mutant cells completely dominate both populations with time, as indicated by the darkening colors. This growth pattern corresponds to curves 2 and 3 in Figure 6, a and b. Graphs above series 1 schematically show the growth of the mutant and ancestral populations during three cultures of this series.
In the second example, red mutants completely overtake the culture before green mutants of sufficiently large s come to prominence, although green mutations with low s will have occurred. This corresponds to curve 1 in Figure 6, a and b. In the third example, red mutants are assumed to come to prominence first and the fluorescence ratio begins to change in favor of red. At a later time, mutant cells with larger s come to prominence in the green population, and the ratio changes in favor of green. Still later, mutants that have a selection coefficient equivalent to the green population come to prominence in the red population, and the ratio stabilizes. The initial part of such behavior is shown by curve 5 in Figure 6b. Subsequent variations in the ratio may occur as mutants with ever larger s come to prominence in the two populations. But if
$\rho(s)$ has a sufficiently defined maximum s, then the ratio will finally stabilize when both populations are dominated by mutant cells with this maximal selection coefficient, unless one population completely displaces the other as shown in example 2. In the final example, roughly equivalent mutants arise in both populations at about the same times, so that the ratio remains constant as both populations become dominated by mutant cells with selection coefficients tending toward the maximum possible. This is the behavior that always occurs if the size of the cultures is sufficiently large so that many mutations occur.
THE ANALYTICAL FRAMEWORK
, but instead of seeding only one next passage culture as in the actual experiment, all of the material from each initial culture is used to seed $e^{r\tau}$ daughter cultures (3300 in the case of the specific experiment in question). The details of this exhaustive sampling are illustrated in the oval inset in Figure 2a for the kth bottleneck. Each of the (k + 1)st cultures receives $N + \Delta N$ ancestral and $m(s) + \Delta m(s)$ mutant cells, where the $\Delta$'s indicate stochastic fluctuations that affect the individual cultures. The actual experiment is equivalent to selecting 72 of the initial cultures, and for each of these selecting a sequence of daughters, thereby producing 72 series with 40 sequential cultures in each. Two such series are indicated in Figure 2a by the shaded boxes. Examination of Figure 2a shows that the time dependence of the characteristics of the mutant cells averaged over a large number of series of daughters would be identical to the behavior of the total mutant population if all of the daughters were pooled. This pooled culture is just a population in unbounded exponential growth, which can be analyzed with straightforward approaches. Since the structure of this conceptual experiment preserves cell lineages, it also provides the basis for the subsequent calculation of the stochastic properties of the system.
This conceptual experiment is, of course, impossible to implement. Using the parameters of the actual experiment, by the 13th of the 40 days, the volume of culture medium required for the progeny of a single initial culture would fill a ball with a radius about 22 times that of our solar system out to Pluto, and the radius would be increasing faster than the speed of light. Coincidentally, this is about the characteristic time,
$\sim$150 doublings, at which mutants became equal in abundance to the ancestral cells in the actual HS experiment (see below).
DESCRIPTION OF THE AVERAGE...
Basic relationships:
Consider one of the cell populations in the infinite pooled culture of Figure 2a. Its development is described by standard equations for exponential growth. Let $N(t)$ and $M(s, t)$ be the number of ancestral and mutant cells at time t. Then

$$\frac{dN(t)}{dt} = rN(t) - \mu N(t) \tag{1a}$$

$$\frac{\partial M(s,t)}{\partial t} = \mu \rho(s) N(t) + r(1+s) M(s,t), \tag{1b}$$

where $\rho(s)$ is the probability distribution for the selection coefficients s of the mutants, and r = ln(2). Equation 1a describes the increase of ancestral cells due to division and the decrease due to conversion to mutants. Since µ << 1, the conversion term is neglected in what follows. The coefficient r scales time so that it is measured in units of the doubling time for the ancestral cells. Equation 1b describes the increase in the number of mutant cells with selection coefficient s through de novo mutation and the division of existing mutants with a rate 1 + s times that of the ancestral cells. In the real world, singly mutant cells are susceptible to additional mutations. The possibility of multiple mutations raises complex modeling issues that are discussed in section 2 of the supplemental materials at http://www.genetics.org/supplemental/. Multiple mutants are neglected in what follows.
This continuous growth model (with bottlenecks introduced below) does not include one important component of the actual experiment. In the real experiment the cultures enter a stationary phase where growth stops prior to seeding the daughter cultures. As mutants become prominent in the population and the overall growth rate increases somewhat, this stationary phase will be reached earlier during each passage. By contrast, the model allows expansion to continue during this stationary period. The error introduced by this simplification is small for the HS experiment. As is shown below, the maximum selection coefficient for the mutants in the experiment is $\sim$0.1. Thus, after cultures become dominated by mutants, in the time equivalent to 11.7 doublings of the ancestral cells the model allows about one additional doubling per passage ($rs\tau \approx 1$). Therefore, during the late phases of the experiment the timescale in the model may be somewhat accelerated compared to that of the actual cultures. However, inclusion of this small effect in the analysis is not warranted given the noise in the experimental data with which it will be compared. The presence of the stationary phase raises additional interpretive issues, since as indicated in the DISCUSSION the selective advantage of mutants may be expressed by changes in their behavior as they cease and resume proliferation.
The solution of Equation 1a is $N(t) = N_0 e^{rt}$, where $N_0$ is the number of ancestral cells at t = 0. Inserting $N(t)$ into Equation 1b allows solving for $M(s, t)$. It is convenient for the subsequent discussion to calculate $R_m(s, t)$, the ratio of mutant cells with selection coefficient s to the total number of ancestral cells, because this describes how significant the mutant population has become:

$$R_m(s,t) = \frac{M(s,t)}{N(t)} = \mu \rho(s) W(s,t), \qquad W(s,t) = \frac{e^{rst} - 1}{rs}. \tag{2}$$

Thus $R_m(s, t)$ is the product of $\mu\rho(s)$ with the weight factor $W(s, t)$.
At small values of st, $W(s, t) \approx t$ and $R_m(s, t) \approx \mu t \rho(s)$, indicating the buildup of de novo mutations linearly with time and with a distribution in s given by $\rho(s)$. As s and/or t increase, W(s, t) increases dramatically due to the relative expansion of mutations that occurred early in the cultures. Figure 2b shows the shape of W(s, t) for 0 < s < 0.1 and times t = 1, 11.7 (the first bottleneck in the actual experiment), 100, and 200 population doublings. For plotting convenience these graphs have been normalized to the values of W(s, t) at s = 0.1. As time progresses, the weight of this function becomes concentrated at higher values of s. Thus for almost any shape of $\rho(s)$ that one might choose, the width of the range of selection coefficients that are prominent in the mutant population will become narrower as time progresses. For most shapes of $\rho(s)$ the "effective" selection coefficients will fall within a narrow range near the maximum s that is possible. An alternate derivation of Equation 2, which has the flexibility to address more complex systems, is given in section 1 of the supplemental materials at http://www.genetics.org/supplemental/.
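The concentration of weight at high s can be illustrated numerically. The sketch below assumes the weight factor has the form W(s, t) = (exp(rst) − 1)/(rs), which is what integrating the mutant growth equation gives when µ is negligible and which reduces to t at small st as stated above; the function names are illustrative and not from the article.

```python
import math

R = math.log(2)  # time is measured in ancestral doubling times

def weight(s: float, t: float) -> float:
    """Assumed weight factor W(s, t) = (exp(r*s*t) - 1) / (r*s); tends to t as s*t -> 0."""
    x = R * s * t
    if abs(x) < 1e-12:
        return t
    return t * math.expm1(x) / x

def normalized_profile(t: float, s_max: float = 0.1, n: int = 11):
    """W(s, t) / W(s_max, t) on an evenly spaced grid of s (the plotting normalization)."""
    top = weight(s_max, t)
    return [(s_max * i / (n - 1), weight(s_max * i / (n - 1), t) / top) for i in range(n)]

# Relative weight of mutants with s = 0.05 compared with s = 0.1 at the four plotted times.
for t in (1, 11.7, 100, 200):
    print(f"t = {t:>5}: W(0.05, t) / W(0.1, t) = {weight(0.05, t) / weight(0.1, t):.4f}")
```

Under this assumed form the relative weight at s = 0.05 falls from nearly 1 at t = 1 to well under 1% at t = 200, which is the narrowing described in the text.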
Given Equation 2, the ratio of the number of mutant cells with selection coefficients between 0 and some value of s to the total number of ancestral cells at time t, denoted by $P_m(s, t)$, can be calculated. Thus

$$P_m(s,t) = \int_0^s R_m(s',t)\,ds' = \mu \int_0^s \rho(s') W(s',t)\,ds'. \tag{3}$$

The limit of large s gives $P_m(\infty, t)$, the ratio of the total number of mutant cells to the ancestral cells at time t. The fraction of mutant cells with selection coefficients between any values $s_1$ and $s_2$ is then given by

$$Q(s_1,s_2) = \frac{P_m(s_2,t) - P_m(s_1,t)}{P_m(\infty,t)}. \tag{4}$$

These expressions apply for any choice of $\rho(s)$. The range of effective selection coefficients and the characteristic times $T_1$ and $T_{100}$, for which the number of mutant cells is respectively equal to, or 100 times greater than, the ancestral population are now compared for two specific choices of $\rho(s)$.
Comparison of uniform and delta-function distributions for $\rho(s)$:
Consider first a uniform distribution $\rho(s) = 1/s_{\max}$ for $0 \le s \le s_{\max}$ and 0 otherwise. The product $\rho(s)W(s, t)$ is $W(s, t)/s_{\max}$ for $0 \le s \le s_{\max}$ and so has the shape of W(s, t) up to $s_{\max}$, whereupon it drops to 0. Figure 2b shows the shape of this function, indicating that as time increases the predominant mutants have selection coefficients progressively closer to $s_{\max}$. Applying Equations 3 and 4 allows calculation of the range of s of the effective mutations:
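$$P_m(s,t) = \frac{\mu}{s_{\max}} \int_0^s \frac{e^{rs't} - 1}{rs'}\,ds' = \frac{\mu}{r s_{\max}}\, I(rst), \qquad I(x) \equiv \int_0^x \frac{e^u - 1}{u}\,du, \qquad 0 \le s \le s_{\max} \tag{5}$$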
![]() | (6) |
![]() |
$\sim$100 to $\sim$230 generations for $s \approx 0.1$. Section 4 of the supplemental materials at http://www.genetics.org/supplemental/ and Figure 3b present an approximation to Equation 5 and I(rst) for low values of rst.
The range of s containing 90% of the mutant cells is arbitrarily defined as the range of effective selection coefficients. Since Pm(s, t) is monotonically increasing, the effective range can be obtained by determining the selection coefficient, s10, for which 10% of mutants have lower s. Setting Q(0, s10) = 0.1 in Equation 4 yields s10. Since Pm(0, t) = 0, one finds
![]() | (7) |
![]() | (8) |
The highly peaked distribution in s of the mutant population at long times, even though $\rho(s)$ was assumed to be flat, suggests that some aspects of the behavior of cultures of these cells would be similar to a system in which one assumed a delta-function distribution of selection coefficients. Using $\rho(s) = \delta(s - s_\delta)$ and a mutation rate of $\mu_\delta$ in Equation 3, one finds, analogous to Equation 5,

$$P_m(\infty,t) = \mu_\delta\, \frac{e^{r s_\delta t} - 1}{r s_\delta}. \tag{9}$$
![]() |
This result for the delta function is identical in form to the (approximate) result for the uniform distribution of $\rho(s)$ found in Equation 6. Evaluating Equation 6 at $s = s_{\max}$ so that the entire mutant population is represented, Equations 6 and 9 are quantitatively identical if one sets $s_\delta = 0.9 s_{\max}$ and $\mu_\delta = 0.29\mu$. Thus after the initial period where they are clearly distinct, the total number of mutant cells evolves with approximately the same time course for both of these forms for $\rho(s)$. The delta-function approximation is appropriate for any $\rho(s)$ that results in a highly peaked shape for $\rho(s)W(s, t)$ at long times. The scaling of the overall mutation rates and the "equivalent" selection coefficients will depend on the details of the shape of $\rho(s)$ and may also depend on which aspects of the cultures are being modeled. Some critical aspects of real systems cannot be modeled by the delta-function approximation, as shown in Figure 6 and the related text.
Characteristic times:
There are at least two times in the history of the culture that are interesting to calculate. The most relevant is $T_1$, the time when the numbers of mutant and ancestral cells in the culture are equal. While this differs in each series of finite cultures as shown in Figure 1b, it has a well-defined value for the effectively infinite system of Figure 2a. This characteristic time, $T_1$, is the time when $P_m(\infty, T_1) = 1$, which can be determined from Equation 3 for any assumed $\rho(s)$. For the specific case of the uniform fitness distribution,
or
Similarly, for a delta-function distribution,
Using Equations 6 and 9 one finds
![]() | (10) |
The second relevant time is the time for "fixation" of the mutations, which following HS is defined as the time when mutants are 100 times the abundance of ancestral cells. At this time for the uniform distribution
or
while similarly for the delta-function distribution
Thus from Equations 6 and 9,
![]() | (11) |
The range of selection coefficients that are prominent in the population at these times for the uniform distribution can be calculated using Equation 8. Using $\mu = 10^{-5}$ and $s_{\max} \approx 0.12$, values used by HS for their simulation of a uniform $\rho(s)$ described in their Figure 1B, Equation 10 yields
generations. At that time 90% of the mutant cells are in the range
s/smax
0.23, or 0.089 < s < 0.12. Similarly, from Equation 11
generations, and at that time
s/smax = 0.16, or 0.10 < s < 0.12. Thus at these times 90% of the mutant cells have selection coefficients very near the maximum possible.
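The characteristic times and effective ranges for the uniform distribution can also be obtained by direct numerical solution instead of closed-form approximations. The sketch below is not the article's calculation: it assumes that the mutant-to-ancestral ratio for a uniform distribution is P_m(s, t) = µ I(rst)/(r s_max) with I(x) the integral of (e^u − 1)/u from 0 to x (the form that follows from the assumed weight factor (exp(rst) − 1)/(rs)), and it solves P_m(s_max, T) = 1 or 100 and Q(0, s10) = 0.1 by bisection. All function names are illustrative.

```python
import math

R = math.log(2)  # time is measured in ancestral doubling times

def I(x, terms=300):
    """Integral of (e^u - 1)/u from 0 to x, summed as the series x^n / (n * n!)."""
    total, term = 0.0, 1.0
    for n in range(1, terms + 1):
        term *= x / n        # term is now x**n / n!
        total += term / n
    return total

def mutant_ratio(s, t, mu, s_max):
    """Assumed P_m(s, t) for a uniform distribution of s on [0, s_max]."""
    return mu / (R * s_max) * I(R * min(s, s_max) * t)

def solve_increasing(f, lo, hi, tol=1e-10):
    """Bisection root of an increasing function f with f(lo) < 0 < f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def characteristic_time(target, mu, s_max):
    """Time at which mutants are `target` times as abundant as ancestral cells."""
    return solve_increasing(lambda t: mutant_ratio(s_max, t, mu, s_max) - target, 1.0, 400.0)

def effective_width(t, mu, s_max):
    """(s_max - s10) / s_max, where 10% of mutants have s below s10 at time t."""
    total = mutant_ratio(s_max, t, mu, s_max)
    s10 = solve_increasing(lambda s: mutant_ratio(s, t, mu, s_max) - 0.1 * total, 0.0, s_max)
    return (s_max - s10) / s_max

MU, S_MAX = 1e-5, 0.12   # parameter values quoted in the text
T1 = characteristic_time(1.0, MU, S_MAX)
T100 = characteristic_time(100.0, MU, S_MAX)
print(f"T1   = {T1:.1f} doublings, width = {effective_width(T1, MU, S_MAX):.3f}")
print(f"T100 = {T100:.1f} doublings, width = {effective_width(T100, MU, S_MAX):.3f}")
```

The printed fractional widths can be compared with the ranges quoted above; because the sketch solves the assumed integral exactly rather than through an approximation, small differences from the article's figures are to be expected.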
STOCHASTIC EFFECTS
new mutants arise, the corresponding selection coefficients $s_i$ of the mutants, and the effects of the selection bottlenecks. This section demonstrates that the stochastic effect of the multiple bottlenecks is equivalent to altering the actual mutation times $t_m$ to effective times $\tilde{t}_m$ and derives probability distributions for $\tilde{t}_m$ given $t_m$, s, and the length of the growth periods, $\tau$. This formalism, coupled with statistical sampling of the mutation times on the basis of the mutation rate and sampling of the selection coefficients on the basis of $\rho(s)$, allows a complete description of the stochastic behavior of the system. The analysis proceeds by first calculating the statistics of the behavior of a single-mutant clone with a defined mutation time and selection coefficient and then combining multiple mutants to describe the system completely.
Stochastic behavior of a single-mutant clone:
Figure 4 shows the details of the behavior of the progeny of a mutant cell expanding through a series of daughter cultures such as described in the conceptual experiment of Figure 2a. No cells are discarded. Assume that the mutation occurred at time $0 \le t_m \le \tau$, where $\tau$ is the length of each of the culture periods. At the kth bottleneck, $t = k\tau$, and the average number of mutants, $m(k\tau)$, transferred to each daughter is the total number of progeny from this mutation divided by the total number of daughter cultures:

$$m(k\tau) = \frac{e^{r(1+s)(k\tau - t_m)}}{e^{rk\tau}} = e^{rsk\tau - r(1+s)t_m}. \tag{12}$$
Due to statistical fluctuations in the bottlenecks, the actual number of cells each daughter receives is distributed around this average. As illustrated in Figure 4, after the second and subsequent bottlenecks, daughters with the same number of mutants can descend from different predecessors, so that the probability, $P_k(n)$, of a daughter receiving n mutant cells after the kth bottleneck requires summing over these multiple possibilities. Suppose that q mutants were transferred into a daughter after the (k – 1)st bottleneck. At the next bottleneck, the kth, these have expanded so that on average $q e^{rs\tau}$ mutant cells will be distributed to each of its $e^{r\tau}$ daughters. The statistical distribution in the number n received by these daughters is given by a probability distribution $p(n : q e^{rs\tau})$. But the probability of having a culture with q initial cells is given by $P_{k-1}(q)$. Thus,

$$P_k(n) = \sum_{q=0}^{\infty} P_{k-1}(q)\, p(n : q\beta), \qquad p(n : \lambda) = \frac{\lambda^n e^{-\lambda}}{n!}, \tag{13}$$

where $\beta = e^{rs\tau}$, $k \ge 2$, and the Poisson probability distribution has been used because it is appropriate, at least initially, when the number of mutant cells is substantially smaller than the total number of cells. As shown below, this condition is met for the entire time period during which stochastic variation is important. As can be seen from examination of Figure 4, $P_1(n) = p(n : m(\tau))$, where from Equation 12, $m(\tau) = e^{rs\tau - r(1+s)t_m} = \beta e^{-r(1+s)t_m}$. Evaluation of Equation 13 proceeds by using $P_1(n)$ to calculate $P_2(n)$, etc. The supplemental materials at http://www.genetics.org/supplemental/ contain an Excel workbook that performs this calculation, as well as the calculations of the related statistical distributions discussed below. Note that $P_k(n)$ depends on $\tau$, s, and $t_m$ through m and $\beta$.
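The iteration described above, a Poisson distribution after the first bottleneck followed by repeated Poisson compounding, can be sketched as follows. This is an illustrative stand-in for the Excel workbook mentioned in the text, not a reproduction of it: the function names, the truncation at `n_max` mutant cells, and the symbol `beta` for exp(r s tau) are assumptions made for the example.

```python
import math

def poisson_pmf(n: int, lam: float) -> float:
    """Poisson probability of n events with mean lam (lam = 0 puts all mass on n = 0)."""
    if lam == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def bottleneck_distributions(s, t_m, tau=11.7, k_max=4, n_max=150):
    """Return [P_1, ..., P_kmax], each a list of probabilities for n = 0..n_max.

    P_1 is Poisson with mean m(tau); P_k(n) = sum_q P_{k-1}(q) * Poisson(n; q * beta).
    Probability mass above n_max is dropped, so n_max must exceed the bulk of P_kmax.
    """
    r = math.log(2)
    beta = math.exp(r * s * tau)                 # mean expansion of mutants per passage
    m1 = beta * math.exp(-r * (1 + s) * t_m)     # mean number transferred at bottleneck 1
    dists = [[poisson_pmf(n, m1) for n in range(n_max + 1)]]
    for _ in range(2, k_max + 1):
        prev = dists[-1]
        dists.append([
            sum(prev[q] * poisson_pmf(n, q * beta) for q in range(n_max + 1))
            for n in range(n_max + 1)
        ])
    return dists

dists = bottleneck_distributions(s=0.09, t_m=1.0)
for k, p in enumerate(dists, start=1):
    mean = sum(n * pn for n, pn in enumerate(p))
    print(f"k = {k}: P_k(0) = {p[0]:.3f}, mean n = {mean:.2f}")
```

The probability of carrying no mutant cells, P_k(0), grows with k while the mean grows by the factor beta per passage, so the distribution separates into a lost fraction and a broadening surviving fraction.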
The basic behavior of Equation 13 is easily understood. Beginning with a Poisson distribution after the first bottleneck, it progressively broadens after subsequent bottlenecks. Figure 5a shows plots of $P_k(n)$ for k = 1–7 when $t_m = 1$ and s = 0.09. Note that $P_k(n)$ varies smoothly and progressively more slowly as k increases, except between n = 0 and 1 where there is a pronounced discontinuity that increases with increasing k. Initially the broadening of $P_k(n)$ is due to the combination of stochastic events in the bottlenecks and the expansion of the mutant clone. As the number of mutant cells increases with time, stochastic variation in the number transferred from a particular culture to its immediate daughters eventually becomes insignificant compared to the mean number transferred since $\sigma_n / n \sim n^{-1/2}$. Thus it is expected that the important stochastic variation induced by the bottlenecks occurs early in the experiment. This expectation can be quantitatively described in the following manner.
Let $\varepsilon_k(n)$ be the ratio of the actual number of mutant cells n in a particular culture to the average number (Equation 12) after the kth bottleneck. Then

$$\varepsilon_k(n) = \frac{n}{m(k\tau)}. \tag{14}$$
The values of $\varepsilon_k(n)$ for daughters receiving different numbers of mutants after the second bottleneck are illustrated in Figure 4. The average value of $\varepsilon_k(n)$ is 1. Although n is rigorously an integer so that only specific values of $\varepsilon_k(n)$ can occur, the slow variation of $P_k(n)$ with n (Figure 5a) allows an accurate description of the system to be obtained by treating n as a continuous variable and defining $\phi_k(n)$ as a continuous probability density that has the values $P_k(n)$ for integer values of n. For noninteger n, $\phi_k(n)$ can be obtained by interpolation. This approximation is valid for $n \ge 1$, but not for n = 0 due to the substantial discontinuity in $P_k(n)$ between n = 0 and 1. Thus in the calculations that follow, probabilities corresponding to n = 0 will be given by $P_k(0)$, while those corresponding to other values of n will be based on calculations that treat parameters as continuous. Using this approximation, which is increasingly accurate as k increases, the probability distribution, $\Phi_k(\varepsilon_k)$, for $\varepsilon_k$ after each bottleneck can be calculated using Equations 12–14,

$$\Phi_k(\varepsilon_k) = \phi_k(n) \left| \frac{dn}{d\varepsilon_k} \right| = m(k\tau)\, \phi_k\bigl(\varepsilon_k\, m(k\tau)\bigr), \tag{15}$$

where $m(k\tau)$ is given by Equation 12. Equation 15 holds for $\varepsilon_k > 0$. The probability of having $\varepsilon_k = 0$, which corresponds to n = 0, is $P_k(0)$.
Figure 5b shows the behavior of $\Phi_k(\varepsilon_k)$ and $P_k(0)$ for k = 1–7, s = 0.09, $\tau = 11.7$, and $t_m = 1$. (The distributions for other parameter values can be calculated using the Excel sheet in the supplemental materials at http://www.genetics.org/supplemental/.) Note that the shape changes substantially for the first few bottlenecks, but for $k \ge 4$ it becomes constant. This is the quantitative description of the previous statement that once the number of progeny of the mutated cell becomes large, no significant additional variation is introduced by the subsequent bottlenecks. Thus a single distribution, $\Phi(\varepsilon)$, calculable from first principles, can be used to describe the behavior of the mutants after sufficient time. The distribution for each mutant clone depends on the actual time the mutation occurred, the selection coefficient, and the length of the growth periods. In this example calculation, the stable distribution is reached by the analysis of cultures containing up to 100 mutant cells, 1000-fold less than the number of ancestral cells present in the actual experiment. Thus statistical stability is reached well before the time $T_1$. For comparison, Figure 5b also shows a Poisson distribution scaled so that its maximum is located at the same position as the maximum of the limiting mutant distribution. The enhanced broadening due to the series of bottlenecks is evident.
The fact that a stable, readily calculable probability distribution can be used to describe the progeny of each mutation after several bottlenecks allows a particularly simple general description of the stochastic behavior of the entire system. Consider the ratio, $\bar{R}(t)$, of the average number of mutant cells to the number of ancestral cells per culture:

$$\bar{R}(t) = \frac{e^{r(1+s)(t - t_m)}}{N_0 e^{rt}} = \frac{e^{rs(t - t_m)}}{N(t_m)}. \tag{16}$$

The actual ratio in the daughter cultures, R(t), will differ from this ratio by the factor $\varepsilon$ that has developed due to stochastic variation in the earlier bottlenecks. Therefore, for times long enough after the mutation for statistical stability to be established one has

$$R(t) = \varepsilon\, \bar{R}(t) = \varepsilon\, \frac{e^{rs(t - t_m)}}{N(t_m)} \tag{17}$$

$$R(t) = \frac{e^{rs(t - \tilde{t}_m)}}{N(\tilde{t}_m)}, \tag{18}$$

where $\tilde{t}_m = t_m + \Delta t_m$ is the effective time the mutation occurred, adjusted from the actual $t_m$ by $\Delta t_m = -\ln(\varepsilon)/[r(1 + s)]$ due to the stochastic effects in the bottlenecks, and $N(\tilde{t}_m) = N_0 e^{r\tilde{t}_m}$ is the size of the ancestral population at the effective mutation time. Negative values of $\Delta t_m$ indicate a stochastic fluctuation that increases the abundance of a mutant clone relative to its average value; i.e., it appears that the mutation occurred earlier than it actually did, while positive values indicate a stochastic decrease relative to the average since the effective occurrence time was later than the actual time. If a mutant is lost from a series of cultures, $\Delta t_m = \infty$. Note that Equation 18 assumes that $0 \le t_m \le \tau$.
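The conversion between a stochastic abundance factor and an effective mutation time can be made concrete with a short sketch. It uses the shift −ln(factor)/[r(1 + s)] given in the text; the function names and the symbol `eps` for the ratio of actual to average mutant number are labels chosen for this example. The helper `mean_transferred` assumes the average number per daughter after the kth bottleneck is exp(r s k tau − r(1 + s) t_m).

```python
import math

R = math.log(2)  # time is measured in ancestral doubling times

def effective_time_shift(eps: float, s: float) -> float:
    """Shift of the effective mutation time for abundance factor eps (actual / average).

    eps > 1 gives a negative shift (mutation appears earlier); eps = 0 means the clone
    was lost, which corresponds to an infinite effective mutation time.
    """
    if eps < 0:
        raise ValueError("eps must be non-negative")
    if eps == 0:
        return math.inf
    return -math.log(eps) / (R * (1 + s))

def mean_transferred(k: int, s: float, t_m: float, tau: float = 11.7) -> float:
    """Assumed average number of mutants per daughter after the kth bottleneck."""
    return math.exp(R * s * k * tau - R * (1 + s) * t_m)

s, t_m, k = 0.1, 1.0, 5
for eps in (0.5, 1.0, 2.0):
    shift = effective_time_shift(eps, s)
    # A clone that is eps times the average behaves like an average clone born at t_m + shift.
    equivalent = mean_transferred(k, s, t_m + shift)
    print(f"eps = {eps}: shift = {shift:+.3f} doublings, "
          f"eps * mean = {eps * mean_transferred(k, s, t_m):.3f}, shifted mean = {equivalent:.3f}")
```

A twofold excess or deficit moves the effective mutation time by slightly less than one doubling time, because mutants with s > 0 double a little faster than the ancestral cells.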
Calculation of the probability distribution, $\Psi(\tilde{t}_m)$, for $\tilde{t}_m$ allows a statistical description of the behavior of the mutant cells. Using the continuous variable approximation discussed prior to Equation 15,

$$\Psi_k(\tilde{t}_m) = \Phi_k(\varepsilon) \left| \frac{d\varepsilon}{d\tilde{t}_m} \right| = r(1+s)\, \varepsilon\, \Phi_k(\varepsilon) \tag{19a}$$

$$\Psi_k(\tilde{t}_m) = r(1+s)\, n\, \phi_k(n), \tag{19b}$$

where $\varepsilon = e^{-r(1+s)(\tilde{t}_m - t_m)}$, $n = \varepsilon\, m(k\tau)$, and Equation 15 was used to relate $\Phi_k(\varepsilon)$ to $\phi_k(n)$. Equation 19b demonstrates that $\Psi_k(\tilde{t}_m)$ becomes independent of k after several bottlenecks since $\Phi_k(\varepsilon)$ becomes independent of k. As a practical matter, $\Psi_k(\tilde{t}_m)$ is calculated by choosing n and $m(k\tau)$ for any bottleneck after stabilization of the distributions, with $m(k\tau)$ given by Equation 12. Equation 19 is valid for finite $\tilde{t}_m$. For $\tilde{t}_m = \infty$, which corresponds to having n = 0 progeny of the mutant, $\Psi^*(\infty) = P_k(0)$. The stabilization of the distribution for $\tilde{t}_m$ after several bottlenecks means that the $\tilde{t}_m$ of a mutation in a particular daughter culture remains the same for all of its descendant cultures. Thus the effective mutation times provide a suitable basis for describing the stochastic character of the long-term evolution of the system.
Figure 5c shows the behavior of $\Psi(\tilde{t}_m)$ and $\Psi^*(\infty)$ for $t_m$ = 0, 1, 2, and 5, with s = 0.1 and $\tau = 11.7$. Given the assumed parameters, if a mutation actually occurs at $t_m = 0$, then on average $\sim$2.25 mutant cells from this clone will be transferred to each daughter at the first bottleneck, so that most will receive at least one cell. For larger $t_m$, the proportion of daughter lineages that receive no progeny of the mutant (i.e., have $\tilde{t}_m = \infty$) increases, correspondingly reducing the magnitude of $\Psi(\tilde{t}_m)$ for finite $\tilde{t}_m$. Therefore, to conveniently visualize the behavior of $\Psi(\tilde{t}_m)$ for different $t_m$ in one figure, $\Psi(\tilde{t}_m)/(1 - \Psi^*(\infty))$ has been plotted. Note that the distributions of effective mutation times for those daughter cultures that have at least one cell with this mutation include $\tilde{t}_m = 0$ and have widths of several doubling times. Thus if the progeny of a mutation are not lost in a series of cultures, they behave as if their founding mutation occurred near the beginning of the culture period in which they originated. As $t_m$ increases the distributions move somewhat toward higher values of $\tilde{t}_m$, but eventually stabilize except for a decrease in overall magnitude that has been normalized in this plot. The stabilization comes about since as $t_m$ increases eventually all daughter cultures receive either 0 or 1 cell at the first bottleneck, and those that receive a cell subsequently develop with similar statistical behavior.
Description of multiple mutations:
Equations 18 and 19 allow a complete description of the multiple mutations that occur during growth in the cultures. The ratio of the total number of mutant to ancestral cells in daughter culture,
is obtained by summing over all mutations. Using Equation 18,
![]() | (20a) |
![]() | (20b) |
and with selection coefficient ski and effective mutation time of
, and the times in the denominator of Equation 20a are adjusted to account for the fact that after each bottleneck the culture returns to having N0 ancestral cells. To simplify interpretation of Equation 20a and calculation of the effective mutation times, let
and correspondingly
, where
is the actual mutation time and
is the effective mutation time measured relative to the time of the kth bottleneck. Thus the distributions for the
can be calculated using Equation 19, employing the adjusted mutation times
The evaluation of Equation 20 formally requires three sampling steps: obtaining the mutation times by random sampling from the product of the mutation rate distribution and the instantaneous population of ancestral cells, assigning a selection coefficient to each mutation by sampling from the distribution of selection coefficients, and finally assigning the corresponding effective mutation times by sampling from the appropriate distribution calculated from Equation 19. However, many aspects of its general behavior can be understood much more simply, as discussed in section 3 of the supplemental materials at http://www.genetics.org/supplemental/. Note that the mutant-to-ancestral ratio of Equation 20 is the finite-system analog of the previously derived Pm from Equation 3; Pm is the average value of Equation 20 over multiple finite cultures.
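The first two sampling steps described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the article's procedure: it assumes that ancestral cells grow as n0 * exp(t) within one culture period, that mutations arise as a Poisson process with intensity mu * n0 * exp(t), and that selection coefficients are uniform on [0, s_max]. The function name and all parameter names are hypothetical. The third step, drawing effective mutation times from the distribution of Equation 19, is omitted because that distribution is not reproduced here.

```python
import math
import random


def sample_mutations(mu, n0, period, s_max, rng):
    """Sample (mutation time, selection coefficient) pairs for one period.

    Assumptions (illustrative, not taken from the article): ancestral
    cells grow as n0 * exp(t), mutations arise as a Poisson process with
    intensity mu * n0 * exp(t), and selection coefficients are uniform
    on [0, s_max].
    """
    # Expected number of mutations accumulated by the end of the period.
    total = mu * n0 * (math.exp(period) - 1.0)
    mutations = []
    # Unit-rate exponential gaps in cumulative-intensity units.
    cumulative = rng.expovariate(1.0)
    while cumulative < total:
        # Invert the cumulative intensity to recover the mutation time.
        t = math.log(1.0 + cumulative / (mu * n0))
        s = rng.uniform(0.0, s_max)
        mutations.append((t, s))
        cumulative += rng.expovariate(1.0)
    return mutations
```

Calling `sample_mutations(1e-5, 1e6, math.log(11.0), 0.11, random.Random(0))` would yield on the order of 100 mutations, since the expected count is mu * n0 * (exp(period) - 1) = 100 for these hypothetical values. Most arise late in the period, when the ancestral population is largest.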
ANALYSIS OF THE YFP/CFP...
Calculation of the fluorescence ratio:
Equation 20 allows a full description of the stochastic behavior of the YFP/CFP cocultures. In the coculture the differentially labeled populations grow independently and are independently subject to the statistics of mutation formation and bottlenecks. Thus, if NY and NC are the total numbers of YFP and CFP cells, respectively, and NYA and NCA are the corresponding numbers of ancestral cells, then
![]() | (21) |
and/or
1. Around this time, whose characteristic value is
variations in Log10(NY/NC) may develop. The behavior depends on the experimental design and mutational properties of the YFP and CFP populations, which are incorporated into evaluation of Equation 20.
While Equation 20 appears complex, it can be substantially simplified in a manner that preserves its quantitative accuracy and facilitates understanding of the essential factors that affect the behavior of the cultures. The simplification results from recognizing that only a small subset of the terms is significant and that key parameters are restricted in their values. First, most of the terms in Equation 20 are equal to 0 because the effective mutation time for most mutants is infinite (i.e., most are lost due to "drift" in the bottlenecks), as indicated in Figure 5c by the approach of the probability of an infinite effective mutation time to 1 as the mutation time increases during a culture period. Additionally, the nonzero terms have effective mutation times clustered near 0, as also indicated by Figure 5c. Moreover, Equation 2 shows that as time progresses the cells that constitute an appreciable proportion of the mutant population will have selection coefficients in the range where the product of the selection-coefficient distribution and W(s, t) has significant magnitude, which is typically very narrow. Finally, a mutant clone originating at a given effective time contributes substantially to Equation 20 only if its selection coefficient is larger than any of the selection coefficients of mutations with smaller effective times. Therefore the most significant contributors to the mutant population come from the ordered subset of mutations, each specified by its effective mutation time and its selection coefficient ski, in which both quantities monotonically increase; the ski are restricted to a narrow range just below the maximum available selection coefficient, while the effective mutation times are near 0.
Thus as time passes clonal succession occurs, with the population being dominated by mutants with selection coefficients tending toward the largest available. Section 4 of the supplemental materials at http://www.genetics.org/supplemental/ calculates the average number of newly arising mutant cells transferred from a culture to its daughters, elucidating the buildup of the total mutant population.
Consider the YFP/CFP coculture in the time period around the characteristic value of T1, the time when the mutant-to-ancestral ratios of the YFP and CFP populations are both approximately 1 on average. The behavior of the population ratio depends on the stochastically generated differences between these two mutant-to-ancestral ratios, which are related to how densely the ranges of effective mutation times and selection coefficients are sampled in the two mutant populations. Imagine that N0 for both the YFP and the CFP populations is very large, so that at this characteristic time, which is independent of N0, many terms are required to make the mutant-to-ancestral ratio approximately 1. Note that the ratio contains the factor 1/N0 (Equation 20), so that increasingly more terms are required as N0 increases. Biologically, the increase in the number of terms comes from the proportional increase in the number of mutations that occur in the larger population. But if there are many significant terms, then the intervals of both the effective mutation times and the selection coefficients between sequential terms in the ordered subset of mutations must be small, because the ranges of their effective values are limited. Although the specific values in this subset will differ for the YFP and CFP populations due to the stochastic effects, the behavior of the mutant-to-ancestral ratios is insensitive to these differences and for both populations approaches that of Pm given by Equation 3. Thus the YFP and CFP populations evolve indistinguishably in terms of cell numbers, and the population ratio remains constant. All cocultures will appear to behave the same, and no indication of the internal stochastic differences will be measurable.
As N0 is decreased, fewer terms are required in Equation 20 to bring the mutant-to-ancestral ratio to approximately 1 at times near the characteristic T1, so the intervals between sequential terms in the ordered subset of mutations increase. The stochastic differences between the exact terms included in this subset for the two populations lead to increasing possibilities for differential behavior between the YFP and CFP mutant-to-ancestral ratios, so that greater variation in the population ratio will occur among a group of "identical" cocultures. The easily measurable effects will include increased variability in the T1's, increased magnitude of the ratio variations that develop, and a larger proportion of cultures in which the ratio variations are so large that one population effectively displaces the other. If the initial YFP and CFP populations differ substantially in size, for example, one is "small" and the other is "large," then their stochastic behaviors may be quite different in character. Section 5 of the supplemental materials at http://www.genetics.org/supplemental/ presents a rough method of estimating the number of significant terms in Equation 20 or, conversely, estimating the population size above which ratio variations are not expected.
The effect of population size on the stochastic variations in the population ratio can be appreciated by examination of the experimental data of HS (reproduced in section 6 of the supplemental materials at http://www.genetics.org/supplemental/). If their initial population size were increased by a factor of 10, the expected ratio behavior could be estimated by averaging randomly selected sets of 10 of their experimental curves, after first properly transforming the data. Clearly the ratio deviations would begin at approximately the same time, indicating the constancy of the characteristic T1, but would be of substantially reduced magnitude.
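One possible reading of the transformation mentioned above is to convert each Log10(NY/NC) value to a YFP cell fraction, average the fractions, and convert back. The sketch below follows that reading and is an assumption, not the article's stated method: it treats the 10 cultures as if they were pooled and assumes every culture contains the same total number of cells at the time of measurement. The function name is hypothetical.

```python
import math


def pooled_log_ratio(log_ratios):
    """Combine Log10(NY/NC) values as if the cultures were pooled.

    Assumes (illustratively) that every culture holds the same total
    number of cells, so YFP fractions can be summed directly.
    """
    # Convert each log ratio to the YFP fraction of that culture.
    yfp_fractions = [10.0 ** r / (1.0 + 10.0 ** r) for r in log_ratios]
    yfp_total = sum(yfp_fractions)
    # The CFP fractions are the complements of the YFP fractions.
    cfp_total = len(log_ratios) - yfp_total
    return math.log10(yfp_total / cfp_total)
```

Under this assumption, a single culture at Log10(NY/NC) = 2 pooled with nine cultures at 0 gives a combined value below 0.1. This is smaller than both the individual deviation and the arithmetic mean of the log ratios, consistent with the reduced magnitude described in the text.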
Equation 22 shows Equation 21 with the first few nonzero terms explicitly displayed,
![]() | (22) |
(s) is a delta function, and only one mutant clone is contributing significant progeny in each population. Since all the selection coefficients are equal, if ratio fluctuations occur due to differences in the effective mutation times, the ratio will stabilize at some constant value after mutant cells dominate both populations. Mathematically the function always reaches a stable value, curve 2. However, this value may be so extreme that it represents the extinction of one of the populations, as illustrated by curve 1. The parameters for the curves are shown at the bottom of Figure 6. The appropriateness of Equation 22 for describing real experiments can be assessed by comparison of its behavior to the experimental data of HS. The overall shapes of the curves in the time period where they begin to depart from Log10(NY/NC) = 0 and the rough magnitudes of the ranges of variation of the curves for the model and the experimental data are qualitatively similar. Thus a model containing single significant mutant clones in the two populations, and employing effective mutation times consistent with the range defined by Figure 5c, reasonably describes these aspects of the data. However, Figure 6a does not properly describe the long-term behavior of the ratio.
The experimental data clearly show that as mutant cells become dominant in both populations, the slope of Log10(NY/NC) becomes small, but it is not typically 0. This is a clear indication of differences in the selection coefficients of the YFP and CFP mutants that are most prevalent in that time period. This behavior is modeled very well by employing differences in both the selection coefficients and the effective mutation times in Equation 22, as shown in Figure 6b. All but one of the curves in Figure 6b use one exponential term (one mutant clone) to describe the mutants in each population. The slopes of the curves after the mutants have overgrown the ancestral cells are directly related to selection coefficients of the dominant mutant clones. Taking the derivative of Equation 22 and assuming only one exponential term in the numerator and the denominator, one finds
![]() | (23) |
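The single-clone behavior discussed around Equations 22 and 23 can be explored numerically with a small sketch. It assumes equal numbers of ancestral cells in the two populations and a mutant-to-ancestral ratio of the form amp * exp(s * t) in each; this form, the function names, and the parameters amp_y and amp_c (which stand in for the combined effect of N0 and the effective mutation time) are illustrative assumptions rather than the article's exact expressions.

```python
import math


def log_ratio(t, amp_y, s_y, amp_c, s_c):
    """Log10(NY/NC) with one mutant clone per population.

    Assumes equal ancestral numbers and a mutant-to-ancestral ratio of
    amp * exp(s * t) in each population (an illustrative form).
    """
    r_y = amp_y * math.exp(s_y * t)
    r_c = amp_c * math.exp(s_c * t)
    return math.log10((1.0 + r_y) / (1.0 + r_c))


def late_slope(amp_y, s_y, amp_c, s_c, t=400.0, dt=1.0):
    """Finite-difference slope of Log10(NY/NC) once mutants dominate."""
    later = log_ratio(t + dt, amp_y, s_y, amp_c, s_c)
    return (later - log_ratio(t, amp_y, s_y, amp_c, s_c)) / dt
```

With equal selection coefficients the curve levels off at Log10(amp_y/amp_c), so differences in effective mutation times alone produce a constant offset. With unequal coefficients the late slope in this sketch approaches (s_y - s_c)/ln 10 per unit of the time in which s is expressed; converting to other time units, such as generations, would rescale this value.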
In summary, if the population sizes are sufficiently large, multiple mutations will contribute significant proportions of mutant cells at times on the order of the characteristic T1. This corresponds to having many significant terms in Equation 20, and ratio variations will be of low amplitude. If the population sizes are sufficiently small, then Equation 22 with only one exponential term in the numerator and the denominator describes the experimental system, and one expects substantial ratio variations. Moreover, the smaller the population sizes are, the larger the proportion of culture series that will be entirely overtaken by either the YFP or the CFP cells, e.g., curve 1 in Figure 6, a and b.
Measurement of bacterial characteristics from the experimental data:
As indicated in the discussion following Equation 22, the experimental data from HS are qualitatively described by a very simple model containing only one prominent mutant clone in the YFP and CFP populations. Quantitative determination of biological parameters of the bacteria based on the model is straightforward in this case. Once these parameters are obtained, they can be checked to determine if they are quantitatively consistent with the single-mutant clone description.
Estimation of the characteristic value of T1:
The characteristic value of T1 can be determined by examining the timing of the initial departures of Log10(NY/NC) from 0. T1 is the time at which the mutant-to-ancestral ratio reaches 1 for a particular population. If this happens in the YFP population significantly prior to the CFP population, then T1 is just the time when Log10(NY/NC) = 0.3 (or –0.3 if the reverse occurs). More generally, both the YFP and CFP mutant-to-ancestral ratios have some significant value, so that |Log10(NY/NC)| will be <0.3 at t near T1. Thus a reasonable estimate for the characteristic T1 is the time when many of the experimentally measured traces of Log10(NY/NC) from a collection of culture series have significant departures from 0, but do not fully reach ±0.3. For the experimental data of HS, this is approximately 150 generations. A more complex statistical fitting of Equation 21 to the data would produce a better estimate. The value of such a procedure depends on the noise level of the data.
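The origin of the 0.3 threshold can be made explicit with a short worked equation. The symbols R_Y and R_C are introduced here only for illustration and denote the mutant-to-ancestral ratios of the YFP and CFP populations; the sketch assumes equal numbers of ancestral cells in the two populations.

```latex
\log_{10}\frac{N_Y}{N_C}
  = \log_{10}\frac{N_{YA}\,(1 + R_Y)}{N_{CA}\,(1 + R_C)}
  = \log_{10}\frac{1 + R_Y}{1 + R_C}
  \quad (N_{YA} = N_{CA}),
\qquad
R_Y = 1,\ R_C = 0 \;\Rightarrow\; \log_{10} 2 \approx 0.301,
\qquad
R_Y = 1,\ R_C = 0.5 \;\Rightarrow\; \log_{10}\tfrac{4}{3} \approx 0.125 .
```

The second case illustrates why traces typically fall short of ±0.3 at T1 when mutants have already accumulated in both populations.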
Estimation of the maximum effective selection coefficient for the mutations:
The maximum value of s can be estimated by examining the population-ratio curves for all of the experimental cultures to find the maximum slope of Log10(NY/NC) after its initial departure from 0. This typically occurs in a culture series where one population completely and rapidly overtakes the other, presumably because a mutation with a near-maximum selection coefficient had a very early effective occurrence time (or perhaps was preexisting in the population). This approach leads to an estimate of smax ≈ 0.11, using Equation 23 with one selection coefficient set equal to 0. To assist with the analysis, lines with various slopes have been added to the data figures of HS that are reproduced in the supplemental materials at http://www.genetics.org/supplemental/. The analysis assuming a single mutant clone is self-consistent. Using the procedure in section 5 of the supplemental materials at http://www.genetics.org/supplemental/, Equation S8 estimates that there are typically 1.1 significant mutant clones in both populations. Of course, some cultures will by chance have more than one significant mutant clone in one or both populations, but as mentioned before, these cultures will typically have relatively low population-ratio deviations, and initial slopes of the curves will be relatively small compared to the others.
Estimation of the width of the effective selection coefficients:
Examination of the slopes of Log10(NY/NC) for the experimental data at times after mutants dominate both populations finds approximately 32 culture series with 0 < |sY – sC| < 0.02 and only 5 with 0.02 < |sY – sC| < 0.04. (Given the large number of curves contained in the HS figures, and the measurement noise, it is difficult to be precise in these estimates.) Therefore most of the mutants must have nearly identical selection coefficients, and these must be very close to smax given the large slope of the curves during their initial departure from 0. Thus one would estimate that most of the selection coefficients of the mutants that are significant in these experiments fall into the range of 0.09–0.11. This is just the behavior expected for a system with an effective mutation distribution described by the product of the selection-coefficient distribution and W(s, t) from Equation 2. Given the strong dependence of W(s, t) on s, and the noise in the data, it is very difficult to determine much about the actual shape of the selection-coefficient distribution for the bacteria. Alternate experimental designs that may reveal more details of this distribution are proposed in the DISCUSSION.
Estimation of the mutation rate:
Finally, assuming a uniform distribution for the selection coefficients for new mutations, Equation 10a finds that µ ≈ 10⁻⁵ if the characteristic T1 is approximately 150 generations. The numerical value obtained for µ depends on the assumed form of the selection-coefficient distribution. The inferred mutation rate would be much lower if a larger proportion of the weight of the distribution were at higher selection coefficients and would be much higher if the distribution were preferentially weighted toward 0. For comparison, if the distribution were a delta function, the estimate from Equation 10b for the mutation rate would be approximately 8.5 × 10⁻⁷. Since all shapes of the distribution lead to a very narrow distribution of effective selection coefficients, the value of the mutation rate for the delta-function distribution is an estimate of an effective mutation rate for the system; the other mutations that may occur will have limited discernible effect on the culture
[Figure caption fragment: (c) Dependence of the probability distribution ...]