Adapt Globally, Act Locally: The Effect of Selective Sweeps on Bacterial Sequence Diversity
Jacek Majewski¹ and Frederick M. Cohan
Department of Biology, Wesleyan University, Middletown, Connecticut 06459-0170
Corresponding author: Frederick M. Cohan, Department of Biology, Wesleyan University, Middletown, CT 06459-0170. E-mail: fcohan{at}wesleyan.edu
Communicating editor: R. R. HUDSON
| ABSTRACT |
|---|
Previous studies have shown that genetic exchange in bacteria is too rare to prevent neutral sequence divergence between ecological populations. That is, despite genetic exchange, each population should diverge into its own DNA sequence-similarity cluster. In those studies, each selective sweep was limited to acting within a single ecological population. Here we postulate the existence of globally adaptive mutations, which may confer a selective advantage to all ecological populations constituting a metapopulation. Such adaptations cause global selective sweeps, which purge the divergence both within and between populations. We found that the effect of recurrent global selective sweeps on neutral sequence divergence is highly dependent on the mechanism of genetic exchange. Global selective sweeps can prevent populations from reaching high levels of neutral sequence divergence, but they cannot cause two populations to become identical in neutral sequence characters. The model supports the earlier conclusion that each ecological population of bacteria should form its own distinct DNA sequence-similarity cluster.
It is becoming increasingly clear that a full accounting of ecological diversity in the bacterial world requires a molecular approach. Molecular techniques have demonstrated that only a small fraction of bacterial species are culturable.
This interpretation is justified because in studies of more familiar and culturable taxa, bacterial systematists have found an empirical correspondence between ecologically distinct populations and sequence-similarity clusters. That is, groups of bacteria known to be ecologically different generally fall into separate sequence-similarity clusters.
While ecologically distinct groups of bacteria are frequently distinguishable as separate sequence-similarity clusters, it is important to find a strong theoretical basis for this observation. If there are times when multiple ecological populations of bacteria fall together into the same sequence cluster, molecular approaches may severely underestimate bacterial biodiversity.
Recent theory has shown why ecological populations should correspond to sequence clusters.
The tendency for bacterial populations to form separate sequence clusters is opposed by recombination between populations.
Nevertheless, it is not clear that the existing model adequately predicts the degree of sequence divergence between ecological populations. Here we present an alternative and more general model for periodic selection, in which some mutations may be adaptive outside of the context of their original populations. In this model, the domain of competitive superiority of an adaptive mutant (i.e., the cell) is still its own ecological population, but the adaptive mutation (i.e., the allele) can be recombined into other populations, where it can confer higher fitness and cause a local selective sweep within each recipient population (Figure 1). This process may homogenize the populations for any segment that is cotransferred between populations along with the adaptive mutation. We have hypothesized that globally adaptive mutations could homogenize populations for neutral sequence diversity at all gene loci, provided that the size of fragments recombined is large enough and that universally adaptive mutations recur throughout the genome.
In this article, we present a coalescence model to explore the conditions under which universally adaptive mutations can homogenize neutral sequence diversity across ecological populations. We tested whether different ecological populations might fail to diverge into separate sequence-similarity clusters under the rates of recombination observed in bacteria. We also tested whether universally adaptive mutations may prevent populations with low recombination rates from diverging without bound.
| THE MODEL |
|---|
Ecological populations and adaptive mutations:
A metapopulation consists of n closely related ecological populations, each containing N cells (Table 1). Each population is adapted to a different ecological niche. Recombination occurs rarely within and between these populations, and the metapopulation is closed to recombination with other such metapopulations.
Following ![]()
We assume that selective sweeps are rare events and that the duration of the sweep is short relative to the time between sweeps.
Rate of fixation of adaptive mutations:
Following ![]()
Locally adaptive mutations are fixed at a rate 2zµlN. It is assumed that once a globally adaptive mutation becomes fixed in its original population (with probability 2z), recurrent recombination and subsequent selection will cause the mutation to eventually become fixed in all populations. Therefore, globally adaptive mutations are fixed at a rate 2zµgnN.
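The two fixation rates above can be evaluated directly. The following is a minimal sketch rather than the authors' code: the function names are ours, and the adaptive mutation rates `mu_l` and `mu_g` are hypothetical placeholder values (the text does not report them). The values of z, N, and n are those listed under RESULTS.

```python
import math

def local_fixation_rate(z, mu_l, N):
    """Rate 2*z*mu_l*N at which locally adaptive mutations are fixed."""
    return 2.0 * z * mu_l * N

def global_fixation_rate(z, mu_g, n, N):
    """Rate 2*z*mu_g*n*N at which globally adaptive mutations are fixed."""
    return 2.0 * z * mu_g * n * N

z = 1e-2      # selective advantage (from RESULTS)
N = 5e14      # cells per population (from RESULTS)
n = 2         # populations in the metapopulation (from RESULTS)
mu_l = 1e-16  # HYPOTHETICAL locally adaptive mutation rate
mu_g = 1e-16  # HYPOTHETICAL globally adaptive mutation rate

rate_local = local_fixation_rate(z, mu_l, N)
rate_global = global_fixation_rate(z, mu_g, n, N)
print(rate_local, rate_global)
```

With equal mutation rates, the global rate is n times the local rate, because a globally adaptive mutation can arise in any of the n populations.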
Recombination within and between populations:
Recombination in bacteria is unidirectional and the segment recombined is usually a small fraction of the genome.
Recombination follows a modified island model, where cs is the rate (per gene segment per genome per generation) at which individuals integrate (as recipients) DNA at a gene segment of interest from other individuals of the same ecological population; cd is the rate at which individuals integrate DNA from any other ecological population; and c is the total rate at which an individual integrates DNA from any other individual in the metapopulation; thus c = cs + cd. We also use the rate at which individuals integrate DNA from one particular ecological population (other than their own). In a metapopulation consisting of n ecological populations, this per-population rate is cd/(n − 1).
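As a small illustration of how these rates relate, the sketch below computes the total rate and the rate per donor population from cs, cd, and n. It assumes the symmetric island structure described above (every other population is an equally likely donor); the function name and the numerical inputs are ours and purely illustrative.

```python
import math

def recombination_rates(c_s, c_d, n):
    """Return (total rate c, rate from one particular other population).

    Assumes a symmetric island model: the between-population rate c_d
    is split evenly among the n - 1 other populations.
    """
    if n < 2:
        raise ValueError("a metapopulation needs at least two populations")
    total = c_s + c_d
    per_population = c_d / (n - 1)
    return total, per_population

# Illustrative values only; with n = 2 the per-population rate equals c_d.
total_rate, per_pop_rate = recombination_rates(1e-8, 1e-8, 2)
print(total_rate, per_pop_rate)
```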
Probability that a selective sweep leads to coalescence:
Our model determines the expected time (going backward from the present) to coalescence into a common ancestor for two homologous gene segments occurring today in two different individuals. These individuals may be cells from the same or different ecological populations of the metapopulation.
We define p as the probability that a selective sweep leads to coalescence at a gene segment of interest. This is the probability that two cells chosen from the metapopulation immediately following a selective sweep are identical by descent for the gene segment of interest. Whether a selective sweep results in coalescence at a gene of interest depends on the relative magnitudes of the selective advantage of the adaptive mutation, the rate at which recombination separates the gene of interest from the selected gene, and the population size. If the rate of recombination is high and the selective advantage low, the event is unlikely to lead to coalescence.
We consider several instances of the variable p, corresponding to the probabilities of coalescence within and between populations, for globally and locally adaptive mutations: pl is defined as the probability that a local selective sweep within a population leads to coalescence of segments from that population; pgs is the probability that a global selective sweep leads to coalescence of segments from the same population; and pgd is the probability that a global selective sweep leads to coalescence of segments from different populations of the metapopulation.
In Appendix A, we derive a method (adapted from ![]()
![]() |
(1) |
Because the coalescence of homologous segments from different populations requires that the transfer of the adaptive mutation from population 1 to population 2 includes the segment of interest (Figure 1), pgd will be highly dependent on the probability of cotransfer.
We assume that the size (h) of the recombining DNA fragment is constant, while adaptive mutations occur randomly throughout the genome. Because we are interested in modeling the consequences of many selective sweeps, we need to calculate the mean probability (P) that a selective sweep leads to coalescence, averaged over all possible distances (y) between the neutral marker and the adaptive mutations (i.e., between 0 and 1/2 because the bacterial chromosome is circular):
![]() |
(2) |
Both the probabilities p and the above integral were evaluated numerically.
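The averaging over distances can be sketched numerically. Because the original Equations 1 and 2 are not legible here, the code below rests on an assumption of ours: a fragment of constant size h that carries the adaptive mutation is placed uniformly at random, so a marker at distance y is cotransferred with probability 1 − y/h when y < h and 0 otherwise. Under that assumption, the average over y between 0 and 1/2 equals h, which agrees with the limiting value of Pgd reported later in the text. All function names are hypothetical.

```python
def cotransfer_probability(y, h):
    """ASSUMED form of the cotransfer probability q for a marker at
    distance y (fraction of the genome) from the adaptive mutation,
    given a recombining fragment of constant size h."""
    if not 0.0 < h <= 0.5:
        raise ValueError("h must be a genome fraction in (0, 0.5]")
    return max(0.0, 1.0 - y / h)

def mean_over_distance(p, steps=100_000):
    """Midpoint-rule average of p(y) for y uniform on [0, 1/2]."""
    width = 0.5 / steps
    total = sum(p((k + 0.5) * width) for k in range(steps))
    return total / steps

h = 0.05  # fragment size: 5% of the genome
mean_q = mean_over_distance(lambda y: cotransfer_probability(y, h))
print(mean_q)
```

The same `mean_over_distance` helper could average any distance-dependent coalescence probability p(y), not only the cotransfer probability.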
The coalescence model:
Our coalescence model calculates the expected time that two homologous gene segments (occurring in different organisms) have diverged since their last common ancestor. These gene segments are postulated to be short enough so that they are not split by recombination. The following are the expected times to coalescence for two strains from the same and different ecological populations, E(ts) and E(td) (derived in Appendix B):
![]() |
(3) |
![]() |
(4) |
These equations were solved using Mathematica.
Calculation of the expected nucleotide divergence:
The expected nucleotide sequence divergence is predicted using the probability density functions for ts and td following ![]()
) is then obtained by multiplying the time to coalescence by twice the per third base site rate of mutation (µ0),
![]() |
(5) |
where t = ts or td are the times until coalescence of segments in the same population or different populations, respectively.
The nucleotide sequence divergence over all sites,
, may be calculated by correcting for multiple substitutions per site (![]()
![]() |
(6) |
We used the probability density functions Pts(t) and Ptd(t) to calculate the expected divergence
![]() |
(7) |
The integration was carried out numerically. The expected nucleotide divergence between segments from the same population, E(
s), and different populations, E(
d), was calculated using Pts(t) and Ptd(t), respectively. The neutral nucleotide divergence after infinite time reaches a level of 1/4, which we refer to as "unbounded divergence."
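The chain from coalescence time to expected divergence can be sketched as follows. Several pieces are assumptions of ours, because Equations 5 through 7 are not legible here: we use a Jukes-Cantor style correction scaled so that divergence over all sites saturates at 1/4 (the limit stated in the text), and we stand in an exponential density for the coalescence time, which is not necessarily the density used by the authors. Only µ0 comes from the text.

```python
import math

MU0 = 3e-10  # neutral mutation rate per third base site (from RESULTS)

def third_site_divergence(t, mu0=MU0):
    """Divergence at third base sites: twice the mutation rate times t."""
    return 2.0 * mu0 * t

def corrected_divergence(t, mu0=MU0):
    """ASSUMED multiple-hit correction, saturating at 1/4 over all sites."""
    return 0.25 * (1.0 - math.exp(-4.0 * third_site_divergence(t, mu0) / 3.0))

def expected_divergence(mean_t, mu0=MU0, steps=200_000, span=40.0):
    """Numerically integrate corrected_divergence against an ASSUMED
    exponential density of coalescence times with mean mean_t."""
    dt = span * mean_t / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        density = math.exp(-t / mean_t) / mean_t
        total += corrected_divergence(t, mu0) * density * dt
    return total

mean_time = 1e9  # illustrative mean coalescence time, in generations
numeric = expected_divergence(mean_time)
a = 8.0 * MU0 / 3.0
closed_form = 0.25 * a * mean_time / (1.0 + a * mean_time)
print(numeric, closed_form)
```

For an exponential density the integral has the closed form shown in `closed_form`, which gives a quick check on the numerical integration.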
| RESULTS |
|---|
The following parameter values were used in all numerical calculations: the neutral mutation rate per third base site, µ0 = 3 × 10⁻¹⁰; the selective advantage, z = 10⁻²; the population size, N = 5 × 10¹⁴; and the number of populations in the metapopulation, n = 2. Recombination rates within and between populations were set as equal (cs = cd for n = 2; i.e., no sexual isolation between populations) to maximize the homogenizing effect of recombination.
The diversity-purging effect of an adaptive mutation:
The probability that a particular global selective sweep causes coalescence, within or between populations, is shown in Figure 2. The probability of coalescence within a population, pgs, is always near 1 because recombination is so rare in bacteria.
The ratio of globally to locally adaptive mutation rates:
We next explore the effect of recurrent adaptive mutations on population structure. We focus on the significance of the ratio of globally to locally adaptive mutations. We maintain the total frequency of adaptive mutations constant, while allowing the ratio of global:local adaptations to vary. We consider three relative frequencies of global:local events, 1:0, 1:1, and 0:1 (Figure 3).
Figure 3 shows that globally adaptive mutations reduce neutral sequence divergence between populations compared to the case with only local adaptations. This effect is most pronounced at low recombination rates. When only local selective sweeps are possible, the model shows that a recombination rate of 10⁻¹⁰ leads to unbounded neutral divergence between populations (i.e., the between-population divergence approaches 1/4). Increasing the global:local ratio decreases the divergence between populations by up to 50-fold. The divergence within populations also decreases, but to a much lesser extent. Thus, increasing the proportion of globally adaptive mutations makes the populations less distinct in neutral characters.
Consider next whether globally adaptive mutations can prevent different ecological populations from diverging into separate sequence clusters. We define populations as falling into separate sequence-similarity clusters when the expected divergence between populations is more than twice the expected divergence within populations.
Analysis of a simplified model with no locally adaptive mutations:
We concentrated on the effect of globally adaptive mutations by considering the special case of a two-component metapopulation in which all the adaptive mutations are global (Figure 5). For this special case we treated the coalescence equations analytically to gain further insight into the behavior of the sequence divergence functions presented in Figure 3. Noting that bacterial populations are always large enough so that the probability of coalescence by drift is negligible relative to coalescence by periodic selection (i.e., 1/N <<
P), Equation 3 and Equation 4 reduce to
![]() |
(8) |
![]() |
(9) |
We may consider
gPgs and
gPgd as pseudoparameters, representing the diversity-purging effect of periodic selection (i.e., the rate of selective sweeps times the probability of coalescence within each sweep). The times to coalescence are then determined by only three factors: c
, the rate of recombination between populations;
gPgs, the within-population diversity-purging effect of global periodic selection; and
gPgd, the between-population diversity-purging effect of global periodic selection.
Consider the relative magnitudes of
gPgs and
gPgd. We used Equation A4 of Appendix A to calculate the values of Pgs and Pgd across the range of frequency of recombination (c) and selective advantage (z) considered in this article, and we found that Pgs
Pgd/h. We assume that the size of the recombination fragment h is usually <10% of the genome (see DISCUSSION). Therefore
gPgs >>
gPgd. This leaves four regions of magnitude for c
: c
>>
gPgs, c
~
gPgs, c
~
gPgd, and c
<<
gPgd. These regions correspond to regions I through IV, respectively, of Figure 5.
Region I of Figure 5, c
>>
gPgs >>
gPgd, corresponds to very high recombination rates, yielding the following approximation of Equation 8 and Equation 9:
![]() |
(10) |
Region I of the graph corresponds to the case where high rates of recombination within populations (cs) diminish the diversity-purging effect of periodic selection (Pgs), while high values of interpopulation recombination (c
) further homogenize the populations, making them indistinguishable (such that expected divergence levels within and between populations are equal).
The conditions in region II, c
~
gPgs >>
gPgd, yield
![]() |
(11) |
![]() |
(12) |
In region II of Figure 5, recombination is no longer sufficient to prevent populations from diverging. The divergence between populations is greater than that within populations and is determined by the equilibrium between recombination (which acts to homogenize the populations) and local diversity-purging events (which tend to keep the populations distinct).
The conditions of region III,
gPgs >>
gPgd ~ c
, yield
![]() |
(13) |
![]() |
(14) |
Region III reflects the increasing significance of global periodic selection. Divergence between populations is determined by the combined homogenizing effects of recombination (c
) and global periodic selection (
gPgd).
Region IV corresponds to the case of extremely rare recombination,
gPgs >>
gPgd >> c
, yielding
![]() |
(15) |
![]() |
(16) |
This is the limiting case, where recombination between populations becomes so infrequent that its effects are entirely overwhelmed by periodic selection. In this limit, the divergence within populations is determined solely by the intensity of local purging of diversity, while the divergence between populations is only limited by the intensity of global purging of diversity.
Under the conditions of rare recombination (region IV), the ratio of the times to coalescence (i.e., E[td]:E[ts]) approaches 1/h. This follows from two consequences of rare recombination. First, because the locus of interest and the adaptive mutation are rarely separated by recombination, a selective sweep almost certainly leads to coalescence of gene segments from the same population (i.e., Pgs ≈ 1). Second, the transmission of the adaptive mutation from population 1 to population 2 is likely to be the result of a single transfer event. Hence, the probability of coalescence of two gene segments from different populations (Pgd) approaches the probability that the transfer event was a cotransfer of the adaptive mutation and the segment of interest (averaged over all distances between the two loci). That is, Pgd ≈ h, and E[td]/E[ts] ≈ 1/h.
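The 1/h limit can be illustrated with a few lines of arithmetic. This is a sketch under an assumption of ours: when recombination is rare enough to ignore, the expected waiting time to coalescence is taken as the reciprocal of the rate of global sweeps multiplied by the probability that a sweep causes coalescence. The function name and the sweep rate value are hypothetical.

```python
import math

def expected_times_rare_recombination(sweep_rate, h, p_gs=1.0):
    """Region IV sketch: expected coalescence times within and between
    populations when only global sweeps cause coalescence.

    ASSUMES waiting time = 1 / (sweep_rate * coalescence probability),
    with Pgs close to 1 and Pgd close to h * Pgs as stated in the text.
    """
    p_gd = h * p_gs
    e_ts = 1.0 / (sweep_rate * p_gs)
    e_td = 1.0 / (sweep_rate * p_gd)
    return e_ts, e_td

sweep_rate = 1e-7  # HYPOTHETICAL rate of global selective sweeps
for h in (0.01, 0.05, 0.10):
    e_ts, e_td = expected_times_rare_recombination(sweep_rate, h)
    print(h, e_td / e_ts)  # ratio is close to 1/h
```

The ratio does not depend on the sweep rate, which is why the fragment size alone sets the limit in this regime.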
Effect of recombination fragment size on population divergence:
We consider next the effect of the recombination fragment size (h) on the distinctness of ecological populations. In general, larger recombination fragments increase the probability (q) that a gene of interest will cotransfer across populations with an adaptive mutation (Equation 1), thus fostering coalescence of segments between populations (Figure 2). Hence, larger sizes of recombination fragments tend to make ecological populations appear less distinct in neutral characters (A, Figure 5). The effect of h on the distinctness of populations is most important at low between-population recombination rates (Figure 5).
The effect of h on population distinctness (quantified as the ratio of the expected between-population divergence to the expected within-population divergence) is shown explicitly in Figure 6. Under very low rates of between-population recombination, the distinctness ratio approaches 1/h for large fragment sizes (i.e., h > 10%; Figure 6). Thus, global periodic selection alone (i.e., with little recombination between populations) cannot reduce the distinctness ratio of populations to 1 (so that the expected divergences between and within populations are approximately equal) unless the recombination fragment size reaches 100% of the genome.
We explored in more detail the effect of h on population distinctness for the case when within-population divergence levels are 1% (i.e., E[
s] = 0.01), because this is the divergence level frequently observed within bacterial sequence-similarity clusters (![]()
d = 0.25) to a much more limited level of divergence (
d = 0.09; Figure 7).
| DISCUSSION |
|---|
This study presents a coalescence model for investigating the effect of globally adaptive mutations on neutral sequence divergence in bacteria. We used this model to test whether interpopulation transfer of globally adaptive mutations might prevent neutral sequence divergence between ecologically distinct populations of bacteria.
Assumptions of the model:
If globally adaptive mutations are to reduce divergence between ecological populations at every locus in the genome, we must assume that every gene locus has the opportunity to hitchhike from population to population along with globally adaptive mutations (Figure 1). We therefore assume that globally adaptive mutations that confer benefits in more than one population exist, that they are numerous, and that they appear throughout the genome. The latter two assumptions are required because only a limited fraction of the genome can be cotransferred (and subsequently homogenized) across populations with any given adaptive mutation: the segments transferred in bacterial recombination are generally small.
Consider next the central premise of the model, that globally adaptive mutations exist and are numerous. The likelihood of globally adaptive mutations must depend on the degree of ecological divergence between populations. In the early stages of population divergence, a mutation that is adaptive in one population is likely to be adaptive in others. As the populations become progressively more finely tuned to their respective niches, accumulating many niche-specific adaptations, we should see fewer adaptive mutations that can benefit more than one population. We therefore expect globally adaptive mutations to prevent neutral sequence divergence genome-wide only between the most closely related populations.
Does a typical adaptive mutation confer a benefit in more than one population? Recently, ![]()
1), there is nearly total purging of diversity both within and between populations; for genes that are not linked to the adaptive mutation (q ≈ 0), there is purging of diversity within populations but none between populations. Provided that each of the E. coli sequence clusters is actually a separate ecological population (![]()
![]()
![]()
![]()
![]()
Globally adaptive mutations as a homogenizing force in neutral sequence evolution:
Analysis of our model has shown that, in general, globally adaptive mutations tend to make populations less distinct. Especially under extremely low recombination rates, globally adaptive mutations severely depress neutral sequence divergence between populations while having only a minor effect on within-population diversity (Figure 3). Populations that would diverge without bound in the absence of global periodic selection may be prevented from diverging without bound in the presence of global periodic selection.
Nevertheless, global periodic selection does not homogenize neutral sequence divergence to the extent that populations become indistinguishable. Consider, for example, ecological populations whose average within-population sequence divergence is ~1%, a value typical for sequence-similarity clusters in bacteria.
Analysis of the model has shown that the effect of global adaptations on between-population divergence is highly dependent on the size of the fragment recombined (Figure 7). If the recombination fragment is small (<1% of the genome), global periodic selection is virtually ineffective in reducing between-population divergence; however, if the recombining fragment is large (e.g., 5% of the genome), global periodic selection may significantly reduce the between-population divergence (Figure 7).
The effect of global periodic selection on sequence divergence may therefore depend on the mode of genetic transfer between populations, because the various modes of transfer differ greatly in the length of DNA recombined. In naturally competent taxa, such as Streptococcus and Bacillus, transformation may be the predominant mode of DNA exchange. The average fragment of DNA incorporated in both Streptococcus and Bacillus transformation is <1% of the genome.
Other modes of recombination, such as transduction and conjugation, can transfer much larger segments of DNA. A generalized transducing phage can, in principle, transfer segments as large as the phage's own genome, which could be ~10% of the bacterium's genome.
In summary, global periodic selection can limit the sequence divergence between ecological populations. The effect of global periodic selection is most pronounced for groups of populations with low between-population recombination, such that global periodic selection is the only constraint on divergence between populations. Global periodic selection is unlikely to prevent the divergence of ecological populations into separate sequence clusters. A quantitative prediction of the homogenizing effect of global periodic selection would require more information about the rate of mutations that confer adaptations in multiple populations, information about how evenly globally adaptive mutations are distributed throughout the genome, and information about the size of fragments that can be transferred between populations and then successfully accommodated by the receiving population.
| FOOTNOTES |
|---|
1 Present address: Laboratory of Statistical Genetics, Box 192, Rockefeller University, 1230 York Ave., New York, NY 10021.
| ACKNOWLEDGMENTS |
|---|
We thank Michael Feldgarden for suggesting that we explore a model of global periodic selection and Richard Hudson for suggesting important improvements to the model. This work was supported by Environmental Protection Agency grants R82-1388-010 and R82-5348-010 and by research funds from Wesleyan University.
Manuscript received September 20, 1996; Accepted for publication April 23, 1999.
| APPENDIX A |
|---|
Probability That a Periodic Selection Event Leads to Coalescence
We consider the special case of a metapopulation consisting of two ecological populations. The adaptive mutation driving the periodic selection event begins in population 1 and is subsequently passed into population 2 by recombination. We use a two-locus, four-allele model. A is the locus under selection, while B is the segment of interest whose neutral sequence divergence we are investigating. Alleles in population 1 are designated by subscript 1; those in population 2 are designated by subscript 2. Within population 1, the advantageous allele is designated as A1, and all other alleles at the selected locus are designated a1. A2 is the advantageous allele in population 2; a2 designates all the other alleles at the selected locus in this population. An allele at locus B can be attached to any of the four A alleles, i.e., A1, a1, A2, a2. The frequencies of the alleles A1 and A2 in their respective populations are x1 and x2.
Let gX(Y,t) be the conditional probability that if a randomly selected gene B from generation t of the metapopulation is attached to the allelic type Y (at locus A), its ancestor in generation (t - 1) was attached to allelic type X. [gX(Y,t) is equivalent to the quantity fX(Y,t)/f(Y,t) of ![]()
![]()

where R11 = R22 = 2Ncs(1 - q), the per-population rate at which the two loci are separated by recombination within a population; R12 = R21 = 2Nc
(1 - q), the per-population rate at which the two loci are separated by recombination between populations; and R1 = R2 = Nc
q, the per-population rate at which a DNA segment containing both loci is transferred between populations. The remaining 8 g values may be obtained by substituting 2 for 1 and 1 for 2 in all the indices of the above equations, e.g.,

We now define the Q process. Suppose that m B genes are selected at random at the end of the selective sweep (time t = 0). Let Q(0) = (i, j, k, l), where i, j, k, l represent the number of B genes attached to A1, a1, A2, a2, respectively. Going back in time, Q(t) describes the number of ancestral B genes attached to each A allele at time t (i.e., t generations before time 0). The total number of ancestral B genes in generation t is denoted by |Q(t)|. Note that |Q(t)| never increases, because the number of ancestral alleles can only stay constant or decrease (if two or more of the sampled alleles had a common ancestor in the previous generation). We are interested in the cases where Q(t) changes states, i.e., Q(t - 1) ≠ Q(t). There are two possible cases.
Case 1. |Q(t - 1)| = |Q(t)|:
The only possible state changes allowed by this condition result from recombination between parental genes. Given Q(t) = (i, j, k, l), there are 12 possible states of Q(t - 1):

Note that all other jumps would require more than a single recombination event and their probabilities are therefore of the order 1/N² and are negligible. We are interested in the probabilities that the process jumps from (i, j, k, l) to any of the above states, e.g.,

The probability that a selected gene B from generation t is attached to a1 while its ancestor was attached to A1 is given by gA1(a1,t). Because we are sampling j a1 alleles,

Equations of the same form can be obtained for the remaining 11 jumps.
Case 2. |Q(t - 1)| ≠ |Q(t)|:
We have already noted that the number of ancestral alleles can only decrease going backward in time. This case implies that some of the genes sampled at time t must have a common ancestor at time (t - 1). ![]()

where xa(t - 1) is the frequency of the allelic type a in the parental generation. Thus, the probability that two B alleles attached to A1 at t have a common ancestor at (t - 1) is

Because i B genes attached to A1 are being sampled,

where, if i < 2, the binomial coefficient (i choose 2) is interpreted as 0. Similarly,

and because the chance of more than one coalescence event per generation is of the order of 1/N², jumps of >1 state are ignored. We have now defined the probabilities of every possible state change of Q(t). The probability that the Q process does not change state can thus be written as

where hijkl is the total probability of the Q process changing states:

Calculation of pgd and pgs:
At the end of the selective sweep we sample two B alleles from the metapopulation. We want to know the probability that the two alleles had a common ancestor during the selective sweep. We consider two cases:
- Case 1: the two genes are sampled from different ecological populations. The probability of coalescence during the sweep is pgd.
- Case 2: the two genes are sampled from the same ecological population. The probability of coalescence during the sweep is pgs.
We begin by considering case 1, calculation of pgd. We follow the Q process back in time, going from t =
f (end of the selective sweep) to t =
b (beginning of selective sweep; Figure 8). We can write (1 - pgd) as the probability of escaping coalescence, i.e., leaving the selective sweep (t =
f) with one B gene in population 1 (attached to A1) and one B gene in population 2 (attached to A2) and entering it (t =
b) with two ancestral B genes:
![]() |
(A1) |
We also need to define Pijkl(t), the probability of finding the Q process in the state (i, j, k, l) at time (t), given that at the end of the selective sweep Q(
f) = (1,0,1,0),

Hence, we can rewrite Equation A1 as
![]() |
(A2) |
To calculate all the relevant Pijkl(
b) values, we use the differential equations governing their behavior:

![]() |
(A3) |
We also need the equations describing changes in the frequencies of the adaptive alleles A1 and A2, x1(t) and x2(t):

To be able to treat the model deterministically, we follow ![]()

where
(
) corresponds to the time in the early phase of the selective sweep, where x1 =
, and
2(1 -
) corresponds to a time near the end of the selective sweep, where x2 = 1 -
. We take
= 5/Nz.
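To give a sense of the time scale of a sweep, the sketch below uses the cutoff 5/Nz from the text together with an assumption of ours: that the adaptive allele in its original population follows the standard deterministic logistic trajectory dx/dt = zx(1 − x). The frequency equations in the source are not legible here and, for population 2, presumably also include transfer terms, so this should be read as an approximation for a single population only. Function names are hypothetical.

```python
import math

def cutoff_frequency(N, z):
    """Frequency cutoff 5/(N*z) bounding the deterministic phase."""
    return 5.0 / (N * z)

def logistic_frequency(t, z, x0):
    """ASSUMED logistic trajectory of the adaptive allele from frequency x0."""
    odds = (x0 / (1.0 - x0)) * math.exp(z * t)
    return odds / (1.0 + odds)

def sweep_duration(z, eps):
    """Generations needed to rise from eps to 1 - eps on the logistic curve."""
    return (2.0 / z) * math.log((1.0 - eps) / eps)

N, z = 5e14, 1e-2  # values from RESULTS
eps = cutoff_frequency(N, z)
duration = sweep_duration(z, eps)
final_frequency = logistic_frequency(duration, z, eps)
print(eps, duration, final_frequency)
```

With these parameter values the deterministic phase lasts a few thousand generations, which is short relative to the waiting time between sweeps assumed by the model.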
It remains for us to establish the boundary conditions

while all other Pijkl(
2(1 -
)) are zero. Also,

Because the adaptive mutation enters population 2 later than it enters population 1, we must determine the value of x1 at the time when x2 = 1 -
. For that purpose we need to first run the frequency equations forward in time (starting at x1 =
, x2 = 0). We then allow for the transfer of a single adaptive mutation to population 2. This takes place at time E(
c), which is the expected time of the transfer of the first adaptive allele that will become fixed in population 2. (This is in fact equal to the time at which 1/(2z) alleles have crossed over.) We run the equations forward in time until
2(1 -
) to establish the end boundary conditions for x1; then we can run both x1 and x2 backward, along with the equations for Pijkl, following Equation A3 (see Figure 8).
The transfer of the first adaptive allele from population 1 to population 2 is a stochastic event and hence introduces a discontinuity in the solutions to differential equations for Pijkl. At the time immediately preceding the initial transfer of the adaptive allele into population 2 (time
c+), no A2 alleles existed. Therefore, at time
c+ the probabilities of a B allele being attached to an A2 allele [Pijkl(
c+) for k
0] must be zero. However, these P terms might have nonzero values at
2(
). Following ![]()
2[
]) to pgd. However, a nonzero value of Pijkl(
2[
]) implies that immediately after the transfer event, one of the sampled B genes is attached to an A2 allele. In this case, there exist two possible states of the Q process immediately preceding the transfer: Q(
c+) = (i + 1, j, 0, l) if the transfer was a corecombination of A and B (with a probability q); or Q(
c+) = (i, j, 0, l + 1) if only the A allele was transferred (with a probability 1 - q). Hence, the additional boundary conditions at
c needed to account for the transfer are

Using the above boundary conditions, we can obtain the values Pijkl(
1[
]) at the beginning of the selective sweep. Then, using the approximation Pijkl(
1[
]) = Pijkl(
b), Equation A2 becomes
![]() |
(A4) |
The above analysis also applies to case 2, the calculation of pgs. We only need to alter the initial conditions, i.e., allowing Q(
f) = (2,0,0,0) (if we are sampling two alleles from the original population where the adaptive mutation first occurred) or Q(
f) = (0,0,2,0) (if we are sampling two alleles from the population to which the adaptive mutation has been transferred). Because the probabilities of choosing either population are equal, we calculate pgs as the average of the two values.
| APPENDIX B |
|---|
The Expected Time to Coalescence
Coalescence within populations
Following ![]()
The variable ts is thus defined below, where the first term ks represents the time to reach the most recent key event, and the other three terms represent the additional time necessary to go beyond the key event to reach a coalescence (Table 2):

Local selective sweep as the key event:
The random variable 
l indicates whether the key event was a local selective sweep (
l = 1 if the key event was a local selective sweep; 
l = 0 else). Two lineages that are in the same population at the end of the selective sweep may begin the sweep in one of three states: a single lineage (i.e., the lineages coalesce); two lineages in the same population; or they may begin as two lineages in different populations. The random variable
pl indicates whether the lineages escape coalescence and begin in the same population (1 if yes, 0 else, as above), and t's is the additional time to coalescence in this case. The random variable 
l indicates whether the lineages begin the selective sweep in different populations (1 if yes, 0 else), and t'd is the additional time to coalescence in this case. Accordingly, the sum (
plt's + 
lt'd) represents the additional time to coalescence when the key event is a local sweep.
Global periodic selection as the key event:
The random variable 
g indicates whether the key event was a global selective sweep (1 if yes, 0 else). The random variable
pgs indicates whether the lineages escape coalescence and begin



























