Genetics, Vol. 157, 317-330, January 2001, Copyright © 2001

Nucleotide Polymorphism and Natural Selection at the Pantophysin (Pan I) Locus in the Atlantic Cod, Gadus morhua (L.)

Grant H. Pogson
Department of Ecology and Evolutionary Biology and Institute of Marine Sciences, University of California, Santa Cruz, California 95064

Corresponding author: Grant H. Pogson, Department of Ecology and Evolutionary Biology, Earth and Marine Sciences Bldg., University of California, Santa Cruz, CA 95064. E-mail: pogson@darwin.ucsc.edu

Communicating editor: W. F. EANES


*  ABSTRACT

Molecular studies of nucleotide sequence variation have rarely attempted to test hypotheses related to geographically varying patterns of natural selection. The present study tested the role of spatially varying selection in producing significant linkage disequilibrium and large differences in the frequencies of two common alleles at the pantophysin (Pan I) locus among five populations of the Atlantic cod, Gadus morhua. Nucleotide sequences of 124 Pan I alleles showed strong evidence for an unusual mix of balancing and directional selection but no evidence of stable geographically varying selection. The alleles were highly divergent at both the nucleotide level (differing on average by 19 mutations) and at amino acid level (each having experienced three amino acid substitutions since diverging from a common ancestral allele). All six amino acid substitutions occurred in a 56-residue intravesicular loop (IV1 domain) of the vesicle protein and each involved a radical change. An analysis of molecular variation revealed significant heterogeneity in the frequencies of recently derived mutations segregating within both allelic classes, suggesting that two selective sweeps may be presently occurring among populations. The dynamic nature of the Pan I polymorphism in G. morhua and clear departure from equilibrium conditions invalidate a simple model of spatially varying selection.


STUDIES examining nucleotide sequence variation in natural populations have provided important insights into the role of natural selection in shaping the patterns of polymorphism within species and the patterns of divergence between species (HUDSON 1990; KREITMAN 1991; KREITMAN and AKASHI 1995). When combined with genealogical information, data on the existing levels and distribution of nucleotide sequence variation among populations can provide unparalleled information on the past and present selective forces that may be acting at a locus. Evidence accumulating from Drosophila has suggested that natural selection has played an important role in affecting the patterns of nucleotide variation at a substantial fraction of loci (MORIYAMA and POWELL 1996; HEY 1999). However, the inability to reject the null hypothesis of no selection (i.e., neutrality) is not uncommon (see SCHAEFFER and MILLER 1992; KLIMAN and HEY 1993) and other factors, most notably the extent of recombination, exert strong effects on the standing levels of nucleotide variation (BEGUN and AQUADRO 1992; AQUADRO et al. 1994; CHARLESWORTH 1998).

Few studies examining DNA sequence variation have attempted to test hypotheses related specifically to geographically varying patterns of selection. Most species are unlikely to experience similar selection pressures across their geographic ranges, and the extent to which selection can produce local adaptation at the molecular level, particularly in opposition to ongoing gene flow, remains poorly understood. The majority of studies that have examined spatial patterns of selection at the DNA level have focused on loci exhibiting clinal variation (e.g., BERRY and KREITMAN 1993; KAROTAM et al. 1995; KATZ and HARRISON 1997; SCHULTE et al. 1997). However, only the detailed molecular dissection of the Adh cline in Drosophila melanogaster by BERRY and KREITMAN (1993) explicitly tested the role of selection in maintaining clinal variation in the frequencies of the fast/slow polymorphic site by examining patterns of silent polymorphisms segregating within and between allelic groups.

Unlike the situation for clines, localized selection favoring different alleles in different environments may produce heterogeneous patterns, and loci exhibiting unusually high levels of variation might indicate the possible action of selection (CAVALLI-SFORZA 1966). One such locus has been identified in the Atlantic cod, Gadus morhua (originally called GM798), that unlike other nuclear or mitochondrial markers exhibits significant differentiation among populations at large and small geographic scales (POGSON et al. 1995; FEVOLDEN and POGSON 1997; JONSDOTTIR et al. 1999). This locus was also unusual in not showing a relationship between inferred levels of gene flow and geographic distance at large geographic scales (POGSON et al. 2001) and in exhibiting nearly complete linkage disequilibrium among three restriction site polymorphisms in the gene region (POGSON and FEVOLDEN 1998). The gene was originally identified as the cod synaptophysin (Syp I) locus by FEVOLDEN and POGSON (1997) but is more likely to represent a recently discovered cellular isoform of synaptophysin called pantophysin (HAASS et al. 1996). Pantophysin is an integral membrane protein found in microvesicles of both neuroendocrine and nonneuroendocrine tissues that function in a variety of shuttling, secretory, and endocytotic recycling pathways (HAASS et al. 1996; WINDOFFER et al. 1999). Although the role of pantophysin in these pathways is poorly understood, its highly conserved structure of four transmembrane domains, two intravesicular loops, and two cytoplasmic tails allows all mutations identified at the molecular level to be localized to distinct domains.

The objective of the present study was to determine if geographically varying selection was acting at the pantophysin (Pan I) locus of G. morhua. To test this hypothesis, 124 Pan I alleles (1.85 kb in length) were sequenced from five populations distributed throughout the north Atlantic region. The levels of nucleotide polymorphism and spatial distribution of variable sites segregating within and among Pan I allelic classes were then compared among the populations. Three predictions of the variable selection hypothesis were tested. First, to account for the geographically varying selection, differences must exist between the two Pan I alleles at the nucleotide and/or amino acid levels. If a prolonged period of selection has favored different Pan I alleles in different regions then a genealogical signature of balancing selection may be present and statistical tests should reject neutral expectations. Second, if the selective regime has been stable over time and sufficient gene flow is occurring among populations, no differences should exist in the frequencies of neutral sites segregating within Pan I allelic classes because these would be invisible to selection (the "selective equivalence" test of BERRY and KREITMAN 1993). Third, greater peaks of diversity within the region(s) of the Pan I locus experiencing selection should be present between, rather than within, populations (CHARLESWORTH et al. 1997). Although the nucleotide sequence data provide strong support for a long-lived polymorphism at the Pan I locus, selection on recently derived mutations in both allelic classes appears responsible for the heterogeneity among populations, thus invalidating a model of spatially varying selection.


*  MATERIALS AND METHODS

Samples:
Populations of G. morhua were sampled throughout the North Atlantic region and random subsamples were taken from these larger groups for sequencing. Subsamples from the NW Atlantic were randomly chosen from two large regional groups, Nova Scotia (NS) and Newfoundland (NF), as described in POGSON et al. (2001). Subsamples taken from the Iceland (IC), Balsfjord (BA), and the Barents Sea (BS) populations are identical to those described in POGSON et al. (1995).

Southern blot analyses:
Three restriction site polymorphisms in the vicinity of the Pan I locus (BstEII, DraI, and PstI) were scored in 998 individuals on Southern blots as described in POGSON et al. (1995). A map of the Pan I gene region showing the locations of these polymorphic restriction sites is presented in Fig 1. The BstEIIB, DraIB, and PstIA "alleles" refer to the presence of sites for each enzyme and alternate alleles refer to their absence. Frequencies of the three polymorphic restriction sites and the resulting haplotypes are listed in Table 1 and form the basis of the sampling scheme outlined below.



Figure 1. Restriction map of the Pan I gene region showing the locations of polymorphic restriction sites. Exons are represented by solid boxes. The locations of the BstXI and SacII sites used to cut Pan IAIB heterozygotes prior to PCR and the positions of primers used for sequencing are shown below the coding region.


 
Table 1. Pan I restriction site and haplotype frequencies

PCR and DNA sequencing:
The cDNA clone representing the Pan I locus (GM798) was sequenced on an ABI Model 373 automated DNA sequencer. The full-length sequence was obtained from both strands using modified KS (5'-CGAGGTCGACGGTATCGATAAG-3') and SK primers (5'-TCTAGAACTAGTGGATCCCCCG-3') that flanked the EcoRI cloning site and two internal sequencing primers (B, 5'-TTGGTCCTCTATCTGGGCTTCG-3'; G, 5'-GTGCTACTATGCTTGTGGGGC-3'). Two PCR primers were then designed from the clone that amplified a 1.94-kb fragment from genomic DNA (4, 5'-CTTCCATTCATCCGAGTTCTG-3'; 7, 5'-CGTAGCAGAAGAGTGACACAT-3'). PCR reactions were performed in 20 mM Tris-HCl (pH 8.8 at 25°), 10 mM KCl, 10 mM (NH4)2SO4, 2.5 mM MgSO4, 0.1% Triton X-100, 100 ng/µl bovine serum albumin, 200 µM each dNTP, 0.25 µM forward and reverse primers, 0.4 units of Taq 2000 DNA polymerase (Stratagene, La Jolla, CA), 0.4 units Taq extender PCR additive (Stratagene), and 100 ng template DNA in 10-µl sealed glass capillary tubes in an Idaho Technology (Idaho Falls, ID) A1605 air thermal-cycler. After an initial denaturation step of 45 sec at 94°, the tubes were exposed to 35 cycles of denaturation at 94° for 1 sec, primer annealing at 52° for 1 sec, and primer extension at 72° for 1 min and 40 sec, followed by a hold at 72° for 2 min. PCR products were visualized on 1% agarose gels stained with ethidium bromide.

The 1.94-kb Pan I genomic fragment was sequenced from individuals known to be homozygous for the polymorphic DraI restriction site located in the fourth intron of the gene (hereafter called the Pan IA and Pan IB alleles corresponding to the absence or presence of this site, respectively). Consensus restriction maps were then constructed from 4–5 homozygotes for both alleles, and mutations that produced unique restriction sites were identified. The presence of these sites allowed individual alleles to be amplified for sequencing from known Pan IAIB heterozygotes by digesting genomic DNA with the appropriate restriction enzyme before PCR. To amplify the Pan IA allele, heterozygotes were digested with BstXI (cutting the Pan IB allele at nucleotide position 646) prior to PCR. To amplify the Pan IB allele, digestions were performed with SacII (cutting the Pan IA allele at position 909) prior to PCR. Thirty-five cycles of PCR using BstXI-digested DNA as template and the two flanking PCR primers (4 and 7) resulted in the amplification of the Pan IA allele whereas SacII-digested DNA allowed preferential amplification of the Pan IB allele. To test the veracity of this method Pan IA and Pan IB alleles were amplified and sequenced in duplicate from two heterozygotes at two different dates. No differences among replicate sequences were detected.

Templates for sequencing were gel purified from 0.4% agarose gels and spun through spin columns containing 0.8 ml of Sephadex G-50. Complete sequences of both DNA strands were obtained from eight sequencing reactions per template. In addition to the two flanking primers, three additional forward (11, 5'-GCTGGATTTCCCGATGTTGATA-3'; 3, 5'-CGTTGGTCCTCTATCTGGGCTTC-3'; 23, 5'-GTTTCTCTGCAAGGATCTGTTTG-3') and reverse primers were used in sequencing (33, 5'-TCACAAATAGATCCTTGCAGAG-3'; 1, 5'-CGAAGAGTGGTTGCCAATAAGG-3'; 9, 5'-GCTGCATCAACCTAAAGTAGGAG-3'). Sequences were edited with SequenceNavigator, compiled into consensus sequences using AutoAssembler (both programs from Applied Biosystems, Foster City, CA), and aligned by eye. Nucleotide sequences have been deposited in GenBank under accession nos. AF288943, AF288944, AF288945, AF288946, AF288947, AF288948, AF288949, AF288950, AF288951, AF288952, AF288953, AF288954, AF288955, AF288956, AF288957, AF288958, AF288959, AF288960, AF288961, AF288962, AF288963, AF288964, AF288965, AF288966, AF288967, AF288968, AF288969, AF288970, AF288971, AF288972, AF288973, AF288974, AF288975, AF288976, AF288977.

Statistical analyses:
Samples of Pan IA and Pan IB alleles were obtained from five populations of G. morhua by randomly selecting 12 or 13 Pan IAIB heterozygotes previously identified from Southerns. For the Pan IB alleles this involved sampling only one haplotype (numbered 1 in Table 1). Because Pan IA alleles were distributed among three haplotypes (numbered 2, 3, and 5 in Table 1) samples of this allele were randomly selected from each population to ensure accurate representation of these haplotypes. Although this sampling protocol allows for statistical tests among Pan I allelic classes it is inappropriate for tests of neutrality that assume a random sampling of alleles. It also may not provide accurate estimates of nucleotide polymorphism in different populations because allele frequencies are extremely variable. To allow for comparisons of nucleotide variability among populations and to perform TAJIMA's (1989) and FU and LI's (1993) tests of neutrality, I followed the approach of HUDSON et al. (1994) and assembled 50 constructed random samples (CRSs) from each population. These were made by randomly subsampling Pan IA and Pan IB alleles from each population in proportion to their known frequencies. The CRS sizes were identical to the total number of Pan I alleles sequenced per population (24 or 26). The CRSs generated from each population were pooled to create 50 "global" constructed random samples that would have been representative of sampling across the entire species range.
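The constructed random sampling step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy allele labels, the allele frequency of 0.75, and the choice to draw with replacement within each allelic class are all hypothetical and are not taken from the original analysis.

```python
import random

def constructed_random_sample(class_a, class_b, freq_a, size, rng):
    """Build one CRS of `size` alleles.

    Assumption: the number of Pan IA alleles is fixed at round(freq_a * size)
    and sequences are drawn with replacement within each allelic class.
    """
    n_a = round(freq_a * size)      # alleles taken from the Pan IA class
    n_b = size - n_a                # remainder taken from the Pan IB class
    sample = rng.choices(class_a, k=n_a) + rng.choices(class_b, k=n_b)
    rng.shuffle(sample)
    return sample

def make_crs_set(class_a, class_b, freq_a, size, replicates=50, seed=0):
    """Return `replicates` independent CRSs for one population."""
    rng = random.Random(seed)
    return [constructed_random_sample(class_a, class_b, freq_a, size, rng)
            for _ in range(replicates)]

# Hypothetical population: 12 sequenced alleles per class, Pan IA at 0.75.
pan_a = [f"A{i}" for i in range(12)]
pan_b = [f"B{i}" for i in range(12)]
crs_set = make_crs_set(pan_a, pan_b, freq_a=0.75, size=24)
```

Each of the 50 lists in `crs_set` can then be passed to whatever routine computes the polymorphism statistics or neutrality tests.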

Heterogeneity of Pan I allele and haplotype frequencies among populations was tested using FST estimates obtained from BIOSYS-1 (SWOFFORD and SELANDER 1989). An analysis of molecular variation (AMOVA) was used to test for differences in the patterns of nucleotide polymorphism segregating within the Pan IA and Pan IB allelic classes between populations (EXCOFFIER et al. 1992). PhiST statistics were estimated from p-distances among Pan IA and Pan IB haplotypes obtained from MEGA ver. 1.01 (KUMAR et al. 1993) and were tested for significance by performing 5000 permutations of the null distributions of each variance component. Composite measures of linkage disequilibrium among restriction sites in the Pan I gene region were obtained using the LD86.FOR program of WEIR (1990). D values were tested for significance by chi-square tests and standardized to D' values to allow comparisons among populations differing in allele frequencies. Mantel tests were done using the ISOLDE subprogram of GENEPOP ver. 1.2 (RAYMOND and ROUSSET 1995).
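To make the standardization of D to D' concrete, the following Python sketch computes both quantities for a pair of biallelic sites. It is a simplified illustration: it uses haplotype (gametic) frequencies rather than the composite genotypic measure computed by LD86.FOR, and the function names and input values are hypothetical.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D') for two biallelic sites.

    p_ab : frequency of the haplotype carrying allele A at site 1 and B at site 2
    p_a  : frequency of allele A at site 1
    p_b  : frequency of allele B at site 2
    """
    d = p_ab - p_a * p_b
    # D' divides D by the largest magnitude it could take for these frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime

def ld_chi_square(d, p_a, p_b, n_haplotypes):
    """Chi-square statistic (1 d.f.) for the hypothesis D = 0."""
    return n_haplotypes * d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical frequencies: two sites in complete association.
d, d_prime = linkage_disequilibrium(p_ab=0.5, p_a=0.5, p_b=0.5)
chi2 = ld_chi_square(d, 0.5, 0.5, n_haplotypes=100)
```

Because D' is scaled by its frequency-dependent maximum, values near 1 in different populations are comparable even when allele frequencies differ widely.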

Phylogenetic analyses of Pan I alleles were performed using the neighbor-joining algorithm of SAITOU and NEI (1987) implemented in MEGA and by maximum parsimony using PAUP ver. 3.1 (SWOFFORD 1993). Both trees were rooted using the Greenland cod, Gadus ogac, as an outgroup. Estimates of nucleotide polymorphism (both π and θ) present within Pan I alleles and in the CRSs were obtained using DnaSP ver. 2.2 (ROZAS and ROZAS 1997). The DnaSP program was also used to perform TAJIMA's (1989) and FU and LI's (1993) tests of neutrality (the latter using G. ogac as the outgroup sequence). An intraspecific MCDONALD and KREITMAN (1991) test also was performed using DnaSP, treating the two Pan I alleles as independent evolutionary lineages.
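For readers unfamiliar with the two polymorphism estimators, the sketch below computes per-site nucleotide diversity (π, the mean number of pairwise differences per site) and Watterson's θ (the number of segregating sites scaled by the harmonic number for the sample size) from a small alignment. It assumes equal-length, gap-free sequences; the function names and toy sequences are hypothetical, and DnaSP's handling of gaps and missing data is not reproduced.

```python
from itertools import combinations

def segregating_sites(seqs):
    """Count alignment columns showing more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)

def nucleotide_diversity(seqs):
    """pi: average number of pairwise differences, per site."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs) / len(seqs[0])

def watterson_theta(seqs):
    """theta: S divided by a1 = sum_{i=1}^{n-1} 1/i, per site."""
    n = len(seqs)
    a1 = sum(1 / i for i in range(1, n))
    return segregating_sites(seqs) / a1 / len(seqs[0])

# Toy alignment of four 8-bp sequences (not real Pan I data).
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
pi = nucleotide_diversity(toy)
theta = watterson_theta(toy)
```

π is sensitive to the frequencies of the variants whereas θ depends only on how many sites segregate, which is why the two can respond differently to changes in allele frequency.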


*  RESULTS

Amino acid sequence and structure of cod pantophysin:
cDNA clone GM798 had an open reading frame of 222 amino acids and a 186-bp transcribed but untranslated 3' tail. The gene was originally identified as the cod synaptophysin (Syp I) locus (FEVOLDEN and POGSON 1997) but may represent the cod homologue of a recently discovered Syp I isoform called pantophysin (HAASS et al. 1996). Both physins belong to a growing family of integral membrane proteins found in synaptic or cytoplasmic vesicles that are characterized by four membrane-spanning domains, two intravesicular loops, and two cytoplasmic tails (FERNANDEZ-CHACON and SUDHOF 1999). Fig 2 presents the deduced amino acid structure of the cod physin aligned with pantophysin sequences from mouse and human (both from HAASS et al. 1996) and the closely related synaptophysin sequences from both species (from SUDHOF et al. 1987 and GAITANOU et al. 1997, respectively). Excluding the 27 amino acids missing from the amino terminus of the cod clone, amino acid identities between the G. morhua protein and pantophysin from mouse (49.8%) and human (50.5%) are only marginally higher than between the synaptophysins from both species (46.1 and 48.1%, respectively). Identities are highest in the four membrane-spanning domains (labeled M1–M4) and the charged residues that flank these regions as noted in previous studies (JOHNSTON et al. 1989; COWAN et al. 1990). The two intravesicular loops of the protein (called IV1 and IV2) and the short 3' cytoplasmic tail were highly diverged and difficult to align. Although the G. morhua physin is almost equally related to the two mammalian proteins, it is more likely to be pantophysin on the basis of (i) its truncated carboxy terminus (22 amino acids in length), which is lacking the characteristic proline- and tyrosine-repeating motifs present in all synaptophysins characterized to date and (ii) its isolation from liver tissue where synaptophysin expression is expected to be absent (except in nerve fibers).



Figure 2. Deduced amino acid structure of cod pantophysin (Pan I) aligned with mouse and human pantophysin and synaptophysin (Syp I) sequences. The complete 3' cytoplasmic tails of the mouse and human synaptophysins are not presented. Solid lines indicate positions of the four membrane-spanning domains (labeled M1–M4) and the dotted lines show the positions of the two intravesicular loops (IV1 and IV2) and the cytoplasmic tail (CYT2).

Linkage disequilibrium in Pan I gene region:
Table 2 presents estimates of linkage disequilibrium between three polymorphic restriction sites that span a 5.7-kb region of the Pan I gene region (see Fig 1). Highly significant disequilibrium was detected between all pairs of sites in all populations with the exception of the flanking BstEIIB and PstIA sites in Nova Scotia. This strong disequilibrium caused two common haplotypes to predominate in most populations (numbered 1 and 2 in Table 1). D values were consistent in sign across all populations and the standardized coefficients approached their maximum theoretical limits everywhere except the two NW Atlantic populations. In Nova Scotia, this was caused by the high frequency of one haplotype (numbered 3) formed by a recombination event between the DraIA and PstIA sites that had the effect of uncoupling the two flanking sites. In Newfoundland, this recombinant haplotype was less frequent and the disequilibrium between the two flanking restriction sites was diminished but still significant.


 
Table 2. Linkage disequilibrium in the Pan I gene region

Nucleotide polymorphism:
A total of 62 Pan IA and 62 Pan IB alleles were sequenced from five different populations of G. morhua. The gene region sequenced contained four exons (208 amino acids) and four introns whose locations were identical to those described in mammalian pantophysin and synaptophysin genes by HAASS et al. (1996). Polymorphic sites are presented in Fig 3 (along with sequence from the outgroup G. ogac) and the levels of nucleotide polymorphism present within the Pan IA and Pan IB alleles in different populations are summarized in Table 3. All Pan IA alleles were 1851 bp in length with the exception of one allele (BA140-A), which contained three rather than two copies of a CA repeat. The Pan IB alleles were either 1845 or 1857 bp in length depending on the presence or absence of a 12-bp deletion at position 236 in the second intron.



Figure 3. Nucleotide polymorphism at the Pan I locus of G. morhua. Sequences are presented for the 34 unique G. morhua haplotypes and for the outgroup, G. ogac. Base positions of each variable site and the locations of introns (labeled I2–I5) and exons (labeled E3–E6) are indicated at the top of the figure. Four insertion/deletion mutations identified in the second intron are also shown: ▽1, a single CA insertion; ▽2, a 12-bp deletion of GCATAGTAAAAA; ▽3, a 6-bp insertion of TGTTTT; ▽4, a 6-bp insertion of TTTTTT. Amino acid replacement mutations have been underlined. BF, Balsfjord; BS, Barents Sea; IC, Iceland; NF, Newfoundland; NS, Nova Scotia.


 
Table 3. Nucleotide polymorphism in Pan IA and Pan IB alleles from different populations

A total of 52 polymorphic nucleotide sites were identified in the total sample. Twenty-six segregating sites (and one insertion) were detected in the sample of Pan IA alleles distributed among 25 haplotypes. In contrast, only 11 segregating sites (and one deletion) were found in the sample of Pan IB alleles represented among 9 haplotypes. In the pooled sample the Pan IA alleles exhibited levels of nucleotide diversity (π) and θ that were more than twice those observed for the Pan IB alleles (Table 3). Levels of nucleotide polymorphism varied considerably among populations from the NW and NE Atlantic. For the Pan IA alleles variability was lowest in Nova Scotia and Newfoundland and highest in the three NE Atlantic populations. The Pan IB alleles exhibited extremely low levels of polymorphism in the NE Atlantic but approached the levels of variability shown by the Pan IA alleles in the NW Atlantic. A negative relationship was seen between the levels of nucleotide diversity and the population frequency of both Pan IA alleles (r = -0.873, P = 0.053) and Pan IB alleles (r = -0.511, P = 0.379) but neither correlation was significant.

In contrast to the minimal variation present within the Pan IA and Pan IB allelic groups, 15 nucleotide mutations and a 6-bp insertion were fixed between the two alleles (Fig 3). The average number of nucleotide differences between any two randomly sampled Pan IA and Pan IB alleles (19.0) far exceeded that found within either allelic group (2.3 and 1.0, respectively). Because the majority of the variation was present between rather than within allelic classes, nucleotide diversity levels were strongly affected by the differences in allele frequencies among populations shown in Table 1. Estimates of nucleotide polymorphism in the five populations are presented in Table 4 from 50 constructed random samples that reflected a priori known differences in Pan I allele frequencies. Nucleotide diversity was highest in the three populations with intermediate frequencies of both alleles (Newfoundland, Iceland, and Balsfjord) and fell sharply in populations with high frequencies of either Pan IA (Nova Scotia) or Pan IB (Barents Sea). In contrast, θ was relatively invariant among populations because the number of segregating sites was largely determined by the presence of both alleles.
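The contrast between within-class and between-class differences can be illustrated with a short Python sketch. The two helper functions and the toy sequences below are hypothetical; they assume aligned, gap-free sequences of equal length and simply count mismatches.

```python
from itertools import combinations

def count_differences(seq_x, seq_y):
    """Number of sites at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(seq_x, seq_y))

def mean_differences_within(group):
    """Average pairwise differences among alleles of one class."""
    pairs = list(combinations(group, 2))
    return sum(count_differences(x, y) for x, y in pairs) / len(pairs)

def mean_differences_between(group_x, group_y):
    """Average differences over all (x, y) pairs drawn from two classes."""
    total = sum(count_differences(x, y) for x in group_x for y in group_y)
    return total / (len(group_x) * len(group_y))

# Toy allelic classes (not real Pan I sequences).
class_a = ["AAAAAA", "AAAAAT"]
class_b = ["CCAAAA", "CCTAAA"]
within_a = mean_differences_within(class_a)
between = mean_differences_between(class_a, class_b)
```

When the between-class average greatly exceeds both within-class averages, as reported here, overall diversity in a sample is driven mainly by how evenly the two classes are represented.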


 
Table 4. Nucleotide polymorphism in the 50 constructed random samples

One-quarter of the polymorphisms detected in the study (13) fell within coding DNA and nine involved amino acid replacements (Table 5). Six of the nine replacement mutations were fixed between the two Pan I alleles (three within each allelic lineage) and all occurred within the first intravesicular (IV1) domain of the protein. Two codon positions (61 and 64) had each experienced two mutations so that at the protein level the two Pan I alleles differed by four amino acids. Based on the classification scheme of TAYLOR (1986), all nine amino acid replacement mutations were radical changes (six involving charged residues). Three replacement mutations were also detected segregating within Pan IA and Pan IB allelic groups. Two were singletons found in Norwegian waters (positions 92 and 214). However, the third mutation involved a very radical change (aspartic acid to lysine) in the IV1 domain of the protein and was detected in 22 of the 62 Pan IA alleles sequenced. This mutation (hereafter the Pan IA' allele) was fixed in the 19 haplotypes previously identified as recombinants between the DraIA and PstIA sites (haplotype 3 in Table 1) that were chosen for sequencing. It was also found in 3 of the 5 nonrecombinant Pan IA haplotypes (haplotype 2 in Table 1) sampled from Nova Scotia but not in the same haplotype sampled from any other population.


 
Table 5. Amino acid replacement mutations

The distribution of polymorphism across the Pan I gene region was examined by the sliding window approach of KREITMAN and HUDSON (1991). Nucleotide diversity exhibited little heterogeneity across the pantophysin gene region when the Pan IA and Pan IB alleles were analyzed separately (Fig 4). However, when both alleles were included in the analysis two peaks of polymorphism were identified. The first peak corresponded to a 30-bp region in the second intron (positions 236 to 265) that was capable of forming a stem-loop structure in Pan IA alleles but had been disrupted by two insertion/deletion events in Pan IB alleles (see Fig 3). The second peak of polymorphism occurred in the region of the IV1 domain of the protein in the fourth exon that was segregating for six amino acid replacement mutations (positions 745 to 799 in Fig 3). When only silent positions were included in the sliding window analysis, the latter peak of polymorphism disappeared (not shown).



Figure 4. Sliding window analysis of nucleotide polymorphism across the Pan I gene region. Insertion/deletions in the second intron have been included as single mutational events. Analyses are presented for the 62 Pan IA alleles (dashed line), the 62 Pan IB alleles (dotted line), and the complete data set (solid line). The window size was 75 bp and the step size was 20 bp. The positions of introns and exons (labeled E3–E6) are shown at the top of the figure.
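A sliding window profile of this kind can be sketched as follows, using the 75-bp window and 20-bp step stated in the caption. The function names and the toy two-sequence alignment are hypothetical, sequences are assumed to be aligned and gap-free, and the sketch does not reproduce DnaSP's treatment of insertion/deletions.

```python
from itertools import combinations

def window_pi(seqs, start, end):
    """Per-site nucleotide diversity for alignment columns start..end-1."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1[start:end], s2[start:end]))
                for s1, s2 in pairs)
    return diffs / len(pairs) / (end - start)

def sliding_window_pi(seqs, window=75, step=20):
    """Return (window midpoint, pi) pairs along the alignment."""
    length = len(seqs[0])
    profile = []
    for start in range(0, length - window + 1, step):
        midpoint = start + window // 2
        profile.append((midpoint, window_pi(seqs, start, start + window)))
    return profile

# Toy alignment: two 200-bp sequences differing at a single site.
toy = ["A" * 200, "A" * 100 + "C" + "A" * 99]
profile = sliding_window_pi(toy)
```

Running the same function separately on each allelic class and on the combined data set would give three profiles analogous to the three lines in the figure.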

Phylogenetic analyses:
Genealogies of Pan I alleles were reconstructed by maximum parsimony and neighbor-joining approaches. A total of 32 parsimony-informative sites were identified, and these produced a single most parsimonious tree of 74 steps with a consistency index of 0.987. The parsimony and neighbor-joining (NJ) trees were identical except for the position of a small subclade of Pan IA alleles (not shown) and the NJ tree is presented in Fig 5. The Pan IA and Pan IB alleles formed two highly distinct clades of closely related sequences each having 100% bootstrap support. The Pan IB clade was dominated by a group of 52 alleles that exhibited extremely low variability and 10 additional alleles that were restricted to the NW Atlantic region. The former group (hereafter called ▽2 Pan IB alleles) was characterized by a 12-bp deletion in the second intron (position 236 in Fig 3) and two mutations in the fifth intron (positions 1580 and 1650 in Fig 3). The ancestral subclade of 10 Pan IB alleles from the NW Atlantic was identical to all Pan IA alleles at these two positions. The clade of Pan IA alleles was considerably more variable and possessed several subclades that exhibited limited geographic distribution. The most widely distributed subgroup was represented by the Pan IA' alleles characterized by the aspartic acid to lysine mutation in the IV1 domain. This mutation occurred at high frequencies in the NW Atlantic (0.687 in Nova Scotia and 0.320 in Newfoundland) but was rare in the NE Atlantic. Fig 5 also shows that the Pan IB alleles have experienced a faster rate of evolution than the Pan IA alleles. The genealogy underestimates the changes that have occurred in the lineage of Pan IB alleles because it does not include the insertions/deletions shown in Fig 3.



Figure 5. Neighbor-joining tree of 124 Pan I alleles. Numbers indicate the percentages of 100 bootstrap replicates supporting a specific clade. Bootstrap values below 60% are not shown. Clades corresponding to the Pan IA' and ▽2 Pan IB alleles are marked. BF, Balsfjord; BS, Barents Sea; IC, Iceland; NF, Newfoundland; NS, Nova Scotia.

Differentiation among populations:
The frequencies of the three restriction sites scored in the vicinity of the Pan I locus exhibit highly significant differences among populations of G. morhua (POGSON et al. 1995). FST values estimated for the BstEII, DraI, and PstI site polymorphisms in the five populations included in the present study are 0.229, 0.300, and 0.157, respectively. If Pan I haplotypes are considered instead of individual restriction sites FST is 0.229. All FST values are highly significant (P < 0.001). To examine whether heterogeneity existed among Pan IA and Pan IB allelic classes from different populations, an AMOVA was performed using p-distances estimated among haplotypes. Table 6 shows that significant differentiation was observed among populations for the variable sites identified within Pan IA and Pan IB allelic classes even though PhiST was low for both groups (0.119 and 0.152, respectively). This heterogeneity was caused by differences in the frequencies of the Pan IA' and ▽2 Pan IB alleles described in the previous section.
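As a rough illustration of what an FST value summarizes, the sketch below computes Wright's FST for one biallelic site as the variance in allele frequency among populations divided by its maximum possible value. This is a simplified, unweighted form given as an assumption; it is not the estimator implemented in BIOSYS-1, and the function name and frequencies are hypothetical.

```python
def wright_fst(frequencies):
    """FST = Var(p) / (p_bar * (1 - p_bar)) for one biallelic site.

    `frequencies` holds the frequency of one allele in each population.
    Populations are weighted equally (an assumption of this sketch).
    """
    n = len(frequencies)
    p_bar = sum(frequencies) / n
    if p_bar in (0.0, 1.0):
        return 0.0                      # site is monomorphic overall
    var_p = sum((p - p_bar) ** 2 for p in frequencies) / n
    return var_p / (p_bar * (1 - p_bar))

# Hypothetical allele frequencies in two populations.
fst = wright_fst([0.2, 0.8])
```

A value of 0 indicates identical frequencies everywhere and a value of 1 indicates fixation of alternative alleles in different populations.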


 
Table 6. AMOVA results

Unlike the majority of nuclear restriction fragment length polymorphism (RFLP) loci examined in G. morhua, the individual restriction site polymorphisms scored in the Pan I gene region do not exhibit relationships between gene flow and geographic distance over the North Atlantic region (POGSON et al. 2001). The slope of the regression of log (gene flow) vs. log (geographic distance) for the Pan I locus was positive, suggesting that populations sampled at greater geographic distances are genetically more similar than populations sampled at shorter geographic distances. However, this conclusion derived from analyses performed on the frequencies of single restriction sites among populations, not the relatedness of the alleles themselves. To examine whether allelic similarity was related to geographic distance, the average number of nucleotide substitutions per site (dXY) between Pan IA and Pan IB alleles sampled from different populations was regressed against the distance separating the populations. Strong positive correlations between dXY and distance were present for both Pan IA (r = 0.629, P = 0.047) and Pan IB alleles (r = 0.573, P = 0.066) although Mantel tests indicated that the relationship was significant only for the former. The positive relationships observed between allelic divergence and geographic distance for both Pan I alleles considered individually contrast with the patterns exhibited by their population frequencies.
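The logic of a Mantel test for association between a genetic distance matrix and a geographic distance matrix can be sketched as a simple permutation procedure. This is an illustrative implementation using Pearson's correlation and a one-sided test; it is not the ISOLDE routine of GENEPOP, and the function names, coordinates, and distances are hypothetical.

```python
import random
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """One-sided Mantel test for two symmetric distance matrices."""
    n = len(dist_a)
    pairs = list(combinations(range(n), 2))
    x = [dist_a[i][j] for i, j in pairs]

    def corr(order):
        # Relabel the populations of the second matrix, then correlate.
        y = [dist_b[order[i]][order[j]] for i, j in pairs]
        return pearson_r(x, y)

    order = list(range(n))
    observed = corr(order)
    rng = random.Random(seed)
    hits = 1                            # the observed arrangement counts once
    for _ in range(permutations):
        rng.shuffle(order)
        if corr(order) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)

# Hypothetical populations on a line; divergence proportional to distance.
coords = [0, 1, 3, 6, 10]
geo = [[abs(a - b) for b in coords] for a in coords]
gen = [[0.002 * d for d in row] for row in geo]
r_obs, p_value = mantel_test(geo, gen)
```

Permuting population labels rather than individual matrix cells respects the non-independence of pairwise distances, which is why a Mantel P value can differ from the P value of an ordinary regression on the same pairs.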

Tests of neutrality:
Results of Tajima's and Fu and Li's tests for neutrality on the 50 constructed random samples (CRSs) are presented in Table 7. Tajima's D statistic was negative in Nova Scotia and the Barents Sea (indicating an excess of low-frequency sites) but was not significant in any of the 100 individual tests. Positive values of Tajima's D were found in the other populations but only Balsfjord produced a substantial number of significant test statistics (all 50 tests yielding P < 0.10, of which 32 were less than 0.05). Highly variable results were also observed for Fu and Li's D and F statistics. Some populations produced significant values for D but not F (Nova Scotia) and for F but not D (Balsfjord). The Iceland population produced a moderate number of positive tests for both statistics. Surprisingly, no significant test results were found in the 50 constructed random samples pooled from all five populations despite the fact that these samples were five times larger than the single population CRSs. Although the statistical meaning of these tests is unclear, the negative results obtained for the pooled CRS of 124 alleles were unexpected given the strong signal of selection in the data. An intraspecific McDonald and Kreitman test did, however, produce a significant result (Table 8) due to the proportion of fixed replacement differences between Pan I alleles (66.7%) being much higher than that of fixed silent differences (21.4%).
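For readers unfamiliar with the statistic, the following is a minimal sketch of how Tajima's D is computed from a set of aligned sequences, using the standard formulas of TAJIMA (1989). The function name and the toy alignments are hypothetical; the sketch assumes equal-length sequences without gaps or missing data and does not compute a P value.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences (strings)."""
    n = len(seqs)
    if n < 4 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least 4 aligned sequences of equal length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("D is undefined when there are no segregating sites")

    # k: average number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / math.sqrt(variance)


# Toy alignments (hypothetical): singletons only vs. two deep lineages.
print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"]))  # excess of rare variants
print(tajimas_d(["AA", "AA", "TT", "TT"]))          # intermediate frequencies
```

The first toy alignment contains only singleton mutations and gives a negative D; the second contains two divergent haplotype classes at equal frequency and gives a positive D, which is the direction expected when two long-diverged allelic classes are pooled in one sample.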


 

 
Table 7. Results of tests of neutrality on the 50 constructed random samples


 

 
Table 8. McDonald and Kreitman test


*  DISCUSSION

Spatial patterns of variation have commonly been used to identify genetic loci responding to some form of natural selection. In the present study, the pantophysin (Pan I) locus of G. morhua was chosen for molecular characterization because of its exceptionally high differentiation among populations (POGSON et al. 1995) and the highly significant linkage disequilibrium present among three restriction site polymorphisms spanning the gene region (Table 2). Examination of nucleotide sequences of 124 Pan I alleles sampled from five populations of G. morhua has provided compelling evidence that this locus is indeed experiencing strong natural selection. The Pan IA and Pan IB alleles are highly differentiated at both the nucleotide level (differing on average by 19 mutations) and the protein level (each having undergone three amino acid substitutions since diverging from a common ancestral allele). All six replacement mutations cluster in the 56-amino-acid IV1 domain of the protein and each involves a radical substitution (Table 5). Recently derived mutations are also detected segregating within the Pan IA and Pan IB allelic groups, and these exhibit significant heterogeneity among populations. Both mutations occur in regions of the gene exhibiting peaks of divergence between alleles, suggesting that historical and contemporary forms of selection acting at this locus are equivalent. The linkage disequilibrium and heterogeneity among populations thus do not appear to result from stable spatially varying selection but from the recent appearance and spread of selectively favored mutations in both allelic groups in different geographic areas.

There are two explanations for the large number of fixed differences detected between the two common Pan I alleles. One possibility is that the Pan I locus has experienced a prolonged period of balancing selection during which time recombination has played a minimal role in confounding the evolutionary histories of the two alleles. The other explanation is that the two alleles have spent most of their evolutionary histories in geographical isolation and have only recently been mixed together in extant populations. This "historical isolation" hypothesis can account for (i) the high divergence between alleles (i.e., strong directional selection favoring different mutations in different regions) and (ii) the strong linkage disequilibrium in the Pan I gene region (i.e., recombination has yet to break apart the historical associations). Although intuitively appealing, the historical isolation hypothesis makes two predictions that are not supported by the available data. First, it predicts that linkage disequilibrium should be common throughout the genome of G. morhua because all loci would have experienced similar histories of isolation. This prediction can be tested by examining linkage disequilibrium in the vicinity of two other nuclear loci (GM727 and GM842) scored for multiple restriction site polymorphisms by POGSON et al. (1995). Table 9 shows that no detectable linkage disequilibrium exists at either RFLP locus. Second, the historical isolation hypothesis predicts that RFLP alleles at loci other than Pan I should exhibit differentiation at the nucleotide level (albeit at reduced levels) even if recombination has broken apart associations assessed at larger distances. Preliminary sequence data collected for a 960-bp region of the G. morhua GM842 locus have found no differences among RFLP alleles (G. H. POGSON, unpublished data). These observations fail to support the historical isolation hypothesis and suggest that the Pan I locus has experienced a very different evolutionary history from other genes.


 

 
Table 9. Linkage disequilibrium in the GM842 and GM727 gene regions

Evidence that natural selection can act at the Pan I locus while both alleles coexist in the same population is provided by the distributions of recently derived mutations segregating within both allelic classes. These distributions suggest that two selective sweeps may be occurring among populations of G. morhua: the eastward movement of the Pan IA' allele and the westward spread of the ▿2 Pan IB allele. The Pan IA' allele (having an aspartic acid to lysine mutation in the IV1 domain of the protein) probably originated in the Nova Scotia region where it is distributed among two haplotypes and occurs at high frequency (P = 0.687). The ▿2 Pan IB allele (characterized by a 12-bp deletion in the second intron) is likely to have originated in the Barents Sea region where it is nearly fixed (P = 0.921). Two observations suggest that both alleles have recently displaced previously abundant alleles in their centers of origin. First, the Pan IA' and ▿2 Pan IB alleles exhibit very low nucleotide diversities (π = 0.0049 and 0.00020, respectively) compared to the inclusive allelic groups summarized in Table 3. Second, in geographic regions where the Pan IA' and ▿2 Pan IB alleles are most abundant, the alternate alleles exhibit their highest nucleotide diversities. Pan IA alleles are most variable in the Barents Sea (π = 0.00147) where they occur at a frequency of only 0.073. Similarly, Pan IB alleles are most polymorphic in Nova Scotia (π = 0.00089) but are present at a frequency of only 0.100. These patterns are consistent with recent increases in the frequencies of selectively favored alleles at the expense of previously common alleles that had accumulated some silent polymorphism. Although this scenario hardly guarantees a stable balanced polymorphism, it suggests that evolutionary change can occur rapidly within both allelic groups without the need for geographic isolation.

An unusual combination of balancing and directional selection is suggested by the genealogy of the two Pan I alleles shown in Fig. 5. These two forms of selection are known to exert opposing effects on the predicted levels of silent polymorphism and the structures of allelic genealogies. Balancing selection is expected to significantly extend coalescence times (TAKAHATA 1990; TAKAHATA and NEI 1990) and lead to an accumulation of neutral polymorphism surrounding the site(s) affected by selection (STROBECK 1983; HUDSON and KAPLAN 1988). In contrast, directional selection is expected to shorten coalescence times and significantly reduce linked silent variation through hitchhiking effects (MAYNARD SMITH and HAIGH 1974; KAPLAN et al. 1989) that may extend large distances from the selected locus (see HUDSON et al. 1997). If balancing selection is responsible for the long lineages of Pan I alleles, it has clearly not led to an elevation of linked silent polymorphism. Alleles within each group differ, on average, by only a few mutations and exhibit levels of nucleotide diversity well below that typically found at autosomal loci in Drosophila (reviewed by MORIYAMA and POWELL 1996). The low within-allele diversity compared to the high between-allele divergence can only be explained by diversity-reducing processes like population bottlenecks, selective sweeps, or background selection (CHARLESWORTH et al. 1993). Some support exists for selective sweeps as the cause of the reduced diversity because the amino acid substitutions required to purge linked silent polymorphism have occurred within both allelic groups.

The molecular evidence to date indicates that long-lived balanced polymorphisms are rare. Notable exceptions include the Mhc class I and II loci in vertebrates and S alleles in plants, both of which possess a high number of alleles that commonly have long coalescence times that transcend species boundaries (reviewed by KLEIN et al. 1998). If a prolonged period of balancing selection has been acting at the Pan I locus in G. morhua, it differs from Mhc and S loci by apparently favoring only two alleles that have each undergone repeated amino acid substitutions. Therefore, the mechanisms favoring high allelic diversity at Mhc and S loci do not appear to be operating at the pantophysin locus of G. morhua. The rapid turnover of alleles at the Pan I locus of G. morhua is, however, consistent with data accumulating from Drosophila suggesting that many balanced polymorphisms may have evolved recently and, unlike the Adh locus in D. melanogaster (KREITMAN and HUDSON 1991), have not accumulated silent polymorphism around the selected site. The list of allozyme polymorphisms studied at the molecular level in D. melanogaster that fail to show statistical support for persistent balancing selection includes Est6 (COOKE and OAKESHOTT 1989), Gpdh (TAKANO et al. 1993), 6Pgd (BEGUN and AQUADRO 1994), G6pd (EANES et al. 1993, 1996), Sod (HUDSON et al. 1994, 1997), and Tpi (HASSON et al. 1998).

A striking feature of the genealogy of Pan I alleles is the apparent absence of intragenic recombination. Although this gene may occur in a region of low recombination, this result is surprising because recombination was detected among three polymorphic restriction sites scored in the vicinity of the Pan I locus (especially in the NW Atlantic where four recombinant haplotypes were present). No intragenic recombination events were detected within either allelic group. However, one Pan IA allele (NS28-A) had an A to G mutation at position 1407 that was fixed in all 62 Pan IB alleles (Fig. 3). A group of 10 Pan IB alleles was also found in the NW Atlantic having mutations at positions 1580 and 1650 that were both fixed in Pan IA alleles. Both may represent cases of interallelic recombination or gene conversion in the 3' region of the gene. No recombination was detected at the 5' end of the gene where the pattern of fixed differences between alleles suggests that such events could be disadvantageous.

The two Pan I alleles are most highly differentiated in a 30-bp region of the second intron and a 54-bp region in the fourth exon where five of the six amino acid replacements have occurred. The intron region is capable of forming a stem-loop structure and thus may affect pre-mRNA stability and/or processing. All intron insertion/deletion mutations have occurred within the Pan IB lineage approximately 400 bp upstream from three amino acid changes. It is possible that epistatic natural selection is maintaining the association of the intron and amino acid mutations in the Pan IB alleles, thus generating linkage disequilibrium. A similar link between intron and amino acid polymorphisms has been made for the Adh locus of D. melanogaster where a compound insertion/deletion mutation (▿1) in the first intron exists in linkage disequilibrium with the fast allele and exhibits parallel clinal variation with the mutation producing the fast/slow allozyme polymorphism in eastern North America (BERRY and KREITMAN 1993). If epistatic selection is acting at the Pan I locus between the second intron and fourth exon, it is too restrictive to explain the disequilibrium detected over the entire gene region. Only selective sweeps driven by mutations in the second intron and/or fourth exon, or perhaps at closely linked genes, can account for the linkage disequilibrium present in the Pan I gene region.

The biochemical basis of how natural selection may be acting at the Pan I locus of G. morhua is unknown. Pantophysin is a recently discovered cellular isoform of the neuroendocrine integral membrane protein synaptophysin (HAASS et al. 1996). Using immunoelectron microscopy, pantophysin has been localized to small (<100 nm) cytoplasmic microvesicles that likely function in membrane-trafficking pathways of various cell types (see HAASS et al. 1996). The tissue-specific expression of pantophysin appears variable and not closely paralleled by other vesicle-associated membrane proteins such as VAMPs and SCAMPs (WINDOFFER et al. 1999). Although nothing is known of the functioning of pantophysin in fishes, the differences detected between the Pan I alleles of G. morhua suggest that the polymorphism could be related to the differential expression and/or functioning of the protein in different tissues. This possibility can be tested by comparing the in situ levels and/or distribution of pantophysin in different tissues for different Pan I genotypes. The intravesicular loops of physins (notably synaptophysin) have not previously been identified as being important domains of the protein (see JOHNSTON et al. 1989). However, the strong footprint of selection in the IV1 domain of pantophysin in G. morhua suggests that it must be performing some critical function(s).

The form of balancing selection that could be operating at the Pan I locus is also unclear. The recent origin and spread of the Pan IA' and ▿2 Pan IB alleles suggest that stable spatially varying selection is not favoring different alleles in different regions. Overdominance also appears unlikely because the Pan I locus was the only marker not to contribute to a correlation between DNA heterozygosity and growth rate in G. morhua (POGSON and FEVOLDEN 1998) and because the near fixation of Pan I alleles in the Barents Sea, Nova Scotia, and the North Sea is inconsistent with a general fitness advantage expected for heterozygotes. Frequency-dependent selection may be operating at the Pan I locus but the mechanism(s) that would prevent complete fixation of alleles is unknown. The recent spread of selectively favored mutations in both allelic classes demonstrates the extremely dynamic nature of the Pan I polymorphism of G. morhua. Hitchhiking events that are not occurring uniformly across species ranges have also been described in D. melanogaster (BEGUN and AQUADRO 1993) and D. ananassae (STEPHAN and MITCHELL 1992).

Simulation studies have shown that TAJIMA's (1989) D and FU and LI's (1993) F and D statistics have reasonable power in detecting selective sweeps caused by the fixation of advantageous mutations (BRAVERMAN et al. 1995; SIMONSEN et al. 1995). Although the genealogies of both Pan I alleles show evidence for selective sweeps, both tests of neutrality failed to produce consistent results for the CRSs assembled from the five populations and neither test produced a single significant result for the pooled CRS of 124 alleles. This unexpected result was apparently caused by combining both Pan I alleles in the analyses, which had the effect of eliminating the signal from the data. When the alleles are considered separately, Tajima's test is significant for Pan IA (D = -1.883, P < 0.05) and nearly significant for Pan IB (D = -1.583, 0.10 > P > 0.05) while Fu and Li's D and F statistics are significant for both Pan IA alleles (D = -2.954 and F = -3.045, both P < 0.02) and Pan IB alleles (D = -2.952 and F = -2.916, both P < 0.02). These results demonstrate that strong departures from neutral genealogies need not necessarily produce significant test statistics. The significant McDonald and Kreitman test (Table 8) was caused by the long independent evolutionary histories of the two alleles, which behaved as if they were sampled from different species. If recombination between alleles had been more extensive, this test would likely not have been significant.
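The logic of the McDonald and Kreitman comparison can be illustrated with a 2 x 2 test of independence on counts of fixed vs. polymorphic differences at replacement and silent sites. The sketch below uses a G-test, which is one common choice (Fisher's exact test is another); it is not a statement of which test was applied to Table 8. The function name is hypothetical, and the example counts are illustrative values chosen only so that the fixed proportions equal the 66.7% and 21.4% quoted in the text; the actual counts are those given in Table 8 and may differ.

```python
import math


def g_test_2x2(fixed_repl, poly_repl, fixed_silent, poly_silent):
    """G-test of independence for a McDonald-Kreitman style 2 x 2 table.

    Rows: replacement, silent. Columns: fixed, polymorphic.
    Returns (G, P) with P taken from a chi-square distribution with 1 d.f.
    No continuity or Williams correction is applied.
    """
    table = [[fixed_repl, poly_repl], [fixed_silent, poly_silent]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    g = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            if observed > 0:  # a zero cell contributes nothing to G
                expected = row_totals[i] * col_totals[j] / total
                g += 2.0 * observed * math.log(observed / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding error

    # Survival function of chi-square with 1 d.f.: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


# Illustrative counts only (assumed, not taken from Table 8):
# 6 of 9 replacement differences fixed (66.7%), 3 of 14 silent fixed (21.4%).
g, p = g_test_2x2(fixed_repl=6, poly_repl=3, fixed_silent=3, poly_silent=11)
print(f"G = {g:.2f}, P = {p:.3f}")
```

Under strict neutrality the ratio of fixed to polymorphic differences should be the same for replacement and silent changes, so an excess of fixed replacement differences inflates G.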

In summary, nucleotide sequence variation at the Pan I locus in G. morhua has provided strong evidence for an unusual mixture of balancing and directional selection. The significant linkage disequilibrium and large differences in the frequencies of Pan I alleles among populations do not appear to be caused by stable spatially varying selection but by the recent appearance and spread of selectively favored mutations in both allelic groups in different geographic areas. Although the two Pan I alleles have had long evolutionary histories, they have not accumulated polymorphism at linked silent sites because of repeated amino acid substitutions within each allelic lineage. The type of balancing selection that could be acting at the Pan I locus is presently unknown. However, the discovery of this polymorphism at 1 of the 11 anonymous cDNA-based RFLP markers initially chosen for population studies by POGSON et al. (1995) suggests that long-lived polymorphisms may be more common than previously believed.


*  ACKNOWLEDGMENTS

I thank C. T. Taggart, I. Hunt Von Herbing, M. Tupper, A. K. Danielsdottir, and S. E. Fevolden for their help in obtaining population samples, K. A. Mesa for technical assistance, and two anonymous reviewers for their helpful comments on improving the manuscript. Funding for the study was provided by the Ocean Production Enhancement Network (NSERC Canada) and by the University of California.

Manuscript received September 24, 1999; Accepted for publication September 25, 2000.


*  LITERATURE CITED

AQUADRO, C. F., D. J. BEGUN and E. C. KINDAHL, 1994 Selection, recombination and the levels of DNA polymorphism in Drosophila, pp. 45–56 in Non-neutral Evolution: Theories and Molecular Data, edited by G. B. GOLDING. Chapman & Hall, New York.

BEGUN, D. J. and C. F. AQUADRO, 1992  Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature 356:519-520.

BEGUN, D. J. and C. F. AQUADRO, 1993  African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature 365:548-550.

BEGUN, D. J. and C. F. AQUADRO, 1994  Evolutionary inferences from DNA variation at the 6-phosphogluconate dehydrogenase locus in natural populations of Drosophila: selection and geographic differentiation. Genetics 136:155-170.

BERRY, A. and M. KREITMAN, 1993  Molecular analysis of an allozyme cline: alcohol dehydrogenase in Drosophila melanogaster on the east coast of North America. Genetics 134:869-893.

BRAVERMAN, J. M., R. R. HUDSON, N. L. KAPLAN, C. H. LANGLEY, and W. STEPHAN, 1995  The hitchhiking effect on the site frequency spectrum of DNA polymorphisms. Genetics 140:783-796.

CAVALLI-SFORZA, L., 1966  Population structure and human evolution. Proc. R. Soc. Lond. Ser. B Biol. Sci. 164:362-379.

CHARLESWORTH, B., 1998  Measures of divergence between populations and the effects of forces that reduce variability. Mol. Biol. Evol. 15:538-543.

CHARLESWORTH, B., M. T. MORGAN, and D. CHARLESWORTH, 1993  The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289-1303.

CHARLESWORTH, B., M. NORDBORG, and D. CHARLESWORTH, 1997  The effects of local selection, balanced polymorphism and background selection on equilibrium patterns of genetic diversity in subdivided populations. Genet. Res. 70:155-174.

COOKE, P. H. and J. G. OAKESHOTT, 1989  Amino acid polymorphisms for esterase-6 in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 86:1426-1430.

COWAN, D., M. LINIAL, and R. H. SCHELLER, 1990  Torpedo synaptophysin: evolution of a synaptic vesicle protein. Brain Res. 509:1-7.

EANES, W. F., M. KIRCHNER, and J. NOON, 1993  Evidence for adaptive evolution of the G6pd gene in the Drosophila melanogaster and Drosophila simulans lineages. Proc. Natl. Acad. Sci. USA 90:7475-7479.

EANES, W. F., M. KIRCHNER, J. NOON, C. H. BIERMANN, I.-N. WANG et al., 1996  Historical selection, amino acid polymorphism and lineage-specific divergence at the G6pd locus in Drosophila melanogaster and D. simulans. Genetics 144:1027-1041.

EXCOFFIER, L., P. E. SMOUSE, and J. M. QUATTRO, 1992  Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131:479-491.

FERNANDEZ-CHACON, R. and T. C. SÜDHOF, 1999  Genetics of synaptic vesicle function: toward the complete functional anatomy of an organelle. Annu. Rev. Physiol. 61:753-776.

FEVOLDEN, S. E. and G. H. POGSON, 1997  Genetic divergence at the synaptophysin (Syp I) locus among Norwegian coastal and north-east Arctic populations of Atlantic cod. J. Fish Biol. 51:895-908.

FU, Y.-X. and W.-H. LI, 1993  Statistical tests of neutrality of mutations. Genetics 133:693-709.

GAITANOU, M., A. MALAKI, E. MERKOURI, and R. MATSAS, 1997  Purification and cDNA cloning of mouse BM89 antigen shows that it is identical with the synaptic vesicle protein synaptophysin. J. Neurosci. Res. 48:507-514.

HAASS, N. K., J. KARTENBECK, and R. E. LEUBE, 1996  Pantophysin is a ubiquitously expressed synaptophysin homologue and defines constitutive transport vesicles. J. Cell Biol. 134:731-746.

HASSON, E., I.-N. WANG, L.-W. ZENG, M. KREITMAN, and W. F. EANES, 1998  Nucleotide variation in the triosephosphate isomerase (Tpi) locus of Drosophila melanogaster and Drosophila simulans. Mol. Biol. Evol. 15:756-769.

HEY, J., 1999  The neutralist, the fly and the selectionist. Trends Ecol. Evol. 14:35-38.

HUDSON, R. R., 1990 Gene genealogies and the coalescent process, pp. 1–44 in Oxford Surveys in Evolutionary Biology, edited by D. FUTUYMA and J. ANTONOVICS. Oxford University Press, Oxford.

HUDSON, R. R. and N. L. KAPLAN, 1988  The coalescent process in models with selection and recombination. Genetics 120:831-840.

HUDSON, R. R., K. BAILEY, D. SKARECKY, J. KWIATOWSKI, and F. J. AYALA, 1994  Evidence for positive selection in the superoxide dismutase (Sod) region of Drosophila melanogaster. Genetics 136:1329-1340.

HUDSON, R. R., A. G. SÁEZ, and F. J. AYALA, 1997  DNA variation at the Sod locus of Drosophila melanogaster: an unfolding story of natural selection. Proc. Natl. Acad. Sci. USA 94:7725-7729.

JOHNSTON, P. A., R. JAHN, and T. C. SÜDHOF, 1989  Transmembrane topography and evolutionary conservation of synaptophysin. J. Biol. Chem. 264:1268-1273.

JONSDOTTIR, O. D. B., A. K. IMSLAND, A. K. DANIELSDOTTIR, V. THORSTEINSSON, and G. NAEVDAL, 1999  Genetic differentiation among Atlantic cod in south and south-east Icelandic waters: synaptophysin (Syp I) and haemoglobin (HbI) variation. J. Fish Biol. 54:1259-1274.

KAPLAN, N. L., R. R. HUDSON, and C. F. LANGLEY, 1989  The "hitchhiking effect" revisited. Genetics 123:887-899.

KAROTAM, J., T. M. BOYCE, and J. G. OAKESHOTT, 1995  Nucleotide variation at the hypervariable esterase 6 isozyme locus of Drosophila simulans. Mol. Biol. Evol. 12:113-122.

KATZ, L. A. and R. G. HARRISON, 1997  Balancing selection on electrophoretic variation of phosphoglucose isomerase in two species of field cricket: Gryllus veletis and G. pennsylvanicus. Genetics 147:609-620.

KLEIN, J., A. SATO, S. NAGL, and C. O'HUIGIN, 1998  Molecular trans-species polymorphism. Annu. Rev. Ecol. Syst. 29:1-21.

KLIMAN, R. M. and J. HEY, 1993  DNA sequence variation at the period locus within and among species of the Drosophila melanogaster complex. Genetics 133:375-387.

KREITMAN, M., 1991 Detecting selection at the level of DNA, pp. 204–221 in Evolution at the Molecular Level, edited by R. K. SELANDER, A. G. CLARK and T. S. WHITTAM. Sinauer, Sunderland, MA.

KREITMAN, M. and H. AKASHI, 1995  Molecular evidence for natural selection. Annu. Rev. Ecol. Syst. 26:403-422.

KREITMAN, M. and R. R. HUDSON, 1991  Inferring the evolutionary history of the Adh and Adh-dup loci in Drosophila melanogaster from patterns of polymorphism and divergence. Genetics 127:565-582.

KUMAR, S., K. TAMURA and M. NEI, 1993 MEGA: Molecular Evolutionary Genetic Analysis, version 1.02. Pennsylvania State University, University Park, PA.

MAYNARD SMITH, J. and J. HAIGH, 1974  The hitch-hiking effect of a favorable gene. Genet. Res. 23:23-35.

MCDONALD, J. H. and M. K