Sequence Variation and Haplotype Structure at the Human HFE Locus
Christopher Toomajian^a and Martin Kreitman^a,b
a Committee on Genetics, University of Chicago, Chicago, Illinois 60637
b Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637
Corresponding author: Christopher Toomajian, University of Chicago, 1101 E. 57th St., Chicago, IL 60637. E-mail: cmtoomaj@midway.uchicago.edu
Communicating editor: Y.-X. FU
| ABSTRACT |
|---|
The HFE locus encodes an HLA class-I-type protein important in iron regulation and segregates replacement mutations that give rise to the most common form of genetic hemochromatosis. The high frequency of one disease-associated mutation, C282Y, and the nature of this disease have led some to suggest a selective advantage for this mutation. To investigate the context in which this mutation arose and gain a better understanding of HFE genetic variation, we surveyed nucleotide variability in 11.2 kb encompassing the HFE locus and experimentally determined haplotypes. We fully resequenced 60 chromosomes of African, Asian, or European ancestry as well as one chimpanzee, revealing 41 variable sites and a nucleotide diversity of 0.08%. This indicates that linkage to the HLA region has not substantially increased the level of HFE variation. Although several haplotypes are shared between populations, one haplotype predominates in Asia but is nearly absent elsewhere, causing higher than average genetic differentiation among the three major populations. Our samples show evidence of intragenic recombination, so the scarcity of recombination events within the C282Y allele class is consistent with selection increasing the frequency of a young allele. Otherwise, the pattern of variability in this region does not clearly indicate the action of positive selection at this or linked loci.
HFE was the first gene to be associated with hereditary hemochromatosis, a recessive disease common in many populations of European descent and characterized by iron overload.
A feature of HFE relevant to population genetic inference is its chromosomal context. HFE is found telomeric to the human leukocyte antigen (HLA) class-I region on chromosome 6p21 in an area referred to as the extended class-I region. The HLA has been the focus of polymorphism studies for decades, since the most polymorphic loci in the human genome are found here. Initial studies revealed that balancing selection has acted at a number of HLA genes.
4 Mb away from the highly polymorphic HLA-A locus, the genetic distance between them is ~1 cM.
Studies of HFE variation have focused on two amino acid polymorphisms that were discovered when the gene was mapped.
Several groups have proposed that selection has favored the C282Y mutation, but a detailed knowledge of the linked variation around this site is necessary to independently test this hypothesis at the nucleotide level. Two lines of evidence have led to the hypothesis of selection. One is based on the function of HFE, with the selective advantage for C282Y possibly stemming from its potential to prevent iron deficiency. The second is based on the seemingly incongruous observation that C282Y appears extremely young but is a relatively high-frequency mutation in European populations. Models of the decay of linkage disequilibrium (LD) over time estimate a young age (<100 generations) for the C282Y mutation (![]()
In this report we describe the nucleotide variation and haplotype structure in HFE for a worldwide population sample. We test whether the pattern of variability is consistent with an equilibrium neutral model of evolution. We compare the level of population variation and differentiation seen at HFE with similar studies of human genes. The partitioning of linked variation into haplotypes allows us to address the related questions of the origin of different alleles and the forces that have acted to produce their current global distribution and frequency.
| MATERIALS AND METHODS |
|---|
DNA samples:
A total of 30 samples (60 chromosomes) were chosen to represent ancestry from African, Asian, and European peoples. Identifiers in parentheses indicate the sample numbers from the Coriell Cell Repositories' National Institute of General Medical Sciences Human Genetic Mutant Cell Repository (http://arginine.umdnj.edu). Samples without these identifiers were either collected at the University of Chicago or provided by other labs. The 10 African samples include five Mbuti Pygmies (NA10492–NA10496) from the Ituri forest in northeast Zaire and one sample each of Kikuyu (NA00522), Ghanaian (NA02064A), Zulu (NA02476), !Kung (NA03043), and Luo (NA03190A) descent. The 10 Asian samples include five individuals of Chinese descent (including NA11321–NA11323), two samples of Korean descent (including NA00726), and one sample each of Filipino (NA10798), Khmer (NA11373), and Vietnamese (NA03037) descent. The 10 European samples include four samples from a previous study.
PCR and sequencing:
The region under study consists of bases 43,385–54,657 of the human hereditary hemochromatosis region (GenBank accession no. U91328).
1 kb were sequenced to determine the identity of nucleotides 31–11,244 (excluding external primer sequence). PCR products were used as templates for dRhodamine terminator cycle-sequencing reactions that were subsequently cleaned and run on an ABI 377 automated sequencer (Applied Biosystems, Foster City, CA). Chromatograms were imported into Sequencher v. 3.0 (Gene Codes, Ann Arbor, MI) for manual assembly of contigs and identification of polymorphic sites. Each base in the study was called, using at least single-fold coverage sequencing reads for each strand, except for a few small regions where sequence repeats made the reads in one direction of poor quality and bases were called using information primarily from one strand. Sequence from the HFE gene was also obtained from one common chimpanzee (Pan troglodytes) from DNA provided by Dr. D. H. Ledbetter. Most PCR and sequencing primers worked with the chimpanzee sample, and where gaps remained new PCR primers were designed.
For the remaining seven samples, a long-range PCR product of 11,273 bp was amplified with the Expand Long Template polymerase mix (Roche Molecular Biochemicals, Indianapolis), cloned with the Topo XL PCR cloning kit (Invitrogen, Carlsbad, CA), and the whole insert was sequenced directly. This method leads to the sequencing of PCR errors and may produce hybrid sequences derived from the maternal and paternal alleles. Therefore, we confirmed each difference from the reference sequence by either sequencing or DHPLC analysis (Transgenomic, Omaha) of smaller PCR products produced from genomic DNA.
Haplotype determination:
For the 4 samples from the ![]()
Data analysis:
The program DnaSP, ver. 3.53 (![]()
was calculated as described in ![]()
The mutational relationships among the experimentally determined haplotypes were visualized by using the reduced median (RM) algorithm of the program Network 2.0 (http://www.fluxus-engineering.com).
The program Arlequin 2.000 (![]()
CL (![]()
| RESULTS |
|---|
HFE diversity at the nucleotide level:
A total of 41 variable sites were identified in the 11,214-bp region of the 60 chromosomes surveyed: 38 diallelic single-nucleotide polymorphisms (SNPs), two triallelic SNPs, and one diallelic single-base indel polymorphism. The two SNPs found triallelic in the pooled sample indicate that at least two mutations have occurred at these sites in the history of this sample. Of the 41 polymorphic sites, 8 segregate singletons, with the more frequent allele matching the chimpanzee sequence in each case. Only two SNPs were in exons: SNP 6724, which causes the H63D amino acid polymorphism (![]()
Summary statistics describing the sequence diversity in the pooled and individual populations are presented in Table 1. Overall, average per-nucleotide expected heterozygosity, θw, for the total sample, estimated from the observed number of mutations (![]()
), an estimate of θ based on the average pairwise sequence difference (![]()
increases by ~8% to 0.091%. These estimates of θ assume an infinite-sites model, which is clearly violated since the data contain two SNPs that are triallelic. However, estimates of θ derived from finite-sites models lead to only negligible differences from the infinite-sites estimates for our data (![]()
distribution with parameter α = 0.1), the finite-sites estimates of θ are not substantially changed (π increases from 0.084 to 0.085%). Our estimate of π for HFE is slightly lower than the average for fourfold degenerate coding sites in humans (0.11%; ![]()
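The two diversity estimators discussed above can be illustrated with a minimal sketch. It assumes an infinite-sites model and haplotypes given as equal-length strings; the function names and the toy alignment are hypothetical and are not the authors' DnaSP workflow. The final line plugs in figures quoted in the text (60 chromosomes, 11,214 bp) and takes 42 as the number of SNP mutations (38 diallelic sites plus two triallelic sites counted twice), which is an assumption about how mutations were counted.

```python
from itertools import combinations


def segregating_sites(haplotypes):
    """Count alignment columns that carry more than one allele."""
    return sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)


def harmonic(n):
    """a_n = 1 + 1/2 + ... + 1/(n - 1), the scaling factor for n sequences."""
    return sum(1.0 / i for i in range(1, n))


def watterson_theta(haplotypes):
    """Per-locus theta_w = S / a_n (infinite-sites assumption)."""
    return segregating_sites(haplotypes) / harmonic(len(haplotypes))


def nucleotide_diversity(haplotypes):
    """Per-locus pi: mean number of differences over all sequence pairs."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total / len(pairs)


# Hypothetical toy alignment: 4 chromosomes, 4 sites, 2 of them variable.
toy = ["AAGT", "AAGT", "ACGT", "ACGA"]
theta_w = watterson_theta(toy)      # 2 / (1 + 1/2 + 1/3)
pi = nucleotide_diversity(toy)      # 7 differences over 6 pairs

# Scale check with numbers quoted in the text (42 mutations is an assumption).
theta_w_hfe_per_site = 42 / harmonic(60) / 11214
```

Dividing either per-locus value by the sequence length gives the per-nucleotide figure; with the numbers above, the per-site θw comes out near 0.08%, the same order as the nucleotide diversity reported in the abstract.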
Diversity in continental populations:
When the coding region is excluded, π increases by nearly the same percentage (~8%) for each population. As is commonly observed, the Africans have the highest nucleotide diversity and the largest number of population-specific SNPs, at 11. However, only 2 of these 11 are singletons, so that the value of θw, which can be greatly influenced by rare variants, does not rise above that of π. The nucleotide diversity for Europeans is only slightly lower than that for Africans, but they have many fewer population-specific SNPs than the Africans (4 vs. 11, Table 1). European HFE variability is not strictly a subset of that found in Africa, consistent with the previous finding of HFE polymorphisms with an apparent European origin (![]()
θw relative to π.
Allele frequency spectrum:
Test statistics that utilize the frequency spectrum of alleles within a locus may detect departures from an equilibrium neutral model caused by demographic forces such as population growth, contraction, and subdivision or by the effect of diversifying or directional selection on linked sites. TAJIMA's (1989) D statistic compares the two estimates of θ described above, while FU and LI's (1993) D statistic compares the number of singleton polymorphic sites with the estimate of θ based on segregating sites and incorporates outgroup information to infer derived alleles. Additionally, FAY and WU's (2000) H statistic can detect departures in the frequency spectrum due to recent hitchhiking events. None of these three statistics, calculated for each population and for the pooled sample (Table 1), are significant at the 5% level, so the equilibrium model of neutral evolution cannot be rejected. However, it should be noted that the small size of the individual populations (n = 20) limits the power of these tests.
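As an illustration of how Tajima's D contrasts the two estimates of θ, the sketch below implements the standard formula from Tajima (1989), taking the sample size, the number of segregating sites, and the mean number of pairwise differences as inputs. The function name and the example values are hypothetical, and the sketch does not reproduce the significance testing used for Table 1.

```python
import math


def tajimas_d(n, num_segregating, mean_pairwise_diff):
    """Tajima's D for n sequences, S segregating sites, and per-locus pi."""
    if num_segregating == 0:
        raise ValueError("D is undefined when there are no segregating sites")
    s = num_segregating
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Numerator: pi minus Watterson's estimate; denominator: its standard error.
    return (mean_pairwise_diff - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical example: 10 sequences with 10 singleton sites.
# Each singleton contributes 2/n to pi, so pi = 10 * 0.2 = 2.0.
d_excess_rare = tajimas_d(10, 10, 2.0)
```

An excess of rare variants pulls π below S/a1 and makes D negative, which is the pattern expected after population growth or a selective sweep; an excess of intermediate-frequency variants makes D positive.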
Divergence from chimpanzee and haplotype variation:
The SNP alleles in the total sample of 60 chromosomes were found to occur in 18 distinct haplotypes (19 if the singleton indel is included). These haplotypes are displayed in Fig 1 along with a haplotype composed of the ancestral state of each allele inferred from chimpanzee (P. troglodytes). The chimpanzee sequence for the complete region revealed 71 fixed differences from human, including 69 SNPs, a 2-bp indel, and a complex mutation involving a base change and a single-base indel at a neighboring site. Only one fixed difference was found in the coding region, and this was a synonymous change. The average number of nucleotide substitutions per site between human and chimpanzee is 0.690% (0.750% for noncoding sequence). A recent analysis of divergence levels between humans and chimpanzee reports an average distance of 1.03 ± 0.04% for introns, from 32 loci, with a combined length >41 kb once repetitive regions were removed (![]()
The low observed divergence might result from a high average degree of constraint for the whole HFE region, which is composed primarily of introns, untranslated regions (UTRs), and intergenic sequence. To address this question, we investigated divergence from the homologous sequence in mouse.
In addition to functional constraint, a low neutral mutation rate could contribute to the low divergence. Both mutation rates and sequence divergence are influenced by G + C content.
For every SNP in humans, one of the alleles was present in the homologous position in chimpanzee. In all but three cases, the more common SNP allele in the pooled sample corresponds to the inferred ancestral allele. For these exceptions (SNP alleles 11204C, 519A, and 7451A), the derived allele frequencies are 58, 67, and 72%, respectively. Inference of ancestral state based on only one outgroup can be incorrect, but at least 3 derived neutral alleles out of 42 are expected to have >50% frequency in a sample of this size. No haplotypes in humans have the same configuration at the 41 polymorphic sites as has the chimpanzee; that closest to this configuration is haplotype 9, found exclusively in Africans, which differs at 3 polymorphic sites and has an additional 71 fixed differences from the chimpanzee sequence.
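The expectation quoted above for high-frequency derived alleles can be checked with a short calculation. The sketch assumes the standard neutral, constant-size, infinite-sites model, under which the expected number of sites with a derived allele present in i of n chromosomes is proportional to 1/i; the function name is hypothetical, and the authors may have used a different or more conservative calculation to arrive at their bound.

```python
def expected_fraction_above(n, threshold=0.5):
    """Expected fraction of derived alleles whose sample frequency exceeds
    `threshold`, given neutral site-frequency weights proportional to 1/i."""
    weights = {i: 1.0 / i for i in range(1, n)}
    high = sum(w for i, w in weights.items() if i / n > threshold)
    return high / sum(weights.values())


# 60 chromosomes and 42 SNP mutations, as quoted in the text.
fraction = expected_fraction_above(60)
expected_count = 42 * fraction
```

Under these assumptions roughly 14% of derived alleles, or about 6 of 42, are expected to exceed 50% frequency, so observing three such alleles is not unusual and is consistent with the "at least 3" stated in the text.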
Because the C282Y hemochromatosis mutation was not observed in the random population samples, we sequenced two Caucasians known to be C282Y homozygotes and observed that this mutation occurs on the haplotype 3 background. Haplotype 3 is found in all three continental populations and is the second most common haplotype in Europeans. No additional polymorphisms were discovered by sequencing the complete 11,214 bp of these two homozygotes (four chromosomes), consistent with the previous conclusion (e.g., ![]()
Number of haplotypes:
FU's (1997) Fs statistic compares the observed number of haplotypes in a sample to the number expected assuming an infinite-sites model of mutation under neutrality and no recombination and is useful for detecting population growth or hitchhiking. In each population as well as the pooled sample, no excess haplotype variation is observed at HFE, as seen by nonsignificant Fs values for all populations (see Table 1). In fact, in each case Fs is positive and therefore indicates a deficit of haplotypes given the observed level of nucleotide diversity. This finding is surprising, as both recurrent mutation and recombination have likely affected the samples, and both are expected to produce additional haplotypes. The deficit of haplotypes in this case suggests that either recombination and recurrent mutation have not increased the haplotype diversity of these samples greatly or other forces have kept the number of observed haplotypes low. However, STROBECK's statistic (1987), which tests for the opposite pattern of observing too few haplotypes, is also not significant for any population (P > 0.05). Again, no evidence for population expansion is apparent based on haplotype diversity in each population.
Haplotype network:
Fig 2 displays an RM network of haplotypes constructed from the pooled samples. Most haplotypes are unique to one particular population, since the relatively small sample size of each population reduces the chance of including rare haplotypes in each population sample. But other features, such as a branch that leads to four haplotypes found exclusively in Africans and representing 40% of all African samples, suggest the population differentiation seen in this sample may be real. The branch leading to the chimpanzee haplotype (Fig 2, arrow) contains the fixed differences as well as site 11,204, which is polymorphic in humans. The presence of recombinant haplotypes complicates the inference of haplotype relationships and results in mutations that have occurred only once in the history of a sample to be displayed multiple times in the network. Of the 43 mutations inferred from the human sample (including the indel), 10 are found twice on the network. Excluding the loop, 5 mutations show up twice on the network, indicating either recurrent mutations or recombination events. The chance of recurrent mutation is appreciable, since 144 CpG sites are found in the region studied and two nucleotide sites segregate three alleles each. In fact, CpG sites have a mutation rate that is estimated from our data to be >15 times higher than that of other nucleotide sites, similar to results for the LPL gene (![]()
Population subdivision:
To investigate population differentiation at HFE, we describe the unequal distribution of variation among populations. In the sample, 19 (44%) derived SNP alleles were found in all three populations, while 11 were restricted to African populations, 5 to Asian populations, and 4 to European populations (a total of 20 or 47% restricted to one population). Also, European and Asian populations shared 4 SNP alleles that were not found in Africa, while Africans did not share any SNP alleles with only Europeans or Asians. This supports the shared ancestry of the Asian and European samples after their split from African populations, which is also apparent since Europeans and Africans share 1 haplotype in common and Europeans and Asians share 2. When alleles are grouped together as haplotypes, each haplotype is expected to have a more limited distribution. We see 14 haplotypes (74%) unique to single populations (7 in Africa, 3 in Asia, and 4 in Europe) while all three populations share only 2 haplotypes.
WRIGHT's (1931) FST statistic serves to quantify population differentiation by expressing the genetic variance among populations divided by the genetic variance of the total population. Using an AMOVA analysis based on polymorphic sites and in which our samples were split into three continental groups (![]()
Recombination and linkage disequilibrium:
Since we have unambiguously determined the haplotypes at HFE, we are more confident in making conclusions about the evidence for recombination and LD in the region. When four gametic combinations are found in a sample of two-site haplotypes, this is an indication of recombination (crossing over between the two sites or gene conversion) or repeated mutation. Even with moderate mutation rate heterogeneity, one can assume the probability of repeated mutation at any one site is very small since this probability is no greater than that of a single mutation at the same site. A moderate level of recombination is consistent with the results of this four-gamete test, in which 25/528 site pairs have all four gametes found in the pooled sample (Table 4); excluding the 8 singleton sites leaves 33 sites and therefore 33 × 32/2 = 528 pairs. A minimum number of recombination events (RM) can be inferred from the data to explain all instances of four gametes.
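The four-gamete test described above can be sketched as follows. The sketch assumes phased haplotypes given as equal-length strings, considers only diallelic sites, and drops sites whose minor allele occurs fewer than twice (singletons cannot produce four gametes). The function name and the toy data are hypothetical; this counts incompatible site pairs and does not compute RM itself.

```python
from itertools import combinations


def four_gamete_pairs(haplotypes, min_minor_count=2):
    """Return (pairs showing all four gametes, number of pairs tested)."""
    columns = list(zip(*haplotypes))
    informative = [
        j for j, col in enumerate(columns)
        if len(set(col)) == 2
        and min(col.count(allele) for allele in set(col)) >= min_minor_count
    ]
    violating = [
        (i, j) for i, j in combinations(informative, 2)
        if len(set(zip(columns[i], columns[j]))) == 4
    ]
    tested = len(informative) * (len(informative) - 1) // 2
    return violating, tested


# Hypothetical toy sample: sites 0 and 2 are perfectly associated,
# while site 1 shows all four gametes with each of them.
toy = ["AAC", "ATC", "GAG", "GTG", "AAC", "GTG"]
violating, tested = four_gamete_pairs(toy)
```

With 33 informative sites the same function would test 33 × 32/2 = 528 pairs, the denominator reported in the text.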
Since RM gives only a lower bound on the amount of recombination, we can use other methods to estimate the population recombination parameter C (= 4Nr, where N is the effective population size and r is the per-locus recombination rate per generation). Estimators of C that use patterns of sequence variation can avoid inaccuracies due to local variation in recombination rates found in estimates based on observed crossing-over events between distant markers. CHRM, which uses the observed number of haplotypes and RM from data to estimate C under a model without gene conversion, performs well against other estimators of C (![]()
5 for HFE in different populations, although it is 0 in Africans, due to their lack of four-gamete site pairs (Table 1). ![]()
CL, which performs better than another commonly used estimator of C (![]()
CL estimate is not much greater than that of
CL = 0, indicating little evidence for recombination and consistent with the CHRM value. This result is unusual, as the study of ![]()
A moderate level of LD is observed throughout the region, consistent with the estimates of the population recombination parameter. Due to the low frequency of most polymorphisms and our modest sample size, most pairwise LD comparisons (Table 4) do not have the power to detect significant LD at the 0.001 level (a value close to a global 0.05 level when multiple tests are considered). The pooled sample has the most power to detect LD, but pooling can also cause spurious LD due only to allele frequency differences between populations. D' values for all alleles that are found at least twice in each of a pair of populations are highly correlated between populations (African vs. European, r = 0.999; African vs. Asian, r = 0.888; Asian vs. European, r = 0.718), with no cases of significant LD in opposite directions for pairs of populations. The high correlations are not surprising since most pairs of alleles are in complete LD in each population.
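For readers less familiar with the D' measure used here, the following sketch computes it for one pair of diallelic sites from phased haplotypes, using the usual definition: the disequilibrium coefficient D divided by its maximum possible magnitude given the allele frequencies. The function name, argument names, and toy samples are hypothetical and do not reproduce the authors' software.

```python
def d_prime(haplotypes, site_i, site_j, allele_i, allele_j):
    """Lewontin's D' between allele_i at site_i and allele_j at site_j."""
    n = len(haplotypes)
    p_a = sum(h[site_i] == allele_i for h in haplotypes) / n
    p_b = sum(h[site_j] == allele_j for h in haplotypes) / n
    p_ab = sum(h[site_i] == allele_i and h[site_j] == allele_j
               for h in haplotypes) / n
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when either site is monomorphic")
    return d / d_max


# Hypothetical samples: only three of the four gametes are present in the
# first, so LD is complete; all four are equally common in the second.
complete = d_prime(["AC", "AC", "AT", "GT"], 0, 1, "A", "C")
none = d_prime(["AC", "AT", "GC", "GT"], 0, 1, "A", "C")
```

Because |D'| equals 1 whenever fewer than four gametes are observed, a sample with few four-gamete site pairs will show complete LD for most pairs of alleles, as noted in the paragraph above.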
Fig 3 shows the location of site pairs in significant LD for the European population. LD appears evenly distributed throughout the region, with a minor concentration of significant LD at the 5' end of the gene, particularly among two sites in the first intron and two sites in the 5' flanking region, which have a perfect association (sites 9013, 10,047, 10,701, and 11,204). Plots from the other populations (not shown) reveal a similar, even distribution of LD, providing no evidence of a recombination hotspot within this region. The rate of decay of LD with distance varies depending on the measure of LD used and the minimum frequency of variants included in the analysis. We have plotted |D'| from the pooled population vs. distance for variants at 25% frequency or greater (Fig 4). This plot indicates that LD falls to one-half of its maximum value at
~6 kb, a rate of LD decline that lies between that of the ACE and LPL regions (Fig 1 in ![]()
| DISCUSSION |
|---|
We have performed a full resequencing survey of nucleotide variation at the HFE locus using nonclinical samples from three major human population groups. The benefit of including noncoding variation in the study is that these polymorphisms are often more numerous and found at a broader range of frequencies, making them more informative in resolving which evolutionary forces have affected patterns of variation and governed the fate of alleles that alter the amount or function of a protein. Another significant aspect of our study is the experimental determination of haplotypes for an autosomal gene. These haplotypes provide a level of resolution greater than that of SNPs alone when drawing conclusions from genomic variability.
Summary of HFE variation:
The results of any survey of population variation can be roughly separated into three categories: the level of variation, the frequency spectrum of that variation, and the haplotype structure of the variants. These categories are not independent, as they all reflect the underlying population history of a sequence of DNA, but they do capture different aspects of the data. Before this study, we had little information about the level of nucleotide variation at HFE. Protein polymorphisms seemed few and rare, but this could be due to high conservation imposed by the protein function and the slightly deleterious nature of most amino acid substitutions. The HFE gene is thought to lie in or near a region of low recombination, as ![]()
Test statistics reveal no major deviations from the equilibrium neutral expectation in the frequency spectrum of SNP alleles, although when the haplotype structure is considered, the Asian population shows an unusual pattern that is not seen in Africans or Europeans and may be consistent with a founder effect or hitchhiking. Thus, HFE provides no evidence for the long-term growth of the human population. However, on the basis of their study, ![]()
HFE haplotype structure reveals some evidence for recombination, although fewer haplotypes than expected are observed on the basis of the number of variants. ![]()
Our study detects the second most common HFE variant, H63D, at a moderate frequency in Europeans. Other reports have found this variant outside of Europe at a frequency consistent with gene flow from the Mediterranean region. We find this mutation on two different haplotypes, consistent with the conclusion that this allele has an origin much older than that of C282Y. The fact that HFE variation fits an equilibrium neutral model does not conclusively resolve the question of the history and potential fitness effects of HFE amino acid polymorphisms. Amino acid polymorphisms represent only a small proportion of HFE variation. Additionally, if the C282Y allele has been the target of positive selection, it is still far from fixation in any population, and therefore we expect the signature of this selection will be very subtle. Diversifying selection rapidly changing the frequency of alleles at linked HLA loci could have affected the frequency of several alleles at the HFE locus. But after investigating the pattern of HFE haplotype diversity, only haplotype 1 in the Asian populations (discussed below) indicates the lack of variation relative to the frequency of an allele class expected under this type of scenario, providing little evidence that the C282Y allele has increased in frequency due to this effect. An obvious conclusion based on the identity between the most common human HFE variant and the sampled chimpanzee protein is that no directional selection has been acting on the protein over the past few million years, although this does not exclude the possibility of much more recent or weaker selection on polymorphism.
The effect of sampling strategy:
The structure of our population samples represents a compromise between sampling intensely from a few populations and broadly surveying many populations. For a number of analyses, samples are pooled on the basis of their continental origin as is frequently done for humans. This can have an effect on results when significant genetic subdivision exists among populations within a continent. Many genetic surveys are consistent with subdivision stronger in Africans than in Asians or Europeans (e.g., ![]()
Distribution of variation among continents:
Relative to other species, the amount of genetic diversity that exists between different human populations is low. ![]()
The comparison of variability in different populations can provide evidence for local adaptive evolution but can be complicated by population history. The greater variability of populations from sub-Saharan Africa is seen in almost every study, and African samples have been contrasted to non-African samples to reveal differences in their population history. In a recent study of >300 genes (![]()
When their haplotype structure is considered, the African samples are unusual. Although they do have the most haplotypes, this set of haplotypes is consistent with no recombination. Non-African populations show evidence of recombination, and this indicates that LD would decay over distance slower in the African population than in the other populations. Just the opposite is observed in other studies (![]()
The Asian samples provide a more striking contrast to the other populations. They have the lowest variation, due to both fewer observed haplotypes and the low frequency of many SNPs. Over 60% of SNP alleles in Asians are rare (at 10% frequency or lower). Remarkably, when the direction of mutation is inferred from the chimpanzee sequence, nearly 30% of Asian-derived SNP alleles are at a frequency >60%. This proportion of high-frequency-derived alleles could result from a past hitchhiking event, although the H statistic does not reach significance. The low number of haplotypes observed in Asians and the predominance of haplotype 1 (rarely found outside of Asia) suggest that this haplotype has risen to a high frequency rapidly since the split of the Asian populations from the other populations studied. The large number of low-frequency variants could mean that, had a selective sweep occurred, it happened long enough ago that many new segregating sites have since arisen and thus decreased the power of the H test (![]()
The high frequency of haplotype 1 reveals the advantage of haplotype data over SNP sharing when assessing the similarity of populations. Haplotype 1 can explain a great deal of the population subdivision seen at HFE and can provide clues to how the interaction of selection with population history may have differed for individual populations. Haplotype 1 either originated within Asia and rose to a high frequency, spreading into Europe at low frequency, or occurred in a more ancient population but became prevalent in Asian lineages only through some sort of founder effect due to a bottleneck or hitchhiking. A broader sampling of populations could help resolve the history of this haplotype. Although not as informative as complete haplotype data, one SNP allele unique to haplotype 1 in our sample, SNP 4600G, has been genotyped in other populations. ![]()
Conclusions:
The discovery of these additional polymorphisms in the HFE region and the haplotypes they create may help identify other alleles that have an effect on the iron regulation phenotype. Several polymorphisms described in this study are found in the 3' UTR of the messenger RNA and could conceivably affect mRNA stability or levels of protein translation. Regulatory regions that affect levels of transcription can be found in introns or flanking a gene, where most of our polymorphisms are found. Also, tightly linked regulatory polymorphisms could be in disequilibrium with our observed haplotypes. The effects of these polymorphisms in regulatory regions are likely to be quite subtle, but they could help explain the finding of hemochromatosis in individuals that carry only one copy of a hemochromatosis-associated allele.
Finally, knowing more about the levels of variation and recombination at HFE will help evaluate the peculiar pattern of high frequency and young age estimated for the C282Y allele. Evidence for intragenic recombination at HFE has been sparse. Our estimates of local recombination rates based on sequence variation allow a reevaluation of the high LD around the C282Y allele, providing support for its young age. Although this pattern seems consistent with a selective advantage for C282Y, until an appropriate population genetic test for positive selection is performed, alternative causes of the pattern such as drift or a population bottleneck or growth cannot be ruled out. We have begun to use the polymorphisms and haplotypes reported here to study the extent of LD between HFE alleles besides C282Y and markers in the several megabases around HFE. By analyzing LD found around other HFE alleles, as well as around alleles produced by neutral coalescent simulations for different demographic parameters, we can evaluate the claim of positive selection on the C282Y allele relative to other possible explanations of the allele's high frequency given its young apparent age.
| FOOTNOTES |
|---|
The Pan troglodytes nucleotide sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession no. AF447807.
| ACKNOWLEDGMENTS |
|---|
We thank all those who agreed to donate DNA for this study and Ami Rice for help collecting samples. Additionally, we thank R. Ajioka and L. Jorde for providing European DNA samples from hemochromatosis pedigrees; D. Ledbetter for the chimpanzee sample; J. Fay and R. Hudson for providing computer programs; E. Stahl, M. Fullerton, and members of A. Di Rienzo's laboratory for helpful discussions; J. Comeron for help with the analysis of divergence from chimp and mouse; and J. Comeron, A. Di Rienzo, K. Dyer, M. Hamblin, and two anonymous reviewers for helpful comments on the manuscript. This work was supported by National Institutes of Health grant GM39355 to M.K. and a National Science Foundation Doctoral Dissertation Improvement Grant (DEB-0073297) to C.T. and M.K. C.T. was partially supported by a Howard Hughes Medical Institute predoctoral fellowship and by National Institutes of Health training grant T32 GM07197 (genetics and regulation).
Manuscript received January 2, 2002; Accepted for publication May 3, 2002.
| APPENDIX |
|---|
| LITERATURE CITED |
|---|
AJIOKA, R. S., L. B. JORDE, J. R. GRUEN, P. YU, and D. DIMITROVA et al., 1997 Haplotype analysis of hemochromatosis: evaluation of different linkage-disequilibrium approaches and evolution of disease chromosomes. Am. J. Hum. Genet. 60:1439-1447.
ARDLIE, K., S. N. LIU-CORDERO, M. A. EBERLE, M. DALY, and J. BARRETT et al., 2001 Lower-than-expected linkage disequilibrium between tightly linked markers in humans suggests a role for gene conversion. Am. J. Hum. Genet. 69:582-589.
BANDELT, H. J., P. FORSTER, B. C. SYKES, and M. B. RICHARDS, 1995 Mitochondrial portraits of human populations using median networks. Genetics 141:743-753.
BARTON, J. C., R. SAWADA-HIRAI, B. E. ROTHENBERG, and R. T. ACTON, 1999 Two novel missense mutations of the HFE gene (I105T and G93R) and identification of the S65C mutation in Alabama hemochromatosis probands. Blood Cells Mol. Dis. 25:147-155.
BEUTLER, E. and T. GELBART, 2000 A common intron 3 mutation (IVS3 -48c→g) leads to misdiagnosis of the c.845G→A (C282Y) HFE gene mutation. Blood Cells Mol. Dis. 26:229-233.
BEUTLER, E. and C. WEST, 1997 New diallelic markers in the HLA region of chromosome 6. Blood Cells



