Likelihood Models of Somatic Mutation and Codon Substitution in Cancer Genes
Ziheng Yang (b), Simon Ro (a), and Bruce Rannala (a)
(a) Department of Medical Genetics, University of Alberta, Edmonton, Alberta T6G 2H7, Canada
(b) Department of Biology, Galton Laboratory, University College London, London WC1E 6BT, England
Corresponding author: Bruce Rannala, 8-39 Medical Sciences Bldg., University of Alberta, Edmonton, AB T6G 2H7, Canada. E-mail: brannala{at}ualberta.ca
Communicating editor: S. TAVARÉ
| ABSTRACT |
|---|
The role of somatic mutation in cancer is well established and several genes have been identified that are frequent targets. This has enabled large-scale screening studies of the spectrum of somatic mutations in cancers of particular organs. Cancer gene mutation databases compile the results of many studies and can provide insight into the importance of specific amino acid sequences and functional domains in cancer, as well as elucidate aspects of the mutation process. Past studies of the spectrum of cancer mutations (in particular genes) have examined overall frequencies of mutation (at specific nucleotides) and of missense, nonsense, and silent substitution (at specific codons) both in the sequence as a whole and in a specific functional domain. Existing methods ignore features of the genetic code that allow some codons to mutate to missense, or stop, codons more readily than others (i.e., by one nucleotide change, vs. two or three). A new codon-based method to estimate the relative rate of substitution (fixation of a somatic mutation in a cancer cell lineage) of nonsense vs. missense mutations in different functional domains and in different tumor tissues is presented. Models that account for several potential influences on rates of somatic mutation and substitution in cancer progenitor cells and allow biases of mutation rates for particular dinucleotide sequences (CGs and dipyrimidines), transition vs. transversion bias, and variable rates of silent substitution across functional domains (useful in detecting investigator sampling bias) are considered. Likelihood-ratio tests are used to choose among models, using cancer gene mutation data. The method is applied to analyze published data on the spectrum of p53 mutations in cancers. 
A novel finding is that the ratio of the probability of nonsense to missense substitution is much lower in the DNA-binding and transactivation domains (ratios near 1) than in structural domains such as the linker, tetramerization (oligomerization), and proline-rich domains (ratios exceeding 100 in some tissues), implying that the specific amino acid sequence may be less critical in structural domains (e.g., amino acid changes less often lead to cancer). The transition vs. transversion bias and effect of CpG dinucleotides on mutation rates in p53 varied greatly across cancers of different organs, likely reflecting effects of different endogenous and exogenous factors influencing mutation in specific organs.
In the last two decades, many genes that display a tendency to undergo somatic mutation in various cancers have been identified. As a result, the connection between somatic gene mutation and cancer initiation and progression is now much better understood.
Genes whose normal role is to prevent damaged cells from escaping regulation and that undergo inactivation of both alleles during tumor development are called tumor suppressor genes. The loss of function of a tumor suppressor gene is typically caused by a nonsense or missense mutation, a chromosomal deletion, methylation, etc. The role of the normal tumor suppressor gene in noncancerous cells has, in several cases, become clearer following its discovery: p53 arrests the cell cycle at G1 phase, or induces apoptosis, in cells with damaged DNA.
The availability of large databases of tumor mutations has enabled cancer biologists to compare frequencies of mutations in different functional domains of a gene and in different tissues. Such studies can potentially clarify the role of these domains in gene function as it relates to cancer. Moreover, the existence of such databases has stimulated a search for mutational hotspots that may be caused by features of the primary sequence (e.g., CpG dinucleotides) that make a region more susceptible to mutation. Comparative studies of homologous genes across species revealed that highly conserved regions (i.e., regions under strong negative selection) coincide with mutational hotspots in the mutational database.
Yet another approach for studying cancer mutations examines the frequencies of germline mutations in a population, testing the fit of alternative population genetic models assuming either neutral evolution or positive or negative selection.
Features of the mutation process should be reasonably well described by a nucleotide-based approach; biases in rates of substitution at particular dinucleotides, for example, can be indicative of exogenous vs. endogenous mutagens. One might also expect the mutational spectrum to differ among tumors from different tissues because some organs, such as skin or lung, may be exposed to exogenous (environmental) mutagens (e.g., UV light and tobacco smoke) more heavily than others such as brain.
Studies that have examined the spectrum of codon substitutions in cancer gene databases, such as the p53 database, have generally used the simple approach of counting the frequencies of missense, nonsense, or silent substitutions observed at a site.
Modeling rates of somatic codon substitution in tumor development over an individual's lifespan is in many ways similar to modeling rates of germline codon substitution among species over evolutionary time. The problem, in that context, is to estimate relative rates of missense vs. silent substitution among sites in a comparative analysis of genes from different species.
| METHODS |
|---|
Let Y = {Yl} be the codon sequence of the normal gene, where Yl is the codon at site l as determined from the reference sequence and l ranges from 1 to L, where L is the total number of codons in the gene. Let X = {Xij}, where Xij is the number of sampled tumor gene sequences with a single-nucleotide substitution replacing normal codon i with mutant codon j (for all j ≠ i). Note that i, j, and Yl are each one of 64 possible distinct codons, with the constraint that one nucleotide difference separates i and j. For example, j = {AAG} and i = {ATG}. The codon substitution process acting on a cancer gene in the somatic cells of an individual that will ultimately develop a tumor is modeled as a continuous-time Markov process. The instantaneous substitution rate of this process will depend on many factors such as the rate of mutation to different nucleotides, the selective advantage to tumorigenesis (in promoting cell division, etc.) of cells carrying particular mutant forms of the gene, and so on. In this article, we consider several simple models that incorporate some of these influences. It is shown that the details of the demographic process of cancer cell proliferation can be ignored if we condition on a single-nucleotide substitution having occurred in a given cancer cell lineage. This assumption is satisfied for most of the tumors in the p53 database that we use to illustrate the method.
Constant rates model:
To model nucleotide mutation we initially use two parameters: the average rate of mutation per site, µ, and the ratio of transitions to transversions, κ. To model codon substitution, we use a model with three parameters, βS, βM, and βN, where these are the probabilities that a newly arisen synonymous, missense, or nonsense mutation, respectively, ultimately becomes fixed in a tumor lineage. We refer to this as model M0, or the constant rate (CR) model because it assumes that the same mutation and substitution rates apply across all functional domains of a gene and across all primary tumor tissue types.
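The ingredients of the CR model can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `classify`, `substitution_rate`, `kappa`, and `beta` are hypothetical, the standard genetic code is assumed, and the transition/transversion scaling follows the worked TCG to TTG example given in the text (a transition receives µκ/(2 + κ), each transversion µ/(2 + κ)).

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; codons run TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def classify(src, dst):
    """Classify a single-nucleotide codon change.

    Returns (kind, is_transition), where kind is "synonymous",
    "missense", or "nonsense".
    """
    diffs = [(a, b) for a, b in zip(src, dst) if a != b]
    if len(diffs) != 1:
        raise ValueError("codons must differ at exactly one position")
    a, b = diffs[0]
    # A transition stays within purines (A<->G) or within pyrimidines (C<->T).
    is_transition = (a in PURINES) == (b in PURINES)
    if CODE[src] == CODE[dst]:
        kind = "synonymous"
    elif CODE[dst] == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return kind, is_transition


def substitution_rate(src, dst, mu, kappa, beta):
    """Rate q_ij = (nucleotide mutation rate) x (fixation probability).

    beta is a dict with keys "synonymous", "missense", "nonsense"
    (an assumed container for the three fixation probabilities).
    """
    kind, is_transition = classify(src, dst)
    mutation = mu * (kappa if is_transition else 1.0) / (2.0 + kappa)
    return mutation * beta[kind]
```

For instance, `classify("TCG", "TTG")` reports a missense transition, so `substitution_rate` returns µκ/(2 + κ) times the missense fixation probability.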
Let Q = {qij}, where qij is the instantaneous rate of substitution from codon i to codon j, $q_i = \sum_{j \ne i} q_{ij}$, and qii = -qi. The off-diagonal elements of Q are products of the instantaneous nucleotide mutation rates and codon fixation probability. For example, if i = {TCG} and j = {TTG}, then qij = µκ/(2 + κ) × βM. Define $m_j = \sum_{l} I(Y_l, j)$, where I(Yl, j) equals 1 if Yl = j and 0 otherwise (i.e., mj is the number of codons in the normal sequence that are of type j). It is assumed that each codon undergoes an independent substitution process. A Markov process can be uniquely characterized as a sojourn process: the waiting time, t, until a substitution at a site in state i is exponentially distributed with rate qi, with density
$$ f(t) = q_i e^{-q_i t}. \tag{1} $$
If a substitution event occurs, and the initial state is i, it is a substitution to state j with probability qij/qi. The waiting time, ti, until the first substitution at any site bearing codon i in the normal sequence is then the smallest order statistic of mi iid exponential random variables with common parameter qi. The pdf of ti is
$$ f(t_i) = m_i q_i e^{-m_i q_i t_i}. \tag{2} $$
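The density function is the same for sites bearing any other codon l ≠ i in the normal sequence provided that qi and mi are replaced by ql and ml. The first substitution occurs at a site with codon i in the normal sequence if ti < tj for all j ≠ i. The joint density of ti and the event ti < tj for all j ≠ i is
$$ f(t_i,\; t_i < t_j \;\forall j \ne i) = m_i q_i e^{-m_i q_i t_i} \prod_{j \ne i} e^{-m_j q_j t_i} = m_i q_i \exp\Bigl(-t_i \sum_{k} m_k q_k\Bigr). \tag{3} $$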
The marginal probability that ti < tj, averaged over all possible values of ti, is
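$$ \Pr(t_i < t_j \;\forall j \ne i) = \int_0^\infty m_i q_i \exp\Bigl(-t \sum_{k} m_k q_k\Bigr)\, dt = \frac{m_i q_i}{\sum_{k} m_k q_k}. \tag{4} $$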
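The probability that the first substitution falls on a site bearing codon i, namely m_i q_i divided by the sum of m_k q_k over all codon classes, can be checked with a small simulation. The sketch below is illustrative only: the function names and the toy values of `m` and `q` are assumptions, and the three "codon classes" are labeled A, B, and C for brevity.

```python
import random


def first_class_probability(m, q):
    """Closed-form probability that each codon class mutates first."""
    total = sum(m[c] * q[c] for c in m)
    return {c: m[c] * q[c] / total for c in m}


def simulate_first_class(m, q, trials=20000, seed=1):
    """Monte Carlo estimate of the same probabilities."""
    rng = random.Random(seed)
    wins = dict.fromkeys(m, 0)
    for _ in range(trials):
        # Waiting time for a class is the minimum of m[c] iid
        # exponential variables with common rate q[c].
        times = {c: min(rng.expovariate(q[c]) for _ in range(m[c])) for c in m}
        wins[min(times, key=times.get)] += 1
    return {c: wins[c] / trials for c in m}


# Toy example (hypothetical values): counts of sites and total rates per class.
m = {"A": 3, "B": 1, "C": 2}
q = {"A": 0.5, "B": 2.0, "C": 1.0}
exact = first_class_probability(m, q)
approx = simulate_first_class(m, q)
```

With enough trials the simulated frequencies in `approx` should approach the closed-form values in `exact`.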
If it is assumed that no more than one codon substitution has occurred in a gene in the development of a particular tumor lineage, then the probability of a change from codon i to j is the probability that a substitution occurs at a site with codon i in the normal sequence (given by Equation 4 above) multiplied by the probability of a transition from i to j, given that a substitution has occurred, which is qij/qi as noted above. Thus, the probability that one substitution occurs from i to j is
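$$ \Pr(i \to j) = \frac{m_i q_i}{\sum_{k} m_k q_k} \times \frac{q_{ij}}{q_i} = \frac{m_i q_{ij}}{\sum_{k} m_k q_k}. \tag{5} $$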
Because qij and all of the total rates qi are linear functions of µ, the mutation rate cancels out. The remaining parameters, κ and β = {βS, βM, βN}, can be estimated from the data using maximum likelihood. The substitution probabilities always occur in ratios in Equation 5, so one of these parameters is not identifiable. We instead estimate the three identifiable parameters ωN = βN/βS, ωM = βM/βS, and κ. The likelihood function is
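$$ L(\omega_N, \omega_M, \kappa) = \prod_{i} \prod_{j \ne i} \left( \frac{m_i q_{ij}}{\sum_{k} m_k q_k} \right)^{X_{ij}}. \tag{6} $$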
We used numerical methods to maximize the log-likelihood function (log L) with respect to these parameters, where log L is
$$ \log L = \sum_{i} \sum_{j \ne i} X_{ij} \log \frac{m_i q_{ij}}{\sum_{k} m_k q_k}. \tag{7} $$
The CR model assumes that rates of nucleotide mutation and codon substitution are identical across nucleotides over the entire coding region of the gene.
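As a rough illustration of how the CR log-likelihood could be evaluated, the following Python sketch computes the substitution probabilities m_i q_ij / Σ m_k q_k and the resulting log-likelihood for a toy data set. It is not the oncSpectrum program: all function names, the three-codon "gene," and the mutation counts are hypothetical, the standard genetic code is assumed, the normal sequence is assumed to contain only sense codons, and rates are expressed relative to µ and the synonymous fixation probability (both of which cancel).

```python
import math
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code; codons run TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def neighbors(codon):
    """Yield the nine codons that differ from `codon` at one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def relative_rate(src, dst, kappa, omega_m, omega_n):
    """q_ij with mu and the synonymous fixation probability divided out."""
    a, b = next((x, y) for x, y in zip(src, dst) if x != y)
    is_transition = (a in PURINES) == (b in PURINES)
    mutation = (kappa if is_transition else 1.0) / (2.0 + kappa)
    if CODE[dst] == CODE[src]:
        return mutation                      # synonymous
    return mutation * (omega_n if CODE[dst] == "*" else omega_m)


def substitution_probs(normal_codons, kappa, omega_m, omega_n):
    """Probability of each single substitution i -> j, given one occurred."""
    m = Counter(normal_codons)               # m[i] = sites bearing codon i
    weights = {(i, j): m[i] * relative_rate(i, j, kappa, omega_m, omega_n)
               for i in m for j in neighbors(i)}
    total = sum(weights.values())            # equals sum_k m_k q_k
    return {pair: w / total for pair, w in weights.items()}


def log_likelihood(normal_codons, counts, kappa, omega_m, omega_n):
    """Sum of X_ij * log Pr(i -> j) over observed substitutions."""
    probs = substitution_probs(normal_codons, kappa, omega_m, omega_n)
    return sum(x * math.log(probs[pair]) for pair, x in counts.items())


# Toy data (made up for illustration): a three-codon "gene" and tumor counts.
normal = ["TGG", "TCG", "CTG"]
counts = {("TGG", "TGA"): 5, ("TCG", "TTG"): 3, ("CTG", "CTA"): 1}
ll_equal = log_likelihood(normal, counts, kappa=2.0, omega_m=1.0, omega_n=1.0)
ll_nonsense = log_likelihood(normal, counts, kappa=2.0, omega_m=1.0, omega_n=5.0)
```

In this toy data set nonsense changes are overrepresented, so raising the nonsense ratio increases the log-likelihood; in practice the three parameters would be optimized jointly with a numerical optimizer.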
Variable rates models:
The CR model M0 presented above can be readily extended to develop a hierarchy of variable rates models; here we present several models that allow rates of substitution to vary across known functional domains of a tumor suppressor gene (or oncogene) and/or across tumors of different tissues. Moreover, models that allow mutation rates to be influenced by the primary nucleotide sequence, for example, to account for the well-known influence of CpG dinucleotides on mutation rates, are considered.
Model M1 allows the relative rates of missense and nonsense substitution to vary across functional domains. We define ωN = {ωN(i)} and ωM = {ωM(i)}, where ωN(i) is the ratio of nonsense to silent substitutions in the ith functional domain, etc. Model M1 retains a common transition/transversion bias, κ, and a common mutation rate, µ, across functional domains. If a gene has n functional domains, there are 2n + 1 parameters under this model because µ cannot be estimated from the data if we condition on a single substitution having occurred in each sampled tumor. Model M2 is similar, but allows the transition/transversion bias parameter, κ(i), to also vary across regions. We define κ = {κ(i)}, where κ(i) is the transition/transversion ratio for the ith functional domain. If a gene has n functional domains, there are 3n parameters under this model.
Model M3 assumes constant rates across functional domains but adds an additional parameter, ρCG, that is the relative rate of substitution at CG dinucleotides vs. non-CG sites. The dinucleotide model considers the substitution rate of a "quintet," which includes the nucleotide before the first codon position, the codon itself, and the nucleotide after the third codon position. If a mutation changes a quintet with no CpG into a quintet with CpG (for example, "T TCT A" changing into "T TCG A"), the substitution rate is divided by ρCG. If a mutation changes a quintet with a CpG into a quintet without, the substitution rate is multiplied by ρCG. If the source and target quintets either both lack or both contain CpG doublets, the rate is not changed. Model M4 allows ωM(i) and ωN(i) to vary across functional domains (as in M1) but adds an additional parameter, ρCG, that is assumed to be constant across domains. Model M5 extends model M4 by allowing transition/transversion ratios to vary across functional domains; model M5 is identical to M2 apart from the additional parameter ρCG. Model M6 adds n additional parameters to model M5, allowing the relative rate of silent substitution to vary across functional domains; we define ωS = {ωS(i)}, where ωS(i) is the silent substitution rate of functional domain i relative to functional domain 1, for all i ≠ 1. In addition, we add a parameter, ρPY, that is the relative substitution rate for nucleotides that occur as dipyrimidines vs. those that do not (using a quintet codon model of the same form as was used to model ρCG). Model M7 extends model M5 by adding the ρPY parameter. Model M8 is the most parameter-rich model we consider. This model extends M7 by allowing ωM, ωN, and ρCG to vary across primary tumors from different tissues, adding 2(H + 2)(n + 2) additional parameters, where H is the number of different primary tumor tissues stratified in the database.
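The quintet rule for CpG dinucleotides described above can be expressed as a small rate multiplier. The sketch below is a hypothetical illustration (the function name `cpg_multiplier` and the parameter name `rho_cg` are assumptions); it only encodes the three cases stated in the text and treats a quintet as a five-letter string, ignoring any spaces used for readability.

```python
def cpg_multiplier(src_quintet, dst_quintet, rho_cg):
    """Factor applied to the substitution rate under the CpG quintet rule.

    A quintet is the codon plus one flanking nucleotide on each side,
    e.g. "T TCT A"; spaces are ignored.
    """
    src = src_quintet.replace(" ", "")
    dst = dst_quintet.replace(" ", "")
    if len(src) != 5 or len(dst) != 5:
        raise ValueError("a quintet has exactly five nucleotides")
    src_has, dst_has = "CG" in src, "CG" in dst
    if src_has and not dst_has:
        return rho_cg            # mutation destroys a CpG: rate multiplied
    if dst_has and not src_has:
        return 1.0 / rho_cg      # mutation creates a CpG: rate divided
    return 1.0                   # both contain or both lack CpG: unchanged
```

For the example in the text, `cpg_multiplier("T TCT A", "T TCG A", rho_cg)` returns `1 / rho_cg`, and the reverse change returns `rho_cg`.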
| ANALYSIS |
|---|
The p53 tumor suppressor protein was originally identified in several independent studies in 1979, among them as a protein that interacts with SV40 virus large T antigen.
For our analysis, we used release 5 of the p53 database (http://www.iarc.fr/P53/). This database contained a total of 15,121 tumor entries as of July 1, 2001. The 11 exons of the p53 gene contain 1179 nucleotides coding for 393 amino acids. In total, 222 of the 393 codons have thus far been observed to be targets of mutation in cancer. Mutations include insertions or deletions of nucleotides (most often resulting in a frameshift), as well as point mutations. Because our models condition on a single-point mutation having occurred (in an exon), prior to our analysis we removed sequences from the database that contained insertions, deletions, mutations in introns, or multiple-point mutations. This reduced the total number of tumor entries used in our analysis to 12,759. There are six recognized functional domains in the p53 gene (see Fig 2) but the boundaries of the domains described in the literature often differ by several amino acids.
The transcriptional activation domain (residues 1–40) interacts with the basal transcriptional machinery (e.g., RNA polymerase and other transcription factors), activating transcription of its target genes; the proline-rich domain (residues 67–98) is involved in the binding of p53 to the nuclear matrix and may play a role in stimulating apoptosis in cells with irreversible DNA damage.
Model M8 partitions the parameter estimates according to primary tumor tissue type as well as functional domain. To carry out this analysis, we partitioned the data according to the source of the primary tumor as documented in the database. We combined mutations from samples obtained from both surgeries and established cell lines. Twelve primary cancers are each represented by >600 samples in the database and to maintain large sample sizes we chose to partition by these categories only. These 12 cancers accounted for 9886 of the single-point mutation entries; the remaining 2873 cancers in the database caused by a single-point mutation were too rare for separate analyses and were instead analyzed collectively in a composite category labeled as "other cancers." The 12 cancer categories are listed in Table 3 and Table 4.
| RESULTS |
|---|
The results of likelihood-ratio tests comparing all eight models are shown in Table 1. All of the increasingly complex models that we examined resulted in a significant improvement in the fit of the model to the p53 mutation data. The greatest improvements are obtained by partitioning rates according to functional domains and allowing higher rates of substitution at CG dinucleotides (see Fig 2 and Table 1). The most complex models considered (with or without constant rates across tumor tissues) are preferred over the remaining submodels for parameter estimation because all result in a significant improvement in the fit of the models to the data. We also used the Akaike information criterion (AIC; AKAIKE 1973) to compare models.
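The two model-comparison tools mentioned here can be sketched as follows. This is a generic illustration rather than the authors' code: the function names are assumptions, the log-likelihood values in the usage lines are made up, and the chi-square tail probability is computed with the standard recurrence for the regularized upper incomplete gamma function so that only the Python standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with df degrees of freedom.

    Uses Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1), with h = x / 2,
    starting from Q(1/2, h) = erfc(sqrt(h)) or Q(1, h) = exp(-h).
    """
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(h))
    else:
        a, q = 1.0, math.exp(-h)
    for _ in range((df - 1) // 2):
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q


def likelihood_ratio_test(lnl_null, lnl_alt, extra_params):
    """Return (statistic, p-value) for two nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, extra_params)


def aic(lnl, n_params):
    """Akaike information criterion; smaller values are preferred."""
    return 2.0 * n_params - 2.0 * lnl


# Hypothetical log-likelihoods for a simpler and a richer nested model.
stat, p_value = likelihood_ratio_test(lnl_null=-105.0, lnl_alt=-100.0, extra_params=2)
```

The chi-square approximation applies to nested models, with degrees of freedom equal to the number of additional free parameters; AIC can also rank models that are not nested.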
Table 2 shows the results for analyses of the p53 database using models M6 and M7. The only difference between these two models is that M7 allows the relative rate of silent substitution, ωS, to vary across domains, whereas model M6 assumes that it is constant. It is evident from the results of our analysis using model M7 (see bottom half of Table 2) that ωS varies considerably across domains. Most strikingly, the silent substitution rate is at least an order of magnitude higher for the DNA-binding domain vs. the others. Because silent substitutions (by definition) do not affect the amino acid sequence, the potential functional significance of such changes is limited. Possible effects of silent substitutions might be an increase, or reduction, of the rate of translation, for example, if the relative abundance of tRNAs specific for each alternative codon varies. Although such a mechanism is a reasonable explanation for codon usage bias within a gene as a whole, it is not a likely explanation for the variation we observe in silent rates of substitution among functional domains within the p53 gene.
Another possible effect of codon usage bias is on translational accuracy. Selection for translational accuracy might cause codon usage bias, and therefore silent substitution rates, to vary across functional domains. There is some evidence for such effects in Drosophila.
Under model M7, which allows the silent rate to vary among domains, the DNA-binding domain retains the highest rate of missense substitution but now has one of the lowest rates of nonsense substitution (Table 2). Also under M7, the oligomerization domain has a rate, ωN, that is roughly eight times higher than that of the DNA-binding domain. Moreover, under model M6 the estimated values of ωN and/or ωM for several domains are <1, implying that silent mutations are more likely to cause cancer than are missense or nonsense mutations, which is not reasonable. Under model M7 all relative substitution rates are >1. Another potential concern is that genes with multiple substitutions that violate our model assumptions will be ascertained into the sample because partial sequencing has revealed only one of the substitutions. If explicit information about the screening procedures used in each study were available, it might be possible to modify the model to correct for this potential source of bias.
A final concern is that "investigator sampling bias" may be enhanced by p53 germline polymorphisms in the general population. In our analysis, we treated the "reference" germline p53 sequence as fixed. In reality, p53 nucleotide polymorphisms exist in the human population that could influence whether a tumor is included in our analysis (e.g., has a single-nucleotide substitution) or excluded (e.g., has two, or more, nucleotide substitutions). More detailed models (and more detailed information) regarding the tumor sampling (and sequencing) process are needed to fully address such issues.
In contrast with the nonsense and missense rates relative to the silent rate, the nonsense/missense rate ratio is effectively independent of the investigator sampling bias. The variance of the estimated ωN/ωM ratio for each domain is influenced by investigator sampling bias (because this sampling bias reduces the sample size for some domains and not others) but the estimates are not biased by this effect (compare estimates of ωN/ωM between models M6 and M7 in Table 2). If we consider the ratio ωN/ωM, the DNA-binding domain displays a constant ratio of 1.5 under either model M6 or M7; this is dramatically lower than that for all other domains, apart from the transactivation domain (ratio of ~0.7). The largest ratio is observed for the oligomerization domain (ranging from 17.5 to 17.9, depending on which model is used).
The striking differences that we observe in the rates of nonsense vs. missense substitutions among domains have a direct biological interpretation: the structural regions (linkers and proline-rich and oligomerization domains) may be largely unaffected by missense mutations because the precise residues found in such regions are often unimportant for p53 function; the specific residues of the DNA-binding and transactivation domains, on the other hand, may have a more important effect on function, and missense or nonsense substitutions in these domains thus contribute nearly equally to tumor development.
The low estimated rates of missense substitutions for the transactivation and oligomerization domains are likely due to the nonspecific nature of those domains. Studies suggest that a single-point mutation in those domains is generally not able to completely abolish the protein function.
Estimates of ρCG suggest that the rate of mutation at CG dinucleotides is more than fourfold the rate at non-CG sites. Estimates of parameter ρPY, on the other hand, are close to one, indicating only a slight increase of the mutation rates at dipyrimidine sites. The estimated transition/transversion ratio, κ, varies from 1.8 to 4.4 under model M7, which is within the range of values observed in evolutionary studies. The values of κ are biased downward when variation in silent substitution rates is not accounted for (compare estimates of κ under models M6 and M7 in Table 2).
The results of our analyses using model M8, which allows parameters to vary across primary tumor types, as well as functional domains, are shown in Table 3 and Table 4. First, we consider the substitution process; there is considerable variation in ωN/ωM among tumors, but some domains show much greater variation than others. Results are shown in Table 3 for only four of the six domains because too few observations were available to reliably estimate ωN/ωM for the transactivation and regulatory domains using the partitioned datasets. The least variation of ωN/ωM across tumor tissues is observed for the DNA-binding domain, with the ratio varying from a low of 0.43 (in brain cancers) to a high of 2.7 (in skin cancers). The most variation of ωN/ωM across tumor tissues is observed for the linkers, with the ratio varying from a low of 1.9 (in stomach cancers) to a high of 109.3 (in rectal cancers). There are also some clear trends across tumor types: bladder and brain cancers appear to have the lowest average ωN/ωM ratio (averaged across domains) and lung, rectum, and skin cancers have the highest. These differences in substitution rates are very pronounced and it is likely that they are indicators of fundamental underlying differences in the biological role of p53 in cancer initiation and progression in these different tissues.
We also studied the mutation process in different tumor types by examining estimates of ρCG, ρPY, and κ (Table 3). Parameter ρCG varies widely among tumor types with brain, colon, stomach, and rectum having the highest values (ranging from 6.3 to 10.8) and bladder, liver, lung, and skin having the lowest values (ranging from 2.6 to 3.5). This is likely a reflection of the influence of exogenous vs. endogenous mutagenic influences in the different organs. Mutations in p53 from bladder, liver, lung, and skin may be more heavily influenced by exogenous factors, while mutations from brain, colon, stomach, and rectum may be most heavily influenced by endogenous factors such as primary sequence. The dipyrimidine mutation rate parameter, ρPY, is much less variable among primary tumor types and is quite close to 1 in most cases (ranging from a low of 0.32 in liver to a high of 1.4 in brain). This suggests that there is little difference in mutation rates as a consequence of a dipyrimidine in the primary sequence. At least one mechanism of dipyrimidine mutation, conversion of CC to TT by UV, is known.
Table 4 shows the variation of the transition/transversion rate ratio, κ, across tumor types and across functional domains. The average value of κ is highest for the DNA-binding domain and lowest for the transactivation domain. These results may be biased, however, because we have not corrected for investigator sampling bias (variation of ωS across domains) in this analysis. More reliable is the variation of the average κ values across primary tumor types. The highest average value of κ is observed for tumors of the brain, colon, and ovary. The lowest is observed for tumors of the bladder, liver, and lung. Once again, this is likely to reflect differences in exogenous vs. endogenous mutational influences: the most pervasive endogenous factor influencing rates of mutation is the presence of CpG sites; this increases the rates of transitions vs. transversions whereas many exogenous mutagens have the opposite effect.
| DISCUSSION |
|---|
Large-scale databases that compile the frequencies of somatic mutations at particular nucleotides of cancer genes from tumors are an important new resource for studying the role of somatic mutation in cancer development and progression. In this article, we have developed a general parametric framework aimed at modeling the spectrum of mutations in cancer genes and facilitating estimation of biologically relevant parameters. It is shown (by examining the p53 mutation database) that an important parameter to consider is the ratio of nonsense to missense substitution rates, ωN/ωM, in different functional domains and primary cancer types. A ratio close to 1 was observed for the DNA-binding domain, indicating that missense and nonsense mutations were about equally likely to produce cancer in this domain. The remaining domains, which are primarily involved in protein structure, displayed ratios >>1 (100-fold greater in some tumor types), indicating that these domains can tolerate a much higher level of missense mutation without producing cancer. A codon-based model, such as we have developed, is needed to extract this information because it depends critically on the probabilities that particular codons produce missense or nonsense changes. The overall frequency of missense mutations is much higher in all domains.
Another finding in our analysis of the p53 mutation database is that estimates of ρCG, the parameter that describes the effect of CG dinucleotides on mutation rates, vary greatly among primary cancer types; this likely reflects the differing importance of endogenous and exogenous factors on the mutational spectrum in these organs. Similarly, the ratio of the transition rate to the transversion rate, κ, varies dramatically across domains and across primary tumor types; this is also likely to reflect an underlying heterogeneity of the mutation process in different organs, at least partially due to differing environmental influences. Both the effects of environment (on the spectrum of mutations) and the influence of selection acting on cells carrying particular mutations (on the spectrum of substitutions) can be detected using our models. Selection acting during the substitution process likely accounts for much of the variation in substitution rates among domains, and endogenous and exogenous factors influencing the mutation process likely account for much of the variation among tumor types. It is important to try to tease apart the effects of these different influences.
By examining the relative rates of silent substitution, ωS, among domains we find strong evidence of variation consistent with investigator sampling bias.
Another intriguing possibility is that the boundaries of functional domains might be initially identified, or further refined, by examining the spectrum of somatic mutations. One could estimate the domain boundaries as part of a Bayesian or maximum-likelihood analysis. Our analyses suggest that it is very important to partition analyses according to both tumor type and functional domain. However, this greatly reduces the power of the analyses because there may be only a few hundred observations in each tumor type category vs. thousands of observations in the database as a whole. The general model that we have developed can be extended to account for additional complexities; given that all the models we considered provided a highly significant improvement in the fit of the model to the p53 mutation database, it is very likely that yet more complex models can be proposed that will further improve the fit to the data. Our models should be viewed as only an initial step toward the development of a realistic parametric framework for modeling the spectrum of mutations in cancer genes.
The program oncSpectrum, written in the C language, implements maximum-likelihood estimation of parameters for all the models described in this article. It is intended for use with data from a cancer mutation database such as the p53 database. The program can be downloaded from http://rannala.org.
| ACKNOWLEDGMENTS |
|---|
This research was supported by grants from the Alberta Heritage Foundation for Medical Research (AHFMR), the Peter Lougheed Foundation/Canadian Institutes of Health Research (MOP44064, PLS47851), and the National Institutes of Health (HG01988) to B. Rannala and by grants from the Biotechnology and Biological Sciences Research Council and Human Frontier Science Program to Z. Yang. AHFMR provided funds for Z. Yang to visit the University of Alberta in 2001.
Manuscript received March 27, 2002; Accepted for publication May 13, 2003.
| LITERATURE CITED |
|---|
AKAIKE, H., 1973 Information theory and an extension of the maximum likelihood principle, pp. 267–281 in 2nd International Symposium on Information Theory. Akademiai Kiado, Budapest.
AKASHI, H., 1994 Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136:927-935.
BALINT, E. and K. H. VOUSDEN, 2001 Activation and activities of the p53 tumor suppressor protein. Br. J. Cancer 85:1813-1823.
BENNETT, W. P., S. P. HUSSAIN, K. H. VAHAKANGAS, M. A. KHAN, and P. G. SHIELDS, 1999 Molecular epidemiology of human cancer risk: gene-environment interactions and p53 mutation spectrum in human lung cancer. J. Pathol. 187:8-18.
BISHOP, J. M., 1991 Molecular themes in oncogenesis. Cell 64:235-248.
BRASH, D. E., J. A. RUDOLPH, J. A. SIMON, A. LIN, and G. J. MCKENNA et al., 1991 A role for sunlight in skin cancer: UV-induced p53 mutations in squamous cell carcinoma. Proc. Natl. Acad. Sci. USA 88:10124-10128.
CHÉNE, P., 2001 The role of tetramerization in p53 function. Oncogene 20:2611-2617.[Medline]
COOPER, D. N. and H. YOUSSOUFIAN, 1988 The CpG dinucleotide and human genetic disease. Hum. Genet. 87:151-155.
DELEO, A. B., G. JAY, E. APPELLA, G. C. DUBOIS, and L. W. LAW et al., 1979 Detection of a transformation-related antigen in chemically induced sarcomas and other transformed cells of the mouse. Proc. Natl. Acad. Sci. USA 76:2420-2424.
HANAHAN, D. and R. A. WEINBERG, 2000 The hallmarks of cancer. Cell 100:57-70.[Medline]
HOLLSTEIN, M., K. RICE, M. S. GREENBLATT, T. SOUSSI, and R. FUCHS et al., 1994 Database of p53 gene somatic mutations in human tumors and cell lines. Nucleic Acids Res. 22:3551-3555.
HUSSAIN, S. P. and C. C. HARRIS, 1999 p53 mutation spectrum and load: the generation of hypotheses linking the exposure of endogenous or exogenous carcinogens to human cancer. Mutat. Res. 428:23-32.[Medline]
JEFFREY, P. D., S. GORINA, and N. P. PAVLETICH, 1995 Crystal structure of the tetramerization domain of the p53 tumor suppressor at 1.7 angstroms. Science 267:1498-1502.
JIANG, M., T. AXE, R. HOLGATE, C. P. RUBBI, and A. L. OKOROKOV et al., 2001 p53 binds the nuclear matrix in normal cells: binding involves the proline-rich domain of p53 and increases following genotoxic stress. Oncogene 20:5449-5458.[Medline]
LAIRD, P. W. and R. JAENISCH, 1996 The role of DNA methylation in cancer genetics and epigenetics. Annu. Rev. Genet. 30:441-464.[Medline]
LANE, D. P. and L. V. CRAWFORD, 1979 T antigen is bound to a host protein in sv40-transformed cells. Nature 278:261-263.[Medline]
LEVINE, A. J., 1997 p53, the cellular gatekeeper for growth and division. Cell 88:323-331.[Medline]
LEVINE, A. J., M. C. WU, A. CHANG, A. SILVER, and E. A. ATTIYEH et al., 1995 The spectrum of mutations at the p53 locus. Ann. NY Acad. Sci. 768:111-128.[Medline]
LIN, J., J. CHEN, B. ELENBAAS, and A. J. LEVINE, 1994 Several hydrophobic amino acids in the p53 amino-terminal domain are required for transcriptional activation, binding to mdm-2 and the adenovirus 5 E1B 55-kD protein. Genes Dev. 8:1235-1246.
LINZER, D. I. and A. J. LEVINE, 1979 Characterization of a 54K dalton cellular SV40 tumor antigen present in SV40-transformed cells and uninfected embryonal carcinoma cells. Cell 17:43-52.[Medline]
MAY, P. and E. MAY, 1999 Twenty years of p53 research: structural and functional aspects of the p53 protein. Oncogene 18:7621-7636.[Medline]
OZOREN, N., and W. S. EL-DEIRY, 2000 Introduction to cancer genes and growth control, pp. 343 in DNA Alterations in Cancer: Genetics and Epigenetic Changes, edited by M. EHRLICH. Eaton Pressing, Natick, MA.
PIETENPOL, J. A., T. TOKINO, S. THIAGALINGAM, W. S. EL-DEIRY, and K. W. KINZLER et al., 1994 Sequence-specific transcriptional activation is essential for growth suppression by p53. Proc. Natl. Acad. Sci. USA 91:1998-2002.
PONDER, B. A., 2001 Cancer genetics. Nature 411:336-341.[Medline]
RODIN, S. N. and A. S. RODIN, 2000 Human lung cancer and p53: the interplay between mutagenesis and selection. Proc. Natl. Acad. Sci. USA 97:12244-12249.
ROEMER, K., 1999 Mutant p53: gain-of-function oncoproteins and wild-type p53 inactivators. Biol. Chem. 380:879-887.[Medline]
SLATKIN, M. and B. RANNALA, 1997 The sampling distribution of disease-associated alleles. Genetics 147:1855-1861.[Abstract]
SOUSSI, T. and C. BEROUD, 2001 Assessing TP53 status in human tumours to evaluate clinical outcome. Nat. Rev. Cancer 1:233-240.[Medline]
SOUSSI, T., C. CARON DE FROMENTEL, and P. MAY, 1990 Structural aspects of the p53 protein in relation to gene evolution. Oncogene 5:945-952.[Medline]
TAYLOR, H. M., and S. KARLIN, 1984 An Introduction to Stochastic Modeling. Academic Press, New York.
VENKITARAMAN, A. R., 2002 Cancer susceptibility and the functions of BRCA1 and BRCA2. Cell 108:171-182.[Medline]
VOGELSTEIN, B., E. R. FEARON, S. R. HAMILTON, S. E. KERN, and A. C. PREISINGER et al., 1988 Genetic alterations during colorectal tumor development. N. Engl. J. Med. 319:525-532.[Abstract]
WALKER, D. R., J. P. BOND, R. E. TARONE, C. C. HARRIS, and W. MAKALOWSKI et al., 1999 Evolutionary conservation and somatic mutation hotspot maps of p53: correlation with p53 protein structural and functional features. Oncogene 18:211-218.[Medline]
WATERMAN, J. L., J. L. SHENK, and T. D. HALAZONETIS, 1995 The dihedral symmetry of the p53 tetramerization domain mandates a conformational switch upon DNA binding. EMBO J. 14:512-519.[Medline]
WEINBERG, R. A., 1995 The retinoblastoma protein and cell cycle control. Cell 81:323-330.[Medline]
YANG, Z., 2001 Adaptive molecular evolution, pp. 327348 in Handbook of Statistical Genetics, edited by D. J. BALDING, C. CANNINGS and M. BISHOP. John Wiley & Sons, New York.