Genetics. Published Articles Ahead of Print: June 18, 2008, Copyright © 2008
doi:10.1534/genetics.107.074450


A more recent version of this article appeared on July 1, 2008.


REGULAR RESEARCH PAPERS

Probabilistic cross-species inference of orthologous genomic regions created by whole-genome duplication in yeast

1 Trinity College

* To whom correspondence should be addressed. E-mail: khwolfe{at}tcd.ie.

Submitted on April 12, 2007
Revised on September 6, 2007
Accepted on April 21, 2008


Abstract

Identification of orthologous genes across species becomes challenging in the presence of a whole-genome duplication (WGD). We present a probabilistic method for identifying orthologs that considers all possible orthology/paralogy assignments for a set of genomes with a shared WGD (here, five yeast species). This approach allows us to estimate how confident we can be in the orthology assignments in each genomic region. Two inferences produced by this model are indicative of purifying selection acting to prevent duplicate gene loss. First, our model suggests that there are significant differences (up to a factor of seven) in duplicate gene half-life. Second, we observe differences between the genes that the model infers to have been lost soon after WGD and those lost more recently. Gene losses soon after WGD appear uncorrelated with gene expression level and knock-out fitness defect. However, later losses are biased toward genes whose paralogs have high expression and large knock-out fitness defects, and they also show biases toward certain functional groups such as ribosomal proteins. We suggest that while duplicate copies of some genes may be lost neutrally after WGD, another set of genes may be initially preserved in duplicate by natural selection for reasons including dosage.

Key Words: evolutionary model, gene dosage, genome duplication
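
Illustrative note: the abstract reports that duplicate gene half-lives differ by up to a factor of seven. The short Python sketch below shows what such a ratio would imply if duplicate loss followed simple exponential decay. That decay model, the function names, and the numeric half-lives are assumptions made for illustration only; they are not taken from the article, and only the sevenfold ratio comes from the abstract.

```python
import math


def loss_rate_from_half_life(half_life):
    """Loss rate per unit time, assuming exponential decay (an assumption here)."""
    return math.log(2) / half_life


def fraction_retained(half_life, t):
    """Fraction of duplicate pairs still present after time t under that assumption."""
    return 0.5 ** (t / half_life)


# Hypothetical half-lives in arbitrary time units; only their 7:1 ratio
# reflects the abstract.
short_half_life = 1.0
long_half_life = 7.0 * short_half_life

for t in (1.0, 3.0, 7.0):
    short_kept = fraction_retained(short_half_life, t)
    long_kept = fraction_retained(long_half_life, t)
    print(f"t={t}: short-lived class {short_kept:.3f}, long-lived class {long_kept:.3f}")
```

Under this assumed model, a sevenfold difference in half-life is the same as a sevenfold difference in loss rate, so by the time half of the long-lived duplicates are gone, fewer than 1% of the short-lived ones would remain.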