The Mitochondrial Genome of the Brachiopod Laqueus rubellus
Yasuhiro Noguchi^a, Kazuyoshi Endo^b, Fumio Tajima^c, and Rei Ueshima^c,d
a Institute of Biological Sciences, University of Tsukuba, Tsukuba 305-0006, Japan,
b Geological Institute, University of Tokyo, Tokyo 113-0033, Japan,
c Department of Biological Sciences, University of Tokyo, Tokyo 113-0033, Japan
d PRESTO, Japan Science and Technology Corporation, Kawaguchi, Saitama 332-0012, Japan
Corresponding author: Kazuyoshi Endo, Geological Institute, University of Tokyo, 7-3-1 Hongo, Tokyo 113-0033, Japan. E-mail: endo{at}geol.s.u-tokyo.ac.jp
Communicating editor: S. YOKOYAMA
| ABSTRACT |
|---|
The complete nucleotide sequence of the 14,017-bp mitochondrial (mt) genome of the articulate brachiopod Laqueus rubellus is presented. Being one of the smallest of known mt genomes, it has an extremely compact gene organization. While the same 13 polypeptides, two rRNAs, and 22 tRNAs are encoded as in most other animal mtDNAs, lengthy noncoding regions are absent, with the longest apparent intergenic sequence being 54 bp in length. Gene-end sequence overlaps are prevalent, and several stop codons are abbreviated. The genes are generally shorter, and three of the protein-coding genes are the shortest among known homologues. All of the tRNA genes indicate size reduction in either or both of the putative TΨC and DHU arms compared with standard tRNAs. Possession of a TV (TΨC arm-variable loop) replacement loop is inferred for tRNA(R) and tRNA(L-tag). The DHU arm appears to be unpaired not only in tRNA(S-tct) and tRNA(S-tga), but also in tRNA(C), tRNA(I), and tRNA(T), a novel condition. All the genes are encoded in the same DNA strand, which has a base composition rich in thymine and guanine. The genome has an overall gene arrangement drastically different from that of any other organisms so far reported, but contains several short segments, composed of 23 genes, which are found in other mt genomes. Combined cooccurrence of such gene assortments indicates that the Laqueus mt genome is similar to the annelid Lumbricus, the mollusc Katharina, and the octocoral Sarcophyton mt genomes, each with statistical significance. Widely accepted schemes of metazoan phylogeny suggest that the similarity with the octocoral could have arisen through a process of convergent evolution, while it appears likely that the similarities with the annelid and the mollusc reflect phylogenetic relationships.
The genome organization of animal mitochondrial (mt) DNA has been studied to gain insights into the regulatory mechanisms of the mitochondrial genetic system and to estimate evolutionary processes of the system itself or of the organisms that carry it. To date, complete nucleotide sequences for mt genomes have been reported for 87 species spanning eight animal phyla.
In addition to these more or less uniform properties, there are features that mtDNAs of many animals have in common, but there are considerable and rather systematic variations among higher-order taxa. Notable ones include relative gene order, modified genetic code, and variant structures of tRNAs and rRNAs among others.
For example, chordates so far examined commonly share the same mt gene order, although minor variations exist.
Disregarding the tRNA genes, the positions of which are more variable than those of the other genes [...]
The animal mt gene order, however, is not always conserved as exemplified above. The gene maps of cnidarians [...]
In this context, we initially hoped to find reliable phylogenetic hallmarks in the mt gene order of the brachiopod Laqueus rubellus, of which the complete DNA sequence is presented here. Brachiopods constitute one of the lophophorate phyla, which are of considerable importance in understanding patterns of animal evolution, having both deuterostome and protostome features, thus having occupied particularly controversial positions in schemes of metazoan phylogeny.
| MATERIALS AND METHODS |
|---|
Specimens of L. rubellus (Sowerby) were collected by dredging from Sagami Bay, central Japan (35° 07.9' N; 139° 35.1' E; 80–85 m in water depth), and kept in an aquarium until subsequent treatment. Total DNA was extracted from whole tissue of a single female individual using the CTAB method.
A partial sequence for the cox1 gene was determined first by amplifying the gene fragment using polymerase chain reaction (PCR) with "universal" primers (M. SAITO, S. KOJIMA and K. ENDO, unpublished results). Based on this sequence, a pair of primers was designed to amplify, by means of long-and-accurate PCR (LA-PCR), [...]
The amplified product (ca. 14 kbp) was digested with HindIII (Takara), ligated into pUC18 vector using DNA ligation kit ver. 2 (Takara), and transformed into Escherichia coli (strain DH5α).
Cloned fragments were sequenced using the ABI (Urayasu, Japan) Dye Primer CS+ or FS kit and an ABI Prism 373A automated sequencer. Contiguous sequences were assembled and the consensus sequence analyzed using GENETYX-MAC (Software Development, Tokyo, Japan). Protein and rRNA genes were identified by comparisons with corresponding known sequences of other metazoan taxa, including an annelid (L. terrestris) [...]
| RESULTS AND DISCUSSION |
|---|
Genome organization:
The gene content and organization of Laqueus mtDNA are summarized in Table 1. The genome size (14,017 bp) represents one of the smallest of known metazoan mt genomes. Comparably small mt genomes include those of the nematodes Caenorhabditis elegans (13,794 bp) and Ascaris suum (14,284 bp).
Laqueus mtDNA contains genes for 13 polypeptides [cox1–3, cob, atp6, atp8, nad1–6, and nad4L; for the standard genetic nomenclature of mitochondrial genes, see COMMISSION ON PLANT GENE NOMENCLATURE (1994)].
The overall gene order of the Laqueus mt genome is unique, so that there is no simple solution to interconvert the Laqueus mt gene map to any known maps of other animals (Fig 1). However, there exist some local gene arrangements in Laqueus mtDNA that are shared with other mtDNAs, and those instances are summarized in Table 3. Disregarding the tRNA genes and noncoding sequences, the gene arrangement nad6-nad3-nad4L, where the genes are transcribed in this order, is shared with the octocoral cnidarian Sarcophyton glaucum mtDNA, which also has the arrangement rns-nad1 found in Laqueus mtDNA. The assortment of the three genes cob-atp6-nad5 is shared by Laqueus and the annelid L. terrestris mtDNAs. A couple of two-gene segments are shared with mtDNAs of the nematode Meloidogyne javanica (atp6-nad5; rns-nad1), C. elegans and A. suum (rns-nad1; nad4-cox1), and the bivalve mollusc Mytilus edulis (nad5-nad6; nad1-nad4). The gene arrangement atp6-nad5 is also found in the sea anemone Metridium senile mtDNA, and thus is shared by mt genomes of organisms from four phyla, i.e., a brachiopod, an annelid, a nematode, and a cnidarian. The segment nad3-nad4L is also found in human and other vertebrate mtDNAs and hence is shared by mt genomes of a brachiopod, a cnidarian, and chordates. The segment rns-nad1 is shared by mtDNAs of a brachiopod, a cnidarian, and nematodes.
Positions of tRNA genes in different mt genomes are much more variable than those of other genes.
The sense strand of Laqueus mtDNA is 20.8% adenine, 15.2% cytosine, 26.5% guanine, and 37.6% thymine. The A + T content (58.4%) is within the range reported for other animal mt genomes, but is the lowest among invertebrate mt genomes (the A + T content ranges in chordates from 55.6 to 63.2%; echinoderms, 58.9–61.3%; molluscs, 59.8–70.7%; annelid, 61.6%; cnidarians, 62.5–64.5%; nematodes, 72.0–76.2%; and arthropods, 77.4–84.9%). The G + T content of the sense strands (64.1%) is also within the reported range, but is closer to the higher end (70.2%; A. suum). The base composition in codon third positions (A, 17.7%; C, 9.4%; G, 30.2%; and T, 42.7%) clearly indicates a bias toward a high relative frequency of G + T, a condition that could be related to the unique mechanism of asymmetric replication in animal mtDNAs.
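The composition figures quoted above can be cross-checked with a few lines of Python. This is only an illustrative sketch: the two dictionaries restate the percentages given in the text, and the helper name `content` is a hypothetical one introduced here, not part of the original analysis.

```python
# Percentages restated from the text (sense strand and codon third positions).
sense_strand = {"A": 20.8, "C": 15.2, "G": 26.5, "T": 37.6}
third_positions = {"A": 17.7, "C": 9.4, "G": 30.2, "T": 42.7}


def content(composition, bases):
    """Sum the percentages of the requested bases, rounded to one decimal."""
    return round(sum(composition[b] for b in bases), 1)


at_content = content(sense_strand, "AT")        # A + T of the sense strand
gt_content = content(sense_strand, "GT")        # G + T of the sense strand
gt_third = content(third_positions, "GT")       # G + T at codon third positions
print(at_content, gt_content, gt_third)
```

The sums reproduce the 58.4% A + T and 64.1% G + T values reported for the sense strand, and give 72.9% G + T at third codon positions.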
Protein genes:
The genes for 13 polypeptides (cox1–3, cob, atp6, atp8, nad1–6, and nad4L) of Laqueus mtDNA were identified by comparison of the inferred amino acid sequence and size similarities to those of known homologues. Based on comparisons of nucleotide and amino acid sequences of the cox1 gene, brachiopod mtDNAs, including that of L. rubellus, have been inferred to employ the same modified mt genetic codes as in nematodes, arthropods, molluscs, and an annelid (M. SAITO, S. KOJIMA and K. ENDO, unpublished results); namely, AGA and AGG code for serine, TGA for tryptophan, and ATA for methionine. This inference is supported by the complete nucleotide sequence determined in this study.
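As an illustration of the code modifications just listed, the following Python sketch builds the standard genetic code and then overrides the four codons named in the text. The names `standard_code`, `laqueus_mt_code`, and `translate` are hypothetical helpers introduced here, and the sketch does not model the unorthodox initiation codons or the abbreviated stop codons discussed elsewhere in this section.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order at each position.
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

standard_code = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AA)
}

# Reassignments stated in the text: AGA/AGG -> Ser, TGA -> Trp, ATA -> Met.
laqueus_mt_code = dict(standard_code)
laqueus_mt_code.update({"AGA": "S", "AGG": "S", "TGA": "W", "ATA": "M"})


def translate(dna, code=laqueus_mt_code):
    """Translate an in-frame DNA string codon by codon (trailing bases ignored)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(code[dna[i:i + 3]] for i in range(0, usable, 3))


# The same four codons read differently under the two codes.
print(translate("ATAAGATGATAA"))                 # modified mt code
print(translate("ATAAGATGATAA", standard_code))  # standard code
```

Under the modified code the test string reads Met-Ser-Trp-stop, whereas the standard code gives Ile-Arg-stop-stop.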
Six protein-coding genes start with the orthodox translation initiation codon ATG, three genes (nad4L, atp8, nad2) with ATT, three (nad5, nad3, nad1) with GTG, and the remaining one with CTG (cox1). Among the genes for which the GTG translation initiation codon is inferred, nad3 could alternatively start with ATT immediately after the GTG codon, or with ATG four codons downstream. An in-frame ATG codon exists 9 codons and 16 codons downstream of the GTG codon of putative nad1 and nad5 genes, respectively, but if it is taken as the initiation site for each gene, then in each case, a segment containing what appear to be conserved amino acid residues in comparison with Lumbricus, Katharina, and Drosophila needs to be left out. A similar argument applies to the cox1 gene, which has an in-frame ATG codon five codons downstream of the inferred CTG start codon. Seven genes end in a complete termination codon, either TAG or TAA (Table 1). The remaining six genes, each of which is immediately followed by a tRNA gene, are inferred to terminate with an incomplete stop codon, T.
The proteins of the inferred lengths are generally shorter than most other previously described ones (Table 4), and three of them (Atp8, Cox2, Nad2) have either the same or a shorter size relative to the shortest known homologue. In Laqueus, Nad1, Nad2, Nad4L, and Nad6 are 9, 17, 12, and 11%, respectively, shorter than in Drosophila. Size differences for other proteins are within 5% variation in comparison with Drosophila, Katharina, and Lumbricus.
All codons are used in the 13 protein genes of Laqueus mtDNA (Table 5). In fourfold synonymous codon families, either T or G is the most frequently used nucleotide at the third codon position. In codon groups ending at purine and in those ending at pyrimidine, G and T are always used more frequently at the third codon position than A and C, respectively. In total, 72.9% of codons end at T or G. The most frequently used codon is TTT (Phe), followed by GTT (Val), GGG (Gly), and TTG (Leu), all of which consist exclusively of T and/or G (Table 5). On the other hand, the most frequently used amino acid in the 13 protein genes of Laqueus mtDNA is Leu (14.9%), followed by Val (12.1%), Ser (10.5%), Gly (9.5%), and Phe (8.0%). Among these amino acids, Val and Gly exhibit considerably higher relative frequency values compared with the ranges observed for other animal mt genomes (Val, 4.2–8.3%; Gly, 5.5–7.6%).
Comparisons of the percentage of amino acid identity between mt protein genes of Laqueus and those of human, sea urchin (S. purpuratus), Drosophila, Katharina, and Lumbricus generally indicate that Laqueus sequences are more similar to protostome homologues than to deuterostome ones (Table 4). The cox1 gene, the most conserved of all protein genes, of Laqueus shows almost the same extent of similarity to that of the five organisms compared. The cox3, cob, and cox2 genes, the next conserved ones, and atp6 indicate the highest similarities between Katharina and Laqueus. For the remaining eight genes, however, the highest identity is observed between Lumbricus and Laqueus.
Transfer RNA genes:
Twenty-two tRNA genes typical of animal mtDNAs have been identified in the Laqueus mt genome. The inferred Laqueus mt-tRNAs have a number of uniform features that are invariant in standard tRNAs, such as possession of a 7-bp amino-acyl arm, a 5-bp anticodon stem, and a 4-bp variable loop [except in tRNA(L-tag) and tRNA(R)]; the nucleotide preceding the anticodon is T, which is preceded by a pyrimidine [except in tRNA(H)]; and the nucleotide after the anticodon is a purine. But they exhibit some notable aberrancy (Fig 2).
In the putative Laqueus mt-tRNAs for Arg and Leu(tag), the TΨC arm and variable loop are replaced by a single loop (TV replacement loop), as found in nematode mt-tRNAs. [...] TΨC arm and variable loop, the stem in the TΨC arm is generally short, being 1–3 bp in length [except in tRNA(S-tct) and tRNA(I)].
In the Laqueus mt-tRNAs for Cys, Ile, Ser(tct), Ser(tga), and Thr, the DHU arm is replaced by a loop. That tRNA(S-tct) has an unpaired DHU arm is a typical feature of animal mtDNAs.
The anticodons of Laqueus mt-tRNA genes are identical with those of the annelid Lumbricus.
Ribosomal RNA genes:
The two mt-rRNA genes of Laqueus mtDNA, identified by comparisons of nucleotide similarities with those of other animals, are arranged side by side (rnl-rns in the encoded direction) in the mt genome without apparent coding sequences between them. In other known animal mt genomes, the two rRNA genes are intervened by at least one gene, which in many cases is trnV.
As is the case in many other mtDNAs, the precise boundaries of these genes remain uncertain. Fig 3 shows a comparison of the 5' and 3' regions with known homologues, for both genes aligned to the nucleotide sequence for the region containing the two rRNAs and the flanking upstream [trnL(tag)] and downstream (trnM) gene segments of Laqueus mtDNA. Close to the 3' end of an rns gene is an 18-bp conserved region that is followed by an inverted repeat (underlines in Fig 3) corresponding to the final stem and loop structure of Drosophila SrRNA.
Comparisons of the 5' regions of rnl genes among various animals reveal that the first conserved region, 19 bp in length, is observed after 122, 132, and 178 nucleotides from the 5' end in Lumbricus, Katharina, and Drosophila rnl genes, respectively (Fig 4). In Laqueus, there are only 61 nucleotides separating this conserved region and the preceding sequence for trnL(tag), and we assume that the 5' region of the Laqueus rnl gene occupies all this available space.
The 3' end of the Laqueus rnl gene leaves us with considerable ambiguity. The numbers of nucleotides after the last widely conserved sequence of 13 bp (TAGTACGAAAGGA in Laqueus) in the 3' regions of rnl genes in Lumbricus, Katharina, and Drosophila are 72, 29, and 54, respectively (Fig 3). In Laqueus, assuming that our interpretation of the 5' end of rns is correct, the number of nucleotides available after that conserved region and before the next gene (rns) is 95. The 3' end of Laqueus rnl appears to be at least downstream of the point corresponding to the 3' end of Drosophila rnl because there is a 9-bp conserved sequence (ATTAATATA) in this region that corresponds to the final stem and loop structure of Drosophila LrRNA (Fig 3).
Noncoding sequences:
The Laqueus mt genome is extremely compact. Out of the 37 gene boundaries of the genome, a total of 16 boundaries indicate sequence overlap, and gene pairs at 13 boundaries directly abut each other (Table 1). Among the remaining 8 boundaries, which accommodate a total of 79 unassigned nucleotides, only the 54-bp region between trnC and trnN contains a sequence longer than 10 bp.
The A + T content of the 54-bp region is 66.7%, which is higher than the average of the whole genome, but otherwise the region does not have the typical features that are often found in the largest noncoding region of other animal mtDNAs, such as a strong potential to form a secondary structure, extensive polypurine and polypyrimidine tracts, certain conserved sequences, and direct repeat motifs.
Genome features and their interrelationships:
The Laqueus mt genome exhibits some unusual features compared with other familiar animal mt genomes. Those include (1) small genome size, (2) absence of lengthy noncoding regions containing possible signals for transcription and replication, (3) truncated tRNA genes with aberrant inferred structures, (4) encoding of all the genes in the same DNA strand, and (5) absence of well-conserved gene arrangements compared to mt genomes of other phyla.
Since combinations of some of these features are also observed in mt genomes of other phyla, notably nematodes and molluscs (cf. Table 2), there exist grounds to suspect that at least some of them are interrelated with each other. The feature (1) is obviously not independent of (2) or (3), since the latter two directly contribute to the former, but their interdependence is not self-evident because small mt genomes do not always have the feature (2) as exemplified in the nematodes C. elegans and A. suum.
Out of the possible 10 combinations of the above-mentioned five features, none exhibits perfect bidirectional correlation among animal mt genomes compared (cf. Table 2). However, we note that a small genome of ~14 kbp in size always shows an unconserved gene order (the reverse does not hold as is evident in the case of Mytilus mtDNA). Correlation does not necessarily mean causal link, but it may be possible to argue that genome size reduction resulted in extensive reorganization of the genome. If this is the case, then there should exist at least another mechanism that has led to the considerable gene rearrangements in the moderately sized Mytilus mtDNA.
Statistical significance of shared gene boundaries:
Whatever the underlying mechanisms, it appears evident that much more gene rearrangement has taken place in the lineage leading to the analyzed Laqueus mtDNA than in other eucoelomate mtDNA lineages, such as those of arthropods, chordates, and echinoderms. However, as already described, there are several gene juxtapositions shared between Laqueus and other animal mtDNAs (Table 3). To assess statistical significance of these findings, we calculated probabilities that certain shared gene boundaries between different mt genomes arise purely by chance, on the basis of a similar reasoning applied to the issue of tRNA cluster conservation in echinoderm and other mtDNAs.
First, the number of all the possible gene arrangements for a closed circular (cc) mtDNA containing a total of x genes is calculated. This may be given by linearizing the DNA at a given gene end (say 5' end), fixing the gene at either orientation (say 5' to 3'), then counting the number of permutations of the remaining genes, which is $(x - 1)!$, multiplied by $2^{x-1}$, to take into account that each remaining gene can take two different orientations (designated as case II). If we assume that the genes can be encoded in only one and the same DNA strand due to some constraints (designated as case I), then the equivalent number would simply be $(x - 1)!$. The number of all the possible arrangements for mtDNA containing the usual set of 37 genes as in Laqueus would be $36! \times 2^{36}$, or $2.56 \times 10^{52}$. Disregarding the 22 tRNA genes, the number of possible different ways to arrange the remaining 13 protein and two rRNA genes would be $1.43 \times 10^{15}$. In case I, the equivalent numbers for the 37 genes and 15 genes become $3.72 \times 10^{41}$ and $8.72 \times 10^{10}$, respectively.
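The counts in the preceding paragraph can be reproduced with exact integer arithmetic. The sketch below is illustrative only; the function name `n_arrangements` and its `one_strand` flag (True for case I, False for case II) are assumptions introduced here.

```python
from math import factorial


def n_arrangements(x, one_strand=False):
    """Number of gene arrangements on a circular genome of x genes.

    One gene is fixed in position and orientation; the other x - 1 genes are
    permuted and, unless one_strand is True (case I), each may take either
    of two orientations (case II).
    """
    permutations = factorial(x - 1)
    return permutations if one_strand else permutations * 2 ** (x - 1)


for genes in (37, 15):
    print(genes,
          f"case II: {n_arrangements(genes):.2e}",
          f"case I: {n_arrangements(genes, one_strand=True):.2e}")
```

For 37 and 15 genes this gives the four figures quoted in the text.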
We then calculate the chance that a given segment composed of y genes would be found by chance in a ccDNA composed of x genes. In case II, this is given by the number of ways a segment of y genes may be inserted in either direction into the ccDNA composed of the remaining $x - y$ genes [$(x - y) \times 2$], multiplied by the number of possible gene orders in the ccDNA of $x - y$ genes [$(x - y - 1)! \times 2^{x-y-1}$], divided by the total number of possible arrangements for the ccDNA containing x genes [$(x - 1)! \times 2^{x-1}$], which is equivalent to $(x - y)!/[(x - 1)! \times 2^{y-1}]$. In case I, the equivalent value would be $(x - y)!/(x - 1)!$.
The random probability of finding a given 3-gene segment with a given gene order for the genome of 15 genes would thus be $12!/(14! \times 2^2) = 0.00137$ in case II and $12!/14! = 0.00549$ in case I. For the genome of 16 genes excluding tRNA genes (as in the cnidarian S. glaucum, which has an extra protein gene in addition to the usual set of genes), the corresponding figure in case II would equal $13!/(15! \times 2^2) = 0.00119$. The random probability of finding a particular 3-gene segment in both a 15-gene genome (such as Laqueus) and a 16-gene genome (such as Sarcophyton) is therefore calculated as $0.00137 \times 0.00119 = 1.63 \times 10^{-6}$. Similarly, the probabilities that a particular 3-gene segment is found in both Laqueus and Lumbricus mtDNAs, both being 15-gene genomes (excluding tRNA genes) in which genes are encoded in the same DNA strand, are calculated as $0.00137 \times 0.00137 = 1.88 \times 10^{-6}$ in case II and $0.00549 \times 0.00549 = 3.01 \times 10^{-5}$ in case I.
On the other hand, the number of different kinds of arrangements that a y-gene segment in an x-gene genome ($y < x$) can take in case II is calculated as the number of permutations of y genes (${}_xP_y$), multiplied by the number of combinations for the individual gene directions ($2^y$), divided by the redundancy arising from counting pairs of the same segments of different directions separately (2), which is equivalent to $x! \times 2^{y-1}/(x - y)!$. The equivalent number in case I would be ${}_xP_y$, or $x!/(x - y)!$. Thus there are $15! \times 2^2/12! = 10{,}920$ (case II) and $15!/12! = 2730$ (case I) kinds of 3-gene segments of different gene orders that can occur in a 15-gene genome. This leads to expectations that a total of $1.63 \times 10^{-6} \times 10{,}920 = 0.018$ 3-gene segments of the same gene orders is to be shared between Laqueus and Sarcophyton mtDNAs (excluding tRNA genes) and that a total of $1.88 \times 10^{-6} \times 10{,}920 = 0.021$ (case II), or $3.01 \times 10^{-5} \times 2730 = 0.082$ (case I), 3-gene segments of the same gene orders is to be shared between Laqueus and Lumbricus mtDNAs, solely by random associations.
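The three steps above (probability of a given segment, number of segment kinds, expected number of shared segments) can be sketched in Python as follows. The function names and the `one_strand` flag (True for case I) are hypothetical; following the text, the number of segment kinds is counted in the first genome passed to `expected_shared`.

```python
from math import factorial


def p_segment(x, y, one_strand=False):
    """Chance that a given y-gene segment occurs in a random circular x-gene genome."""
    p = factorial(x - y) / factorial(x - 1)
    return p if one_strand else p / 2 ** (y - 1)


def n_segment_kinds(x, y, one_strand=False):
    """Number of distinct y-gene segments that can occur in an x-gene genome."""
    kinds = factorial(x) // factorial(x - y)
    return kinds if one_strand else kinds * 2 ** (y - 1)


def expected_shared(x1, x2, y, one_strand=False):
    """Expected number of y-gene segments shared by random x1- and x2-gene genomes."""
    return (p_segment(x1, y, one_strand)
            * p_segment(x2, y, one_strand)
            * n_segment_kinds(x1, y, one_strand))


print(round(expected_shared(15, 16, 3), 3))                   # Laqueus vs Sarcophyton
print(round(expected_shared(15, 15, 3), 3))                   # Laqueus vs Lumbricus, case II
print(round(expected_shared(15, 15, 3, one_strand=True), 3))  # Laqueus vs Lumbricus, case I
```

Working with unrounded intermediate values, the sketch arrives at the same three expectations as the text (0.018, 0.021, and 0.082).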
If the total numbers of genes in the genomes are large enough compared with the size of the shared segments in question, the frequency of the shared segments of a certain size between two genomes arising from random associations can be approximated by a Poisson distribution (see Appendix). From the expected number of shared segments between genomes (m) as calculated above, the Poisson probabilities [$P(m, k) = m^k e^{-m}/k!$] for each number of occurrences (of shared segments of a certain size), from 0, 1, 2 to k times, can now be computed. The probability that at least one 3-gene segment of the same gene order is shared between Laqueus and Sarcophyton mtDNAs would then be calculated as $1 - 0.018^0 \times e^{-0.018}/0! = 0.018$. The results of similar calculations for other shared segments between Laqueus and other animal mtDNAs are summarized in Table 3.
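A minimal Python sketch of the Poisson step is given here for illustration. The helper names `poisson_pmf` and `p_at_least` are assumptions introduced for this example; the mean of 0.018 is the expected number of shared 3-gene segments between Laqueus and Sarcophyton quoted in the text.

```python
from math import exp, factorial


def poisson_pmf(m, k):
    """Poisson probability of exactly k occurrences when the mean is m."""
    return m ** k * exp(-m) / factorial(k)


def p_at_least(m, k):
    """Probability of k or more occurrences: one minus the terms for 0 .. k - 1."""
    return 1 - sum(poisson_pmf(m, i) for i in range(k))


# At least one shared 3-gene segment, with expected number m = 0.018.
print(round(p_at_least(0.018, 1), 3))
```

Because the mean is so small, the probability of at least one shared segment is nearly equal to the mean itself, which is why the expectation and the probability both round to 0.018.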
The results concerning only protein and rRNA genes indicate that although each of the bivalve Mytilus and three nematode mtDNAs has two two-gene segments in common with Laqueus mtDNA, these conditions are within the range expected to arise from random processes (P = 0.090), and the probabilities become even less significant assuming the coding strand constraint (P = 0.263). Sharing of a single two-gene segment, each with human and Metridium mtDNAs, is not statistically significant either (P = 0.415). However, the combined cooccurrence of a two-gene and a three-gene segment in both Laqueus and the octocoral Sarcophyton mtDNAs is highly significant (P = 0.004). Also significant is the sharing of a three-gene segment between Laqueus and the annelid Lumbricus mtDNAs (P = 0.021), although the probability becomes out of the value normally considered as significant (i.e., P = 0.05) when we invoke the coding strand constraint (P = 0.079).
It therefore appears highly unlikely that these similarities with Sarcophyton and with Lumbricus (albeit being less significant) arose by chance. Similarities in gene orders are usually attributed to shared common ancestry. However, since it is difficult to envisage common ancestry among the diploblastic cnidarian Sarcophyton and the triploblastic coelomates Lumbricus and Laqueus, to the exclusion of other coelomates, such as arthropods, echinoderms, and chordates, it appears likely that the similarity with Sarcophyton, and quite possibly with Lumbricus as well, did not arise from recent shared common ancestry. Possibilities remain, however, that some ancient gene orders somehow survived in divergent taxa of different phyla and that observed similarities represent shared primitive characters of metazoan mt genomes. Otherwise, if not by chance nor by shared ancestry, the similarities may only be explained by convergent evolution.
The facts that a part of the shared three-gene segment with Sarcophyton (nad3-nad4L) is also shared by human mtDNA and that the two-gene segment shared with Sarcophyton (rns-nad1) is also shared by nematode mtDNAs, despite the improbable combinations of these taxa as an evolutionary unit, provide support for the interpretation that at least some of those shared gene arrangements arose by a process of convergent evolution. The Poisson probabilities for these segments held in common by three genomes are significant (P = 0.017 and 0.019, respectively). A similar argument can be made as to a part of the three-gene segment shared with Lumbricus mtDNA (atp6-nad5), which is shared by a total of at least four genomes, including the cnidarian Metridium and the nematode Meloidogyne, of which Poisson probability is also highly significant (P = 0.001; Table 3). There is, however, no convincing evidence to indicate functional advantages or constraints for certain local arrangements of protein and rRNA genes in mt genomes.
For the majority of comparisons considering all the genes including tRNA genes, the Poisson probabilities are insignificant. A notable exception is the comparison with Lumbricus mtDNA, which shares as many as five two-gene segments with Laqueus mtDNA, and the Poisson probability is highly significant even when estimated under the coding strand constraint ($P = 2.0 \times 10^{-4}$ in case II; P = 0.004 in case I; Table 3). Occurrence of three identical two-gene segments in Laqueus and the mollusc Katharina is also significant (P = 0.015; Table 3).
One each of the five segments shared by Laqueus and Lumbricus is also shared by Drosophila and Katharina, and these conditions in which each of the two-gene segments is shared by the three genomes are statistically significant (P = 0.014 for each comparison), but, unlike in the case with Sarcophyton, the association of Laqueus, Lumbricus, Drosophila, and Katharina does not necessarily conflict with at least some of the phylogenetic schemes proposed to date.
Although it is not just shared characters but shared derived characters that count in phylogenetic inferences, the polarities for the shared gene assortments observed in this study are difficult to infer, because diploblastic animals, the only apparent candidates as outgroups, generally lack most of the tRNA genes on their mt genomes, hindering gene order comparisons including tRNA genes. However, considering the variable nature of tRNA gene positions even within various phyla, and the frequent gene rearrangements that the Laqueus mt genome is inferred to have experienced, it can be assumed that most, if not all, of the local gene arrangements of Laqueus shared with Lumbricus or with Katharina represent derived states among the gene arrangements of the coelomate animals compared. It thus appears safe to interpret that the brachiopod mtDNA is closer phylogenetically to the annelid and mollusc mtDNAs than to any known mtDNAs of other animal phyla.
| ACKNOWLEDGMENTS |
|---|
We are grateful to Bernard Cohen, Jeffrey Boore, and Kevin Helfenbein for helpful comments on the manuscript. This work was funded by grants from the Ministry of Education, Science, Sports, and Culture of Japan (K.E. and R.U.) and from Toray Science Foundation (R.U.).
Manuscript received November 3, 1999; Accepted for publication January 7, 2000.
| APPENDIX |
|---|
STATISTICAL TEST FOR RANDOM GENE ARRANGEMENTS
Consider two circular DNA sequences on which n genes are randomly distributed. We consider two types of random gene arrangements. One is the case where the direction of transcription is the same among all genes, and we call this case I. The other is the case where the direction of transcription for each gene is also random, and we call this case II.
When we compare two sequences, we can observe the number of shared arrangements with k genes ($S_k$). For example, when we have two sequences shown in Fig 4, we have $S_2 = 3$ and $S_3 = 1$. Note that when there is one shared arrangement with three genes, we treat this as two shared arrangements with two genes. We can also find the longest shared arrangement, and we denote the number of genes in this arrangement by $L_{\max}$. In this example the longest shared arrangement has three genes so that we have $L_{\max} = 3$.
The expected number of shared arrangements with k genes, $E(S_k)$, can be obtained as follows. In case I, when one sequence has gene B next to gene A, the probability that the other sequence also has gene B next to gene A is $1/(n - 1)$. Since there are n genes, the expected number of shared arrangements with two genes is $E(S_2) = n/(n - 1)$. In the same way the probability of having a particular shared arrangement with three genes is $1/\{(n - 1)(n - 2)\}$, so that the expected number of shared arrangements with three genes is $E(S_3) = n/\{(n - 1)(n - 2)\}$. In general the expected number of shared arrangements with k genes is given by
$$E(S_k) = \frac{n\,(n-k)!}{(n-1)!} \qquad \text{(A1)}$$
In case II the direction of transcription is random, so that we have
$$E(S_k) = \frac{n\,(n-k)!}{2^{k-1}\,(n-1)!} \qquad \text{(A2)}$$
It is difficult to obtain the distribution of $S_k$ when n is large. It might be expected, however, that the distribution of $S_k$ follows the Poisson distribution with mean $E(S_k)$. To know whether the Poisson distribution is a good approximation or not, we conducted a computer simulation. In this simulation, using pseudorandom numbers, we generated a pair of random sequences 10,000,000 times and observed $S_2$. The results are shown in Table 6. From this table we can see that the distribution of $S_2$ approximately follows the Poisson distribution with mean $E(S_2)$ when $n \ge 10$. This means that $S_2$ can be used to test for random gene arrangement. For example, when n = 37 and $S_2 = 4$, the probability of having $S_2 \ge 4$ is 0.021 so that we can reject the hypothesis of random gene arrangement at the 5% level in case I.
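The kind of simulation described above can be sketched in Python. This is not the authors' program: it covers case I only, uses far fewer replicates than the 10,000,000 reported, and the names `expected_s`, `poisson_tail`, and `simulate_s2` are hypothetical. It takes $E(S_k) = n(n-k)!/(n-1)!$ for case I, which reduces to the $E(S_2) = n/(n-1)$ given in the text.

```python
import random
from math import exp, factorial


def expected_s(n, k):
    """Expected number of shared k-gene arrangements in case I."""
    return n * factorial(n - k) / factorial(n - 1)


def poisson_tail(m, k):
    """Poisson probability of k or more events when the mean is m."""
    return 1 - sum(m ** i * exp(-m) / factorial(i) for i in range(k))


def simulate_s2(n, trials, seed=0):
    """Draw pairs of random circular gene orders (case I) and record S2 for each pair."""
    rng = random.Random(seed)
    genes = list(range(n))
    counts = []
    for _ in range(trials):
        a, b = genes[:], genes[:]
        rng.shuffle(a)
        rng.shuffle(b)
        # Map each gene in b to the gene that follows it on the circle.
        next_in_b = {b[i]: b[(i + 1) % n] for i in range(n)}
        # S2 counts adjacent pairs of a that are also adjacent, in order, in b.
        counts.append(sum(next_in_b[a[i]] == a[(i + 1) % n] for i in range(n)))
    return counts


counts = simulate_s2(37, 20000)
print("simulated mean S2:", sum(counts) / len(counts))
print("E(S2):", expected_s(37, 2))
print("Poisson P(S2 >= 4):", round(poisson_tail(expected_s(37, 2), 4), 3))
```

The Poisson tail for n = 37 comes out at about 0.021, the value used in the worked example, and the simulated mean of $S_2$ should lie close to 37/36.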
Next, we consider the number of shared arrangements that have exactly k genes (Lk). In the example shown in Fig. 4, we have L2 = 1 and L3 = 1. The expected number of arrangements that have exactly k genes, E(Lk), can be obtained as follows. We note that Sn = nLn, a case that arises only when the gene arrangement is exactly the same in the two sequences. Since a gene arrangement with exactly k + i genes contributes i + 1 times to Sk, and since completely identical gene arrangements between two sequences contribute n times to Sk, we have $S_k = L_k + 2L_{k+1} + 3L_{k+2} + \cdots + (n - k)L_{n-1} + nL_n$. Therefore, when k ≤ n - 2, we have $L_k = S_k - 2S_{k+1} + S_{k+2}$. From these results, E(Lk) can be given by
$$E(L_k) = E(S_k) - 2E(S_{k+1}) + E(S_{k+2}) \qquad (k \le n - 2) \tag{A3}$$
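The relation between Sk and Lk can be checked numerically. The sketch below builds every Sk from a set of Lk values using the sum given above and then recovers the Lk values by the second difference. The function names are illustrative, and the genome size n = 10 is a hypothetical choice; only L2 = 1 and L3 = 1 are taken from the Fig. 4 example.

```python
from typing import Dict


def shared_counts_from_runs(L: Dict[int, int], n: int) -> Dict[int, int]:
    """S_k = L_k + 2 L_{k+1} + ... + (n-k) L_{n-1} + n L_n for k = 2..n."""
    S = {}
    for k in range(2, n + 1):
        total = n * L.get(n, 0)  # identical arrangements contribute n times
        for j in range(k, n):    # arrangements with exactly j genes, j <= n-1
            total += (j - k + 1) * L.get(j, 0)
        S[k] = total
    return S


def runs_from_shared_counts(S: Dict[int, int], n: int) -> Dict[int, int]:
    """L_k = S_k - 2 S_{k+1} + S_{k+2}, valid for 2 <= k <= n-2."""
    return {k: S[k] - 2 * S[k + 1] + S[k + 2] for k in range(2, n - 1)}


if __name__ == "__main__":
    n = 10                      # hypothetical genome size
    L = {2: 1, 3: 1}            # values from the Fig. 4 example
    S = shared_counts_from_runs(L, n)
    print(S[2], S[3])           # 3 1
    print(runs_from_shared_counts(S, n))
```

Because the relation is linear, taking expectations of both sides gives the corresponding relation between E(Lk) and E(Sk).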
Now, let us consider the longest shared arrangement. First, we note that when k ≥ n/2 in case I and when k > n/2 in case II, the probability that the longest shared arrangement has k genes is the same as E(Lk). This is because a longest shared arrangement of this size can occur only once. When k < n/2 in case I and when k ≤ n/2 in case II, more than one longest shared arrangement can occur. Such events, however, might be very rare when k ≥ 3, since E(L3) is quite small. Thus, when k ≥ 3, the probability that the longest shared arrangement has k genes is approximately given by
$$\mathrm{Prob}\{L_{\max} = k\} \approx E(L_k) \tag{A4}$$
To assess the accuracy of Equation A4, we conducted a computer simulation. The method is the same as the previous one, and the results are shown in Table 7. We can see from this table that Equation A4 is quite accurate and can be used to test for random gene arrangement. For example, when n = 37 and Lmax = 3, we have Prob{Lmax ≥ 3} = 0.0072, so that we reject the hypothesis of random gene arrangement at the 1% level in case II.
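The worked example can be reproduced with a short sketch. It assumes that Prob{Lmax ≥ 3} is obtained by summing the approximation Prob{Lmax = k} ≈ E(Lk) over k from 3 to n - 2, with E(Lk) = E(Sk) - 2E(Sk+1) + E(Sk+2), and that the contributions of k = n - 1 and k = n are negligible. The case II expectation used here, with a factor 1/2 per adjacency, is likewise an assumption, and all function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def expected_shared(n: int, k: int, random_direction: bool = False) -> Fraction:
    """E(S_k): n (n-k)! / (n-1)!, divided by 2^(k-1) in case II (assumed)."""
    value = Fraction(n * factorial(n - k), factorial(n - 1))
    if random_direction:
        value /= 2 ** (k - 1)
    return value


def expected_exact(n: int, k: int, random_direction: bool = False) -> Fraction:
    """E(L_k) = E(S_k) - 2 E(S_{k+1}) + E(S_{k+2}), for 2 <= k <= n-2."""
    if not 2 <= k <= n - 2:
        raise ValueError("k must satisfy 2 <= k <= n - 2")
    return (
        expected_shared(n, k, random_direction)
        - 2 * expected_shared(n, k + 1, random_direction)
        + expected_shared(n, k + 2, random_direction)
    )


def prob_longest_at_least(n: int, k_min: int, random_direction: bool = False) -> float:
    """Approximate Prob{Lmax >= k_min} for k_min >= 3 by summing E(L_k)
    over k_min <= k <= n-2 (terms for k = n-1 and k = n are ignored)."""
    if k_min < 3:
        raise ValueError("the approximation is intended for k_min >= 3")
    total = sum(expected_exact(n, k, random_direction) for k in range(k_min, n - 1))
    return float(total)


if __name__ == "__main__":
    # Case II, n = 37, Lmax = 3.
    print(round(prob_longest_at_least(37, 3, random_direction=True), 4))  # 0.0072
```

The sum telescopes to approximately E(S3) - E(S4), so nearly all of the probability comes from shared arrangements of exactly three genes.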
Table 8 shows the critical values for Sk and Lmax, which can be used for hypothesis testing.








