<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article
  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="3.0" xml:lang="EN">
<front>
<journal-meta><journal-id journal-id-type="publisher-id">plos</journal-id><journal-id journal-id-type="nlm-ta">PLoS Pathog</journal-id><journal-id journal-id-type="pmc">plospath</journal-id><!--===== Grouping journal title elements =====--><journal-title-group><journal-title>PLoS Pathogens</journal-title></journal-title-group><issn pub-type="ppub">1553-7366</issn><issn pub-type="epub">1553-7374</issn><publisher>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, USA</publisher-loc></publisher></journal-meta>
<article-meta><article-id pub-id-type="publisher-id">09-PLPA-RA-1897R3</article-id><article-id pub-id-type="doi">10.1371/journal.ppat.1000999</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="Discipline"><subject>Biochemistry/RNA Structure</subject><subject>Evolutionary Biology/Microbial Evolution and Genomics</subject><subject>Infectious Diseases/Protozoal Infections</subject><subject>Infectious Diseases/Sexually Transmitted Diseases</subject><subject>Microbiology/Parasitology</subject><subject>Molecular Biology/Molecular Evolution</subject></subj-group></article-categories><title-group><article-title>A Metazoan/Plant-like Capping Enzyme and Cap Modified Nucleotides in the Unicellular Eukaryote <italic>Trichomonas vaginalis</italic></article-title><alt-title alt-title-type="running-head"><italic>Trichomonas</italic> Capping and mRNA Cap</alt-title></title-group><contrib-group>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Simoes-Barbosa</surname><given-names>Augusto</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Hirt</surname><given-names>Robert P.</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Johnson</surname><given-names>Patricia J.</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib>
</contrib-group><aff id="aff1"><label>1</label><addr-line>Department of Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, Los Angeles, California, United States of America</addr-line>       </aff><aff id="aff2"><label>2</label><addr-line>Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne, United Kingdom</addr-line>       </aff><contrib-group>
<contrib contrib-type="editor" xlink:type="simple"><name name-style="western"><surname>Tschudi</surname><given-names>Christian</given-names></name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"/></contrib>
</contrib-group><aff id="edit1">Yale School of Public Health, United States of America</aff><author-notes>
<corresp id="cor1">* E-mail: <email xlink:type="simple">johnsonp@ucla.edu</email></corresp>
<fn fn-type="con"><p>Conceived and designed the experiments: ASB RPH PJJ. Performed the experiments: ASB RPH. Analyzed the data: ASB RPH PJJ. Contributed reagents/materials/analysis tools: RPH. Wrote the paper: ASB RPH PJJ.</p></fn>
<fn fn-type="conflict"><p>The authors have declared that no competing interests exist.</p></fn></author-notes><pub-date pub-type="collection"><month>7</month><year>2010</year></pub-date><pub-date pub-type="epub"><day>15</day><month>7</month><year>2010</year></pub-date><volume>6</volume><issue>7</issue><elocation-id>e1000999</elocation-id><history>
<date date-type="received"><day>21</day><month>10</month><year>2009</year></date>
<date date-type="accepted"><day>11</day><month>6</month><year>2010</year></date>
</history><!--===== Grouping copyright info into permissions =====--><permissions><copyright-year>2010</copyright-year><copyright-holder>Simoes-Barbosa et al</copyright-holder><license><license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</license-p></license></permissions><abstract>
<p>The cap structure of eukaryotic messenger RNAs is initially elaborated through three enzymatic reactions: hydrolysis of the 5′-triphosphate, transfer of guanosine through a 5′-5′ triphosphate linkage and N7-methylation of the guanine cap. Three distinct enzymes catalyze these reactions in various microbial eukaryotes, whereas the first two enzymes are fused into a single polypeptide in metazoans and plants. In addition to the guanosine cap, adjacent nucleotides are 2′-<italic>O</italic>-ribose methylated in metazoa and plants, but not in yeast. Analyses of various cap structures have suggested a linear phylogenetic trend of complexity. These findings have led to a model in which plants and metazoa evolved a two-component capping apparatus and modification of adjacent nucleotides while many microbial eukaryotes maintained the three-component system and did not develop modification of adjacent nucleotides. Here, we have characterized a bifunctional capping enzyme in the divergent microbial eukaryote <italic>Trichomonas vaginalis</italic> using biochemical and phylogenetic analyses. This unicellular parasite was found to harbor a metazoan/plant-like capping apparatus that is represented by a two-domain polypeptide containing a C-terminal guanylyltransferase and a cysteinyl phosphatase triphosphatase, distinct from its counterpart in other microbial eukaryotes. In addition, <italic>T. vaginalis</italic> mRNAs contain a cap 1 structure represented by m<sup>7</sup>GpppAmpUp or m<sup>7</sup>GpppCmpUp, a feature typical of metazoan and plant mRNAs but absent in yeast mRNAs. Phylogenetic and biochemical analyses of the origin of the <italic>T. vaginalis</italic> capping enzyme suggest a complex evolutionary model where differential gene loss and/or acquisition occurred in the development of the RNA capping apparatus and cap modified nucleotides during eukaryote diversification.</p>
</abstract><abstract abstract-type="summary"><title>Author Summary</title>
<p>The protozoan parasite <italic>Trichomonas vaginalis</italic> is the cause of the most common non-viral sexually transmitted disease worldwide. Evolutionary analyses place <italic>Trichomonas</italic> in a super group called the Excavata, which includes the kinetoplastids and is highly divergent from fungi, metazoa and plants. Despite the vast evolutionary distances that separate these different eukaryotic lineages, a simplified view of eukaryotic evolution based on the complexity of nucleotide modifications at the 5′ end of mRNAs and the distribution of different types of enzymatic apparatus that confer these modifications has been proposed. Our analyses of the <italic>T. vaginalis</italic> capping enzyme challenge this view and provide the first example in a protozoan of a two-component capping apparatus typically found in metazoa and plants. The 5′-end nucleotide structure of <italic>T. vaginalis</italic> mRNAs is also shown to contain additional modified nucleotides, similar to that observed for metazoan and plant mRNAs and unlike that found in most eukaryotic microbes and fungi. Evolutionary analyses of the <italic>T. vaginalis</italic> capping enzyme indicate that this multicellular type capping apparatus may have come into existence earlier than previously thought.</p>
</abstract><funding-group><funding-statement>This work was supported by the National Institutes of Health R01 grant AI30537. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</funding-statement></funding-group><counts><page-count count="13"/></counts></article-meta>
</front>
<body><sec id="s1">
<title>Introduction</title>
<p>The 5′ cap is a unique feature of eukaryotic messenger RNAs (mRNA) and eukaryotic viruses not found on eubacterial and archaeal RNAs <xref ref-type="bibr" rid="ppat.1000999-Shuman1">[1]</xref>. The addition of an m<sup>7</sup>G cap structure, or the cap 0 nucleotide, occurs co-transcriptionally via three consecutive reactions executed by the capping enzymatic apparatus: (i) hydrolysis of the 5′-triphosphate of nascent pre-mRNAs to a diphosphate by RNA 5′ triphosphatase (TPase), (ii) capping of the diphosphate end with GMP by the RNA guanylyltransferase (GTase) and (iii) methylation of the GpppN cap by RNA guanine-7-methyltransferase (MTase). The RNA cap is involved in multiple cellular functions including splicing, nucleocytoplasmic export, mRNA turnover and translation initiation <xref ref-type="bibr" rid="ppat.1000999-Cougot1">[2]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Varani1">[3]</xref>.</p>
<p>In addition to the N7-methylated guanosine cap, the 1st and 2nd adjacent nucleotides may be 2′O-ribose methylated after transcription <xref ref-type="bibr" rid="ppat.1000999-Reddy1">[4]</xref> forming cap 1 and cap 2 structures, respectively. In contrast to the cap 0, the role of cap 1 and cap 2 modified nucleotides is unclear. Their presence may reveal a phylogenetic trend with increasing levels of complexity in multicellular eukaryotes <xref ref-type="bibr" rid="ppat.1000999-Reddy1">[4]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Banerjee1">[5]</xref>. mRNAs in multiple species of yeast have been shown to contain only a cap 0 whereas the adjacent 5′ nucleotides of most multicellular eukaryotic mRNAs are further modified to form cap 1 and/or cap 2 structures <xref ref-type="bibr" rid="ppat.1000999-Reddy1">[4]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Furuichi1">[6]</xref>. Thus far trypanosomes are the only protists to have their mRNA cap structures examined and these were found to contain a hypermodified cap 4 structure m<sup>7</sup>Gpppm<sub>2</sub><sup>6</sup>AmpAmpCmpm<sup>3</sup>Um <xref ref-type="bibr" rid="ppat.1000999-Bangs1">[7]</xref>. The unconventional method of providing mRNAs with cap modification via <italic>trans</italic>-splicing of a splice leader RNA at their 5′ ends confers this unique cap structure to trypanosome mRNAs <xref ref-type="bibr" rid="ppat.1000999-Mair1">[8]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Tschudi1">[9]</xref>. Recently, enzymes involved in cap 4 formation have been identified <xref ref-type="bibr" rid="ppat.1000999-Arhin1">[10]</xref>–<xref ref-type="bibr" rid="ppat.1000999-Mittra1">[12]</xref>, although the role of these different cap nucleotide modifications remains to be elucidated <xref ref-type="bibr" rid="ppat.1000999-Arhin2">[11]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Zamudio1">[13]</xref>.</p>
<p>The N7-methylated guanosine cap (cap 0) is a structural consensus in all eukaryotic mRNAs, unlike the different levels of modification of subsequent nucleotides observed in different organisms. The type and structural organization of the guanosine cap enzymatic apparatus that confers the cap 0 to the 5′ end of mRNAs has been examined in a wide variety of eukaryotes <xref ref-type="bibr" rid="ppat.1000999-Hausmann1">[14]</xref>–<xref ref-type="bibr" rid="ppat.1000999-Sawaya1">[18]</xref>, including the divergent protists <italic>Giardia</italic> <xref ref-type="bibr" rid="ppat.1000999-Hausmann2">[19]</xref> and trypanosomes <xref ref-type="bibr" rid="ppat.1000999-Silva1">[20]</xref>–<xref ref-type="bibr" rid="ppat.1000999-Takagi1">[22]</xref>. An interesting evolutionary scenario has emerged <xref ref-type="bibr" rid="ppat.1000999-Shuman1">[1]</xref>, which contradicts <xref ref-type="bibr" rid="ppat.1000999-Stiller1">[23]</xref> current models derived from multiple eukaryotic phylogenetic analyses <xref ref-type="bibr" rid="ppat.1000999-Hampl1">[24]</xref>. This scenario dictates a multicellular (e.g. metazoa and plants) versus a microbial eukaryotic pattern (e.g. fungi and microbial eukaryotes). The majority of investigated microbial eukaryotic species possess a three-component capping system (TPase, GTase and MTase) while metazoans and plants encode a two-component system with the fusion of TPase and GTase polypeptides and a separate MTase <xref ref-type="bibr" rid="ppat.1000999-Shuman1">[1]</xref>.</p>
<p>In contrast to structural conservation of GTases, the TPases appear to have evolved from different protein ancestors <xref ref-type="bibr" rid="ppat.1000999-Shuman1">[1]</xref>. TPases found in multicellular eukaryotes contain the cysteinyl-phosphatase superfamily motif HCXXXXXR(S/T) named phosphate-binding loop or ‘p-loop’ (TPasePL). This enzyme catalyses a two-step phosphoryl transfer in which the conserved cysteine attacks the γ-phosphorus of the 5′-triphosphate on the nascent RNA to form a covalent protein-cysteinyl-S-phosphate intermediate producing a 5′-diphosphate RNA product <xref ref-type="bibr" rid="ppat.1000999-Changela1">[17]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Takagi2">[25]</xref>. The enzyme-bound phosphate is then hydrolyzed and liberated as inorganic phosphate. In contrast, all investigated microbial eukaryotic TPases to date share structural organization with metal-dependent phosphohydrolases (TPaseMDP) and have different structural configurations and enzymatic characteristics <xref ref-type="bibr" rid="ppat.1000999-Lima1">[26]</xref>. A current model of the mRNA capping system evolution in eukaryotes suggests that a ‘transitional state organism’ would contain both types of TPases <xref ref-type="bibr" rid="ppat.1000999-Shuman1">[1]</xref>. The fusion of the TPasePL-GTase was followed by a secondary loss or complete divergence of the TPaseMDP, and the last common ancestor (LCA) of plants and metazoans would carry the fused TPasePL-GTase version only. 
Although this model is supported by the distribution of these enzymes in several microbial eukaryotes <xref ref-type="bibr" rid="ppat.1000999-Shuman1">[1]</xref>, and is consistent with eukaryote phylogenies based on few genes supporting a metazoan-plant relationship <xref ref-type="bibr" rid="ppat.1000999-Stiller1">[23]</xref>, it is incongruent with phylogenomic and multiple single gene phylogenies <xref ref-type="bibr" rid="ppat.1000999-Hampl1">[24]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Minge1">[27]</xref>. However this mRNA capping centric view of eukaryote phylogeny does not preclude the occurrence of differential gene loss/gain during eukaryotic evolution.</p>
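<p>The P-loop signature described above, HCXXXXXR(S/T), can be written as a simple pattern for scanning protein sequences. The following is a minimal sketch and not part of the analyses reported here: the function name <monospace>find_p_loops</monospace> and the toy sequence are hypothetical, and a match only flags a candidate motif; it does not establish TPase activity.</p>
<preformat>
```python
import re

# Cysteinyl-phosphatase P-loop motif from the text: H-C-X5-R-(S/T),
# where X is any residue (one-letter amino acid codes).
P_LOOP = re.compile(r"HC[A-Z]{5}R[ST]")


def find_p_loops(protein):
    """Return (start, motif) pairs for each non-overlapping match.

    Start positions are 1-based, as is conventional for residue numbering.
    """
    hits = []
    for match in P_LOOP.finditer(protein.upper()):
        hits.append((match.start() + 1, match.group()))
    return hits


# Toy sequence invented for illustration; it is NOT the TvCE sequence.
toy = "MKVHCTAGVGRSGT"
print(find_p_loops(toy))
```
</preformat>
<p>In this sketch, replacing the cysteine of the toy sequence with serine removes the match, which is the sequence-level counterpart of a cysteine-to-serine substitution in the P-loop.</p>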
<p>Here, we have characterized the guanosine cap enzymatic apparatus of <italic>Trichomonas vaginalis</italic>, a divergent microbial eukaryote that is a member of the Parabasalia and the super-group Excavata <xref ref-type="bibr" rid="ppat.1000999-Hampl1">[24]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Adl1">[28]</xref>. In contrast to microbial eukaryotes, including other members of the Excavata, the diplomonad <italic>G. lamblia</italic> <xref ref-type="bibr" rid="ppat.1000999-Hausmann2">[19]</xref> and kinetoplastids <xref ref-type="bibr" rid="ppat.1000999-Silva1">[20]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Takagi1">[22]</xref>, <italic>T. vaginalis</italic> has a dual TPasePL-GTase <underline>c</underline>apping <underline>e</underline>nzyme (TvCE) resembling the metazoan-plant type, as previously predicted from genome analyses <xref ref-type="bibr" rid="ppat.1000999-Carlton1">[29]</xref>. Moreover, we have demonstrated that <italic>T. vaginalis</italic> mRNAs contain a complex cap structure with a canonical m<sup>7</sup>G and adjacent modified nucleotides. Phylogenetic analyses of the GTase domain only and TPasePL-GTase alignments are consistent with a common origin of the <italic>T. vaginalis</italic> and metazoan-plant enzymes, suggesting that the TPasePL-GTase system is likely to be more ancient than previously thought and that complex scenarios of independent gene loss and/or gain events across various eukaryotic lineages may have taken place.</p>
</sec><sec id="s2">
<title>Results</title>
<p>We have demonstrated previously that <italic>T. vaginalis</italic> mRNAs have a 5′-end protection that can be removed by pyrophosphatase treatment, and that these mRNAs partially precipitate with a monoclonal anti-trimethylguanosine antibody <xref ref-type="bibr" rid="ppat.1000999-SimoesBarbosa1">[30]</xref>. Therefore, like other eukaryotes, this divergent protist must harbor an enzymatic capping apparatus. The <italic>T. vaginalis</italic> genome database (<ext-link ext-link-type="uri" xlink:href="http://www.trichdb.org" xlink:type="simple">www.trichdb.org</ext-link>) <xref ref-type="bibr" rid="ppat.1000999-Aurrecoechea1">[31]</xref> was screened by BLASTp using available homologs of capping enzymes found in protists, yeast, plants and metazoans. As we previously reported based on analyses of the <italic>T. vaginalis</italic> genome <xref ref-type="bibr" rid="ppat.1000999-Carlton1">[29]</xref>, the same putative <italic>T. vaginalis</italic> capping enzyme gene (locus tag TVAG_187730, RefSeq accession XP_001327945.1, named here TvCE) is identified in searches conducted using the GTase enzyme found in many microbial eukaryotes and those using the fused TPasePL-GTase from plants and metazoans. No genes were identified in searches using TPaseMDP sequences. Both TPasePL-GTase functional domains in TvCE were conserved relative to the human capping enzyme in protein domain analyses, as were those from other metazoans, plants, green algae and a choanoflagellate, the latter being a member of the Choanomonada: microbial eukaryotes closely related to metazoans <xref ref-type="bibr" rid="ppat.1000999-Hampl1">[24]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Adl1">[28]</xref> (<xref ref-type="fig" rid="ppat-1000999-g001">Figure 1</xref> &amp; <xref ref-type="supplementary-material" rid="ppat.1000999.s002">Figure S1</xref>).
TvCE shares 30% identity and 47% similarity with the human capping enzyme and also has its TPasePL domain fused to the N-terminus of a GTase domain (<xref ref-type="supplementary-material" rid="ppat.1000999.s002">Figure S1</xref>). Strikingly, the TvCE cysteinyl-phosphatase superfamily motif HCXXXXXR(S/T) is 100% identical to the human sequence (<xref ref-type="fig" rid="ppat-1000999-g001">Figure 1A</xref>). Furthermore, TvCE contained the six conserved peptide motifs (I, III, IIIa, IV, V and VI) at the C-terminus that comprise the active site for GTP binding and nucleotidyl transfer of GTase <xref ref-type="bibr" rid="ppat.1000999-Wang1">[32]</xref> (<xref ref-type="fig" rid="ppat-1000999-g001">Figure 1B</xref>). Individual BLASTp searches with the entire TvCE protein, the N-terminal domain encompassing the TPasePL domain (residues 1–258) or the two C-terminal domains (GTase, residues 259–441 and 442–561) identified by protein domain analyses all recovered animal sequences as top hits (see <xref ref-type="supplementary-material" rid="ppat.1000999.s001">Data S1</xref>). These data indicate that <italic>T. vaginalis</italic> may have a two-component capping system similar to that of metazoans and plants.</p>
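<p>The query regions used for the individual BLASTp searches can be expressed as 1-based, inclusive residue ranges. The sketch below is illustrative only and assumes a protein held as a plain string of one-letter codes: the domain labels, the helper names <monospace>slice_domain</monospace> and <monospace>domains_are_contiguous</monospace>, and the placeholder sequence are hypothetical, and residue 561 is taken simply as the last residue mentioned in the text, not as the confirmed protein length.</p>
<preformat>
```python
# Residue ranges reported in the text (1-based, inclusive).
# The labels are hypothetical names chosen for this sketch.
LAST_DOMAIN_RESIDUE = 561
DOMAINS = {
    "TPasePL": (1, 258),
    "GTase_part1": (259, 441),
    "GTase_part2": (442, 561),
}


def slice_domain(protein, start, end):
    """Return residues start..end (1-based, inclusive) of a protein string."""
    if not (start >= 1 and end >= start and len(protein) >= end):
        raise ValueError("coordinates fall outside the sequence")
    return protein[start - 1:end]


def domains_are_contiguous(domains, last_residue):
    """Check that the ranges run from 1 to last_residue without gaps or overlaps."""
    expected_start = 1
    for start, end in sorted(domains.values()):
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == last_residue + 1


# Placeholder sequence of suitable length; it is NOT the real TvCE sequence.
placeholder = "A" * LAST_DOMAIN_RESIDUE
queries = {
    name: slice_domain(placeholder, start, end)
    for name, (start, end) in DOMAINS.items()
}
print({name: len(seq) for name, seq in queries.items()})
print(domains_are_contiguous(DOMAINS, LAST_DOMAIN_RESIDUE))
```
</preformat>
<p>Each resulting subsequence could then be submitted as a separate query; running the search itself is outside the scope of this sketch.</p>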
<fig id="ppat-1000999-g001" position="float"><object-id pub-id-type="doi">10.1371/journal.ppat.1000999.g001</object-id><label>Figure 1</label><caption>
<title>Sequence alignments of capping enzymes guanylyltransferase and triphosphatase.</title>
<p><italic>Homo sapiens</italic> (Hsa), <italic>Saccharomyces cerevisiae</italic> (Sce), <italic>Mus musculus</italic> (Mus), <italic>Drosophila melanogaster</italic> (Dme), <italic>Oryza sativa</italic> (Osa), <italic>Caenorhabditis elegans</italic> (Cel), <italic>Plasmodium falciparum</italic> (Pfa), <italic>Encephalitozoon cuniculi</italic> (Ecu), <italic>Crithidia fasciculata</italic> (Cfa), <italic>Trypanosoma brucei gambiense</italic> (Tbr), <italic>Giardia lamblia</italic> (Gla) and <italic>T. vaginalis</italic> (Tva) capping enzymes are shown. <bold>A.</bold> Sequence alignment of TvCE TPase reveals conservation with metazoa/plant sequences and presence of the ‘P-loop’ active site. Identity and similarity are indicated in black and gray boxes, respectively. H125S and C126S substitutions in TvCE are indicated by arrowheads. <bold>B.</bold> GTase signature motifs conserved in all examined enzymes are indicated (I, III, IIIa, IV, V and VI). Numbers indicate the number of amino acid residues separating the motifs in different proteins. Essential amino acids for yeast (Sce) GTase are denoted by asterisks. The R526A substitution in TvCE is indicated by an arrowhead.</p>
</caption><graphic mimetype="image" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.g001" xlink:type="simple"/></fig>
<p>To examine the enzymatic activity of recombinant TvCE we attempted to express either the full-length protein or each domain separately. We found that the single GTase domain was insoluble and that the TPase, although soluble, was inactive. Therefore, the full-length recombinant TvCE was used to analyse enzyme activity. The capping enzyme TPase specifically hydrolyses the γ-phosphate from the 5′-terminus of RNAs, allowing its activity to be monitored by either the production of inorganic phosphate or ATP-ADP conversion. The TPase activity of recombinant TvCE was tested in the absence of metals at varying pH and the protein was found to be active at pH values ranging from 4.5 to 6.5 (<xref ref-type="fig" rid="ppat-1000999-g002">Figure 2A</xref>). Addition of 1–5 mM MgCl<sub>2</sub> inhibited TvCE TPase activity and addition of EDTA could reverse this inhibition (<xref ref-type="fig" rid="ppat-1000999-g002">Figure 2B</xref>). Activity in an acidic pH range and inhibition by MgCl<sub>2</sub> are typical of a cysteinyl-phosphatase TPase and are not observed in classic microbial eukaryotic TPases <xref ref-type="bibr" rid="ppat.1000999-Shuman1">[1]</xref>. In mammalian cysteinyl-phosphatase TPase, a transient phosphocysteine-enzyme intermediate can be trapped using a short incubation time at low temperature and acidic pH <xref ref-type="bibr" rid="ppat.1000999-Changela1">[17]</xref>. We asked whether TvCE also exhibits this property and found that the phospho-TvCE intermediate was detected in the presence of [γ-<sup>32</sup>P]ATP but, as predicted, not in the presence of [α-<sup>32</sup>P]ATP, and only within a restricted acidic pH range (3.5–4.0) (<xref ref-type="fig" rid="ppat-1000999-g002">Figure 2C</xref>). Phospholabeling of TvCE was lost after treatment with iodine but not hydroxylamine, supporting the presence of a thiophosphate linkage predicted to be at C126 within the ‘P-loop’ (<xref ref-type="fig" rid="ppat-1000999-g002">Figure 2D, left panel</xref>).
To determine whether C126 is involved in this linkage, TvCE was subjected to specific amino acid mutations and tested for the ability to form a thiophosphate linkage. As shown in <xref ref-type="fig" rid="ppat-1000999-g002">Figure 2D</xref> (right panel), serine substitution of either C126 or H125 abrogates formation of phospho-TvCE. When enzymatic activity was compared across substrate concentrations, we observed that the P-loop mutants (H125S and C126S) and the double mutant (C126S and R526A) were unable to release γ-phosphate from ATP (<xref ref-type="fig" rid="ppat-1000999-g003">Figure 3A</xref>), which is in agreement with the inability of these mutants to form a phosphocysteine-enzyme intermediate (<xref ref-type="fig" rid="ppat-1000999-g002">Figure 2D</xref>). However, a mutation in the GTase domain alone (R526A) did not significantly affect TPase activity (<xref ref-type="fig" rid="ppat-1000999-g003">Figure 3A</xref>). The utilization of ATP and GTP by TvCE as a function of nucleotide concentration is similar (<xref ref-type="fig" rid="ppat-1000999-g003">Figure 3B</xref>). A relatively low conversion of the substrate is observed, which may indicate that TvCE either has a low turnover rate or is partially inactivated during the 2-step purification. Together, these data indicate that C126 and H125 coordinate the cleavage and release of the γ-phosphate and demonstrate that TvCE has characteristics of a typical cysteinyl-phosphatase TPase <xref ref-type="bibr" rid="ppat.1000999-Shuman1">[1]</xref>.</p>
<fig id="ppat-1000999-g002" position="float"><object-id pub-id-type="doi">10.1371/journal.ppat.1000999.g002</object-id><label>Figure 2</label><caption>
<title>Triphosphatase activity of TvCE.</title>
<p><bold>A.</bold> Activity at various pHs. <bold>B.</bold> Effect of MgCl<sub>2</sub> and EDTA on activity. Products were resolved on TLC plates and detected by autoradiography. Arrows denote the origin of loading, substrate ([γ-<sup>32</sup>P]ATP) and product (<sup>32</sup>Pi). <bold>C.</bold> Detection of the TvCE-cysteinyl-S-phosphate formation. Labeling with [γ-<sup>32</sup>P]ATP (left panel) and [α-<sup>32</sup>P]ATP (right panel) compared from pH 3.0–5.0. <sup>32</sup>P-labeled TvCE detected by SDS-PAGE and autoradiography and TvCE protein loading controls flanked by pre-stained molecular weight markers (PageRuler prestained protein ladder, Fermentas) are shown in top and bottom panels, respectively. <bold>D.</bold> Effect of iodine and NH<sub>2</sub>OH on thiophosphate linkage formation (left panels) and TvCE-cysteinyl-S-phosphate formation of TvCE wild-type (WT) and mutant (H125S; C126S; R526A; C126S &amp; R526A) proteins (right panels). <sup>32</sup>P-labeled TvCE detected by SDS-PAGE and autoradiography (top panels) and TvCE protein loading controls (bottom panels) are shown. The purified protein is observed as a single band of the predicted ∼71 kDa molecular mass in both C and D.</p>
</caption><graphic mimetype="image" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.g002" xlink:type="simple"/></fig><fig id="ppat-1000999-g003" position="float"><object-id pub-id-type="doi">10.1371/journal.ppat.1000999.g003</object-id><label>Figure 3</label><caption>
<title>Triphosphatase activity of recombinant TvCE.</title>
<p><bold>A.</bold> Enzymatic activity of wild-type (WT) (filled circle) and mutants [R526A (filled square), H125S (filled triangle), C126S (open triangle), C126S/R526A (open square)] was compared across a range of [γ-<sup>32</sup>P]ATP concentrations. <bold>B.</bold> Enzymatic activity of wild-type TvCE was measured across a range of [γ-<sup>32</sup>P]ATP (filled circle) and [γ-<sup>32</sup>P]GTP (filled square) concentrations for comparison of relative substrate affinity. Products were resolved on TLC plates and quantified by liquid scintillation counting.</p>
</caption><graphic mimetype="image" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.g003" xlink:type="simple"/></fig>
<p>The capping enzyme GTase uses a ‘ping-pong’ reaction mechanism for nucleotidyl transfer through a covalent enzyme-(lysyl-<italic>N</italic>)-GMP intermediate <xref ref-type="bibr" rid="ppat.1000999-Shuman2">[33]</xref>. This allows GTase activity to be detected by <sup>32</sup>P transfer from [α-<sup>32</sup>P]GTP to the enzyme <xref ref-type="bibr" rid="ppat.1000999-Hausmann1">[14]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Ho1">[16]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Sawaya1">[18]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Schwer1">[34]</xref>. To determine whether TvCE also uses this reaction mechanism, we incubated recombinant TvCE with [α-<sup>32</sup>P]GTP in the presence or absence of divalent cations. The formation of the SDS-stable <sup>32</sup>P-labeled enzyme was then evaluated by SDS-PAGE (<xref ref-type="fig" rid="ppat-1000999-g004">Figure 4</xref>). TvCE GTase activity was detected by [α-<sup>32</sup>P]GTP labeling in a broad pH window (data not shown) and required the presence of either MnCl<sub>2</sub> or MgCl<sub>2</sub>. Calcium did not support activity (<xref ref-type="fig" rid="ppat-1000999-g004">Figure 4A</xref>). Metal dependence, specifically Mn<sup>2+</sup> and Mg<sup>2+</sup>, is a typical characteristic of an RNA capping GTase.</p>
<fig id="ppat-1000999-g004" position="float"><object-id pub-id-type="doi">10.1371/journal.ppat.1000999.g004</object-id><label>Figure 4</label><caption>
<title>Guanylyltransferase activity of recombinant TvCE.</title>
<p><bold>A.</bold> Effect of divalent cations Mg<sup>2+</sup> (filled circle), Mn<sup>2+</sup> (open square) and Ca<sup>2+</sup> (filled triangle). <bold>B.</bold> Effect of competition with cold nucleotides ATP (open square), CTP (filled triangle), UTP (filled square) and GTP (filled circle). <bold>C.</bold> Enzymatic activity of wild-type (WT) (filled circle) and mutants [R526A (open square), H125S (filled triangle), C126S (filled square), C126S/R526A (open square)] was compared across a range of [α-<sup>32</sup>P]GTP concentrations. Products were resolved on SDS-PAGE and quantified by liquid scintillation counting.</p>
</caption><graphic mimetype="image" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.g004" xlink:type="simple"/></fig>
<p>To determine the specificity of TvCE GTase, competition reactions were performed using increasing concentrations of cold NTPs. The recombinant TvCE GTase displayed an absolute specificity for a GTP substrate since ATP, CTP or UTP were not found to inhibit this enzyme (<xref ref-type="fig" rid="ppat-1000999-g004">Figure 4B</xref>). Guanylation of TvCE is dependent on nucleotide concentration (<xref ref-type="fig" rid="ppat-1000999-g004">Figure 4C</xref>). In accordance with previously reported structure/function studies <xref ref-type="bibr" rid="ppat.1000999-Sawaya1">[18]</xref>, the single arginine substitution at GTase motif VI (R526A) abolished TvCE activity (<xref ref-type="fig" rid="ppat-1000999-g004">Figure 4C</xref>). Interestingly, we also found that mutations in the TPase domain (H125S and C126S) significantly affected GTase activity, suggesting that these residues may exert a <italic>cis</italic>-structural effect on the GTase domain (<xref ref-type="fig" rid="ppat-1000999-g004">Figure 4C</xref>). Using size exclusion chromatography, we found both TPase and GTase activities in a single discrete peak corresponding to ∼71 kDa, indicating that TvCE is monomeric (<xref ref-type="supplementary-material" rid="ppat.1000999.s003">Figure S2</xref>).</p>
<p>In addition to using nucleotide substrates to assess the activity of TvCE we have also evaluated its ability to transfer GMP from GTP to the 5′ triphosphate end of an <italic>in vitro</italic> transcribed RNA, using the full-length enzyme with both active domains. As shown in <xref ref-type="fig" rid="ppat-1000999-g005">Figure 5</xref>, TvCE is capable of labeling an RNA substrate provided this molecule harbors a 5′ triphosphate (lane 2). When the substrate is dephosphorylated by alkaline phosphatase digestion prior to incubation with TvCE, no transfer of [α-<sup>32</sup>P]GTP to the substrate is achieved (lane 3). In order for the GTase to transfer [α-<sup>32</sup>P]GMP from [α-<sup>32</sup>P]GTP to the 5′-end of the RNA substrate, the γ-phosphate must be first removed by the TPase (lane 4). When TPase or GTase activity is dependent on a TvCE mutant that is inactive in the corresponding domain, no labeling of the substrate is observed (lanes 5 &amp; 8). On the other hand, when TPase or GTase activity is dependent on a TvCE mutant that inactivates the opposite domain, labeling of the substrate is achieved (lanes 6 &amp; 7). These data are consistent with those characterizing individual domains using nucleotide substrates and TvCE mutants (<xref ref-type="fig" rid="ppat-1000999-g003">Figure 3A</xref> and <xref ref-type="fig" rid="ppat-1000999-g004">4C</xref>) and demonstrate that capping of an RNA substrate strictly depends on the activity of both domains (<xref ref-type="fig" rid="ppat-1000999-g005">Figure 5</xref>).</p>
<fig id="ppat-1000999-g005" position="float"><object-id pub-id-type="doi">10.1371/journal.ppat.1000999.g005</object-id><label>Figure 5</label><caption>
<title>RNA capping activity of recombinant TvCE.</title>
<p>Capping activity was detected in a two-step reaction, monitoring both TPase and GTase activities, using [α-<sup>32</sup>P]GTP and a 352 nt RNA substrate. Wild-type TvCE (WT), a TPase-inactive mutant (C126S) and a GTase-inactive mutant (R526S) were tested in the two-step reaction using various combinations as indicated in the black filled squares (top panel). A negative control reaction using the RNA substrate dephosphorylated by alkaline phosphatase prior to capping was included as indicated (AP). The final product was split equally and loaded on two polyacrylamide gels that were subsequently analyzed by autoradiography (middle panel) or ethidium bromide staining (bottom panel). Arrows denote molecular weights of radiolabeled RNA markers (Century Marker Plus, Ambion) and the RNA substrate.</p>
</caption><graphic mimetype="image" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.g005" xlink:type="simple"/></fig>
<p>The observed similarities between metazoa and <italic>T. vaginalis</italic> RNA capping enzymatic activities prompted us to investigate the mRNA cap structure in this organism. The nucleotides adjacent to the m<sup>7</sup>G-cap structure are methylated to different extents in eukaryotes. Metazoans can have 2′-<italic>O</italic>-ribose methylations of the first and second transcribed nucleotides forming the cap structure m<sup>7</sup>GpppNmpNmpNp, where the first transcribed nucleotide is an adenosine. However, yeast is found to have no modifications beyond the 5′ m<sup>7</sup>G cap <xref ref-type="bibr" rid="ppat.1000999-Reddy1">[4]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Furuichi1">[6]</xref>. We developed a multi-step protocol to purify mRNA suitable for structural analysis of nucleotides (see <xref ref-type="sec" rid="s4">Materials and Methods</xref>). None of the steps alone, including consecutive passages over poly-dT chromatography columns, was sufficient to remove abundant RNA species such as tRNA and rRNA (<xref ref-type="supplementary-material" rid="ppat.1000999.s004">Figure S3</xref>). As these contaminants contain hypermodified nucleotides in relatively high abundance that interfered with our analysis, we found it necessary to also immunoprecipitate RNAs with anti-TMG (which cross-reacts with the 7-methyl guanosine cap of mRNAs) to remove uncapped RNA contaminants (<xref ref-type="supplementary-material" rid="ppat.1000999.s004">Figure S3</xref>). As a result, <italic>in vivo</italic> <sup>32</sup>P-labeled mRNA obtained by this purification protocol was shown to be free of contaminating rRNA and tRNA by gel electrophoresis (data not shown) and structural analysis of nucleotides after P1 digestion (<xref ref-type="fig" rid="ppat-1000999-g006">Figure 6A</xref> and <xref ref-type="supplementary-material" rid="ppat.1000999.s004">Figure S3</xref>). 
The abundance of adenosine and uridine, relative to guanosine and cytosine, in this heterogeneous mRNA population (<xref ref-type="fig" rid="ppat-1000999-g006">Figure 6A</xref>) is consistent with the reported ∼65% AT content of <italic>T. vaginalis</italic> genes <xref ref-type="bibr" rid="ppat.1000999-Carlton1">[29]</xref>. Digestion of the purified mRNA sample with Tobacco Acid Pyrophosphatase (TAP), which specifically hydrolyzes the phosphoric acid anhydride bonds in the triphosphate bridge of a 5′-end cap structure, released m<sup>7</sup>GMP (<xref ref-type="fig" rid="ppat-1000999-g006">Figure 6A</xref>). The identity of this modified nucleotide was confirmed by demonstrating that it can be converted to m<sup>2,2,7</sup>GMP using <italic>Schizosaccharomyces pombe</italic> trimethylguanosine synthase or SpTgs (<xref ref-type="fig" rid="ppat-1000999-g006">Figure 6A</xref>) <xref ref-type="bibr" rid="ppat.1000999-Hausmann3">[35]</xref>, <xref ref-type="bibr" rid="ppat.1000999-SimoesBarbosa2">[36]</xref>.</p>
<fig id="ppat-1000999-g006" position="float"><object-id pub-id-type="doi">10.1371/journal.ppat.1000999.g006</object-id><label>Figure 6</label><caption>
<title>Detection of cap 0 and cap 1 modified nucleotides on <italic>T. vaginalis</italic> mRNA by 2D-TLC analysis.</title>
<p><bold>A.</bold> Detection of cap 0 modified nucleotide using <italic>in vivo</italic> <sup>32</sup>P-labeled mRNA. The predicted migration positions of unmodified ribonucleotides (A, C, G and U) are shown in all panels for reference, and the left panel also denotes the migration of modified ribonucleotide standards (dotted ovals). <bold>P1</bold> = nuclease P1 digestion; <bold>TAP</bold> = TAP treatment; <bold>SpTgs+TAP</bold> = SpTgs incubation prior to TAP treatment. <bold>B.</bold> Detection of cap 1-modified nucleotide. Unlabeled <italic>T. vaginalis</italic> mRNA, either TAP treated (<bold>TAP</bold>) or mock-treated (<bold>TAP-mock</bold>), was subjected to 5′-end labeling, nuclease P1 digestion and resolution on 2D-TLC.</p>
</caption><graphic mimetype="image" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.g006" xlink:type="simple"/></fig>
<p>Next, a sample of unlabeled mRNA was used for analysis of the first transcribed nucleotide (position +1). RNAs were subjected to TAP or a TAP-mock treatment, alkaline phosphatase digestion and then 5′-end labeled using T4 polynucleotide kinase. These were then digested completely by nuclease P1, and nucleotides were resolved on 2D-TLC plates. The appearance of distinct spots present only in the TAP treated sample should reveal whether the first transcribed mRNA nucleotide (position +1), which is protected by an m<sup>7</sup>G cap, is modified (<xref ref-type="fig" rid="ppat-1000999-g006">Figure 6B</xref>). Our results demonstrated that <italic>T. vaginalis</italic> mRNAs have a typical metazoan 2′-<italic>O</italic>-ribose methylated cap 1 nucleotide. The cap 1 nucleotide is either an adenosine (80%) or a cytosine (20%) based on a comparison of the Am/Cm ratio (<xref ref-type="fig" rid="ppat-1000999-g006">Figure 6B</xref>). This is in agreement with ∼75% of <italic>T. vaginalis</italic> protein-coding genes being preceded by the conserved initiator element (Inr) TC<underline>A</underline>T/<sub>C</sub>T/<sub>A</sub> that dictates transcription initiation at the underlined adenosine <xref ref-type="bibr" rid="ppat.1000999-Carlton1">[29]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Liston1">[37]</xref>. The four unmodified nucleotides (A, C, G and U) observed in both samples likely result from partially degraded RNAs that would not require TAP treatment to be 5′-end labeled or from the presence of intact RNA contaminants that lack a cap.</p>
<p>There is a strong bias for a uridine at positions +2 and +3 in most <italic>T. vaginalis</italic> mRNAs <xref ref-type="bibr" rid="ppat.1000999-Liston1">[37]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Liston2">[38]</xref>. To determine whether the uridine +2 is 2′-<italic>O</italic>-ribose methylated forming a cap 2 structure, <italic>in vivo</italic> labeled mRNA was digested with RNase T2 prior to anti-TMG precipitation, the last step of the mRNA purification protocol (<xref ref-type="fig" rid="ppat-1000999-g007">Figure 7A</xref>). 2′-<italic>O</italic>-ribose methylation of this nucleotide would render the adjacent 3′-5′ phosphodiester linkage resistant to RNase T2, as this enzyme cleaves RNA via 2′-3′ phosphate cyclization. These samples were then immunoprecipitated with anti-TMG agarose beads and 3′-5′ phosphodiester linkages were cleaved by on-bead RNase P1 treatment, to restrict analysis to the cap 2 modified nucleotide, if present, and its adjacent cap 1 nucleotide. Released nucleotides were then analyzed by 2D-TLC. If nucleotide +2 is not 2′-<italic>O</italic>-ribose methylated, only the cap 1 unmodified nucleotide, mostly uridine, would be released, compared to the release of both 2′-<italic>O</italic>-ribose methylated nucleotides (position +2) and the unmodified nucleotide (position +3) if a cap 2 modified nucleotide is present (<xref ref-type="fig" rid="ppat-1000999-g007">Figure 7A</xref>). No modified nucleotides were detected, indicating the absence of a cap 2 structure in <italic>T. vaginalis</italic> mRNAs (<xref ref-type="fig" rid="ppat-1000999-g007">Figure 7B</xref>). As predicted, the most abundant unmodified nucleotide detected was uridine. Although these data cannot exclude the presence of a cap 2 structure in a small subset of mRNAs, if present these are highly underrepresented.</p>
<fig id="ppat-1000999-g007" position="float"><object-id pub-id-type="doi">10.1371/journal.ppat.1000999.g007</object-id><label>Figure 7</label><caption>
<title>Evaluation of a cap 2 modified nucleotide in <italic>T. vaginalis</italic> mRNA by 2D-TLC analysis.</title>
<p><bold>A.</bold> Scheme of experimental approach. For simplicity, uridine is shown at position +2 and +3 as this is the most common nucleotide at these positions <xref ref-type="bibr" rid="ppat.1000999-Adl1">[28]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Carlton1">[29]</xref>. The presence of cap 1 modified nucleotides Am or Cm (<xref ref-type="fig" rid="ppat-1000999-g006">Figure 6B</xref>) predicts the release of 3′-phosphate mononucleotides (Np), and 2 of 4 possible cap structures depending on the absence or presence of a cap 2 modified uridine upon RNase T2 digestion. The presence of only a cap 1 structure (top) predicts release of m<sup>7</sup>GpppAmpUp or m<sup>7</sup>GpppCmpUp; if a cap 2 is present (bottom) m<sup>7</sup>GpppAmpUmpUp or m<sup>7</sup>GpppCmpUmpUp is predicted. Subsequent anti-TMG immunoprecipitation and elution by RNase P1 digestion predicts release of only unmodified ribonucleotides or both unmodified and 2′-<italic>O</italic>-ribose modified ribonucleotides in the absence or presence of a cap 2 ribonucleotide, respectively. <bold>B.</bold> Detection of unmodified ribonucleotides at +2 of <italic>T. vaginalis</italic> mRNAs using the scheme shown in <bold>A.</bold> and 2D-TLC (right). Migration standards are shown in the left panel.</p>
</caption><graphic mimetype="image" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.g007" xlink:type="simple"/></fig>
<p>The unexpected finding of a metazoan/plant-like capping apparatus and a cap 1 modified nucleotide on <italic>T. vaginalis</italic> mRNAs led us to investigate the phylogeny of TvCE. A global taxa sampling of animal, fungal, plant, microbial and iridovirus GTases as well as a subset dataset including exclusively the TPasePL-GTase structural organization were aligned and subjected to protein maximum likelihood phylogenetic analyses (<xref ref-type="fig" rid="ppat-1000999-g008">Figure 8</xref> &amp; <xref ref-type="fig" rid="ppat-1000999-g009">Figure 9</xref>). Although generally poorly resolved, the global GTase phylogeny recovered the <italic>T. vaginalis</italic> sequence with modest support value (64% bootstrap proportion, BP, increasing to 70% and 77% when one or two of the most divergent sequences were removed, respectively) in a clan of exclusively TPasePL-GTase configured sequences, consistent with a common origin of their structural organization (<xref ref-type="fig" rid="ppat-1000999-g008">Figure 8</xref>). The TPasePL-GTase maximum likelihood tree revealed that TvCE does not cluster with proteins from animals (<xref ref-type="fig" rid="ppat-1000999-g009">Figure 9</xref>) and hence, at face value, does not support lateral gene transfer (LGT) between <italic>T. vaginalis</italic> or a Parabasalid ancestor and their animal hosts. These data are consistent with an ‘Unikonts’/‘Bikonts’ split <xref ref-type="bibr" rid="ppat.1000999-Roger1">[39]</xref> with, on one hand, plants and green algae forming a clan with <italic>T. vaginalis</italic> supported with 58% BP, rising to 77% in the absence of the iridovirus sequence (<xref ref-type="fig" rid="ppat-1000999-g009">Figure 9</xref>), and, on the other, the metazoans and choanoflagellates forming a second clan, in line with phylogenomic data <xref ref-type="bibr" rid="ppat.1000999-Hampl1">[24]</xref>. 
As the position of the eukaryotic root is currently unknown <xref ref-type="bibr" rid="ppat.1000999-Roger1">[39]</xref> it is not clear how the iridovirus sequence relates to the eukaryotic sequences; its divergent sequence makes inference of its position in both analyses tentative. Consistent with potential long-branch attraction (LBA) issues for this sequence, its removal in the TPasePL-GTase alignment improved the BP of several nodes including the one for the <italic>T. vaginalis</italic>-plant/green algae clan and the metazoan clan (<xref ref-type="fig" rid="ppat-1000999-g009">Figure 9</xref>). Recoding the 20 amino acids into four categories, to allow optimization of the rate matrix and reduce composition heterogeneity between sequences and mitigate potential LBA artefacts, recovered a similar tree with reduced BP for the split between the TvCE-plant/algae clan and the metazoan clan in maximum likelihood analyses. Together our data favor a tree topology consistent with phylogenomics data <xref ref-type="bibr" rid="ppat.1000999-Hampl1">[24]</xref> and cluster TvCE with plant and green algae sequences; however, LGT from a metazoan donor to a parabasalid cannot be strictly rejected due to lack of strong signal.</p>
<fig id="ppat-1000999-g008" position="float"><object-id pub-id-type="doi">10.1371/journal.ppat.1000999.g008</object-id><label>Figure 8</label><caption>
<title>Phylogenetic analysis of TvCE.</title>
<p>Protein maximum likelihood phylogeny recovered for 37 GTases, comprising GTases that function with TPaseMDP and TPasePL-GTase configured enzymes, selected to maximize taxonomic diversity. A clan of all TPasePL-GTases, which includes TvCE, is boxed and shaded. Numbers indicate the bootstrap proportion (BP) support values (&gt;50% are shown). Boxed values list BP for alternative analyses where the indicated sequences were removed; V&amp;G indicates removal of both the viral and <italic>Giardia</italic> sequences. The tree was rooted on <italic>Giardia lamblia</italic>, the most divergent sequence (the branch is broken to fit and its true length is indicated). Removal of either one or the two most divergent sequences (<italic>Giardia</italic> and the iridovirus) did not affect the overall tree topology (see <xref ref-type="supplementary-material" rid="ppat.1000999.s001">Data S1</xref>; <xref ref-type="supplementary-material" rid="ppat.1000999.s005">Figure S4</xref>). Higher taxonomic terms <xref ref-type="bibr" rid="ppat.1000999-Hampl1">[24]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Adl1">[28]</xref> are indicated. Species names, abbreviated with the first three letters of the genus and species, are all listed in full in <xref ref-type="supplementary-material" rid="ppat.1000999.s006">Table S1</xref> with corresponding accession numbers. The scale bar indicates the inferred number of substitutions per site.</p>
</caption><graphic mimetype="image" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.g008" xlink:type="simple"/></fig><fig id="ppat-1000999-g009" position="float"><object-id pub-id-type="doi">10.1371/journal.ppat.1000999.g009</object-id><label>Figure 9</label><caption>
<title>Phylogenetic analyses to maximize taxonomic diversity.</title>
<p>Protein maximum likelihood phylogeny recovered for 23 TPasePL-GTases selected to maximize taxonomic diversity. The tree is rooted on the viral sequence and numbers indicate BP values &gt;50%. Boxed values indicate BP for corresponding analyses conducted without the iridovirus sequence (see <xref ref-type="supplementary-material" rid="ppat.1000999.s001">Data S1</xref>, <xref ref-type="supplementary-material" rid="ppat.1000999.s005">Figure S4</xref>). The scale bar indicates inferred number of substitutions per site.</p>
</caption><graphic mimetype="image" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.g009" xlink:type="simple"/></fig></sec><sec id="s3">
<title>Discussion</title>
<p>This is the first report of the functional characterization of a microbial eukaryote harboring a typical metazoan/plant-like capping apparatus containing a fused dual-functional TPasePL-GTase (TvCE). In addition to this unusual capping apparatus in <italic>T. vaginalis</italic>, mRNAs from this unicellular eukaryote were found to contain a cap 1 structure, another feature of metazoan mRNAs. The GTase domain of TvCE selectively binds GMP. The transfer of GMP to the 5′ triphosphate end of the RNA is strictly dependent on the prior removal of the γ-phosphate by the TPase activity of TvCE. The TPase domain of TvCE has typical features of a metazoan cysteinyl-phosphatase enzyme; it is active in the absence of metals as a bifunctional monomeric enzyme, prefers an acidic pH, and forms a phospho-enzyme through a cysteine-thiophosphate linkage within a conserved ‘P-loop’ active site which is necessary for phosphatase activity. An apparent <italic>cis</italic>-structural effect in TvCE was observed between the two active domains as a single amino acid change in the GTase domain results in a detectable reduction of TPase activity. We also observed that purification of the individual TvCE TPase domain resulted in an inactive enzyme. Together, these data suggest that the GTase domain can affect the activity of its neighboring TPase domain.</p>
<p>Analyses of the cap structure of mRNAs in this divergent microbial eukaryote revealed the presence of a canonical m<sup>7</sup>G cap 0 nucleotide, consistent with the presence of two conserved RNA (guanine-7) MTase genes in the <italic>T. vaginalis</italic> genome (<ext-link ext-link-type="uri" xlink:href="http://trichdb.org" xlink:type="simple">http://trichdb.org</ext-link>) that could convert the guanosine added to the 5′ end of the mRNA by TvCE to an m<sup>7</sup>G. We previously showed that <italic>T. vaginalis</italic> has an atypical trimethylguanosine synthase (TgS) that produces m<sup>2,7</sup>G from m<sup>7</sup>G RNA substrates <xref ref-type="bibr" rid="ppat.1000999-SimoesBarbosa2">[36]</xref>. These observations led us to speculate that <italic>T. vaginalis</italic> mRNAs might contain m<sup>2,7</sup>G caps. Contrary to this prediction, the analyses presented here demonstrate the presence of a canonical m<sup>7</sup>G cap nucleotide on <italic>T. vaginalis</italic> mRNAs.</p>
<p>A general comparison of cap structures among different eukaryotes suggested that complexity increases following a phylogenetic trend across the evolution of eukaryotes <xref ref-type="bibr" rid="ppat.1000999-Banerjee1">[5]</xref>. This is illustrated by the presence of a complex cap 2 structure on mRNAs of metazoans but only cap 0 on budding yeast mRNAs. Prior to the studies reported here, the unusual cap 4 in trypanosomatids <xref ref-type="bibr" rid="ppat.1000999-Bangs1">[7]</xref>–<xref ref-type="bibr" rid="ppat.1000999-Arhin2">[11]</xref> appeared to be the exception to this rule. <italic>T. vaginalis</italic> mRNAs, which contain a conserved cap 1 structure composed primarily of m<sup>7</sup>GpppAmpUp or m<sup>7</sup>GpppCmpUp, are now also an exception. No cap 2 structure was detected in <italic>T. vaginalis</italic> mRNA and a strong bias to uridines at position +2 was observed, making this cap structure less complex than that of metazoans. It should also be noted, however, that the use of a heterogeneous population of mRNA may have obscured the detection of a cap 2 nucleotide in a smaller subpopulation of mRNAs. Similarly, although we consider it unlikely, we cannot strictly preclude that immunoprecipitation of mRNAs using the anti-TMG antibody led to the exclusion of a subset of mRNAs with a cap 2 nucleotide or an alternative cap structure that are not efficiently bound by this antibody.</p>
<p>The observation that ∼80% of cap 1 nucleotides in <italic>T. vaginalis</italic> mRNAs are adenosines (A) while ∼20% are cytosines (C) indicates that transcription initiation by RNA polymerase II in this organism can occur at cytosine, as well as adenosine. In turn this suggests the presence of either unknown variants of the initiator (Inr) motif that surrounds the start site of transcription of <italic>T. vaginalis</italic> mRNAs <xref ref-type="bibr" rid="ppat.1000999-Liston1">[37]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Liston2">[38]</xref> or unrelated motifs that can direct transcription to initiate at a cytosine. This is consistent with our previous prediction that only ∼75% of <italic>T. vaginalis</italic> genes appear to use a classic Inr leading to transcription initiation at an A <xref ref-type="bibr" rid="ppat.1000999-Carlton1">[29]</xref>.</p>
<p>The structural organization of the mRNA capping enzyme machinery in eukaryotes has been considered a marker for inference of eukaryote phylogeny <xref ref-type="bibr" rid="ppat.1000999-Shuman1">[1]</xref>. Fungi and most sampled microbial eukaryotes have separate TPaseMDP and GTase capping enzymes, with the TPaseMDP structurally and mechanistically distinct from the TPasePL that is fused to a GTase in metazoans and plants. Thus, the acquisition of the metazoan-type dual-function enzyme after the divergence of unicellular and multicellular eukaryotes has been proposed <xref ref-type="bibr" rid="ppat.1000999-Shuman1">[1]</xref>. However, recent genome samplings have broadened the taxonomic diversity of TPasePL-GTase encoding taxa, which now include green algae, a choanoflagellate and <italic>T. vaginalis</italic>. These additional data and our structural, functional and phylogenetic analyses of TvCE complicate the earlier simple dichotomy observed between metazoans/plants and microbial eukaryotes. Indeed, the proponent of this dichotomy acknowledged that “The scheme is certainly oversimplified…” due to poor taxa sampling <xref ref-type="bibr" rid="ppat.1000999-Shuman1">[1]</xref>.</p>
<p>Differential gene losses could explain the unusual presence of a metazoan/plant-like capping apparatus in <italic>T. vaginalis</italic>, green algae and choanoflagellates. The LCA of microbial eukaryotes, plants and animals may have contained the fused TPasePL-GTase and independent gene losses subsequently occurred in most currently sampled microbial eukaryotes, leaving <italic>T. vaginalis</italic>, green algae and choanoflagellates as rare microbial eukaryotes carrying this prototype. Notably the green algae and choanoflagellate TPasePL-GTase sequences were recovered in the expected clans as defined by phylogenomics <xref ref-type="bibr" rid="ppat.1000999-Hampl1">[24]</xref>, pushing the acquisition of this configuration among a likely microbial ancestor deeper in eukaryotic evolution. Alternatively, the LCA of plants/green algae and animals/choanoflagellates had the split system seen today in many microbial eukaryotes and subsequently lost this and acquired a new fused version. Similar but independent events would then be invoked for the acquisition of TvCE, leading to an enzyme with little sequence relatedness to its counterpart in plants and animals. However, the global GTase phylogeny suggests that the GTase from the TPasePL-GTase fusion was shared by the LCA between <italic>T. vaginalis</italic>, plants/algae and choanoflagellate/animals, as expected if the gene was present in the LCA of all eukaryotes. Alternatively, the GTase for the taxa with TPasePL-GTase configurations was independently acquired by different lineages from similar sources with the same structural organization and/or <italic>T. vaginalis</italic> acquired TvCE by LGT. Existing phylogenies do not provide positive evidence for these scenarios. 
As Parabasalia and Diplomonads appear to be closely related within the excavates <xref ref-type="bibr" rid="ppat.1000999-Hampl1">[24]</xref> an independent origin scenario of the TPasePL-GTase fusion could have been suggested if the TvCE GTase formed a clan with the <italic>G. lamblia</italic> GTase, however this was not found in our maximum likelihood phylogenetic analyses (<xref ref-type="fig" rid="ppat-1000999-g008">Figure 8</xref>, <xref ref-type="supplementary-material" rid="ppat.1000999.s005">Figure S4</xref>). These data clearly reinforce the importance of further genome sampling among various microbial eukaryotes and viruses before evolutionary hypotheses based on the mRNA capping system can be appropriately assessed and contrasted to the current hypothesis of eukaryote phylogeny <xref ref-type="bibr" rid="ppat.1000999-Roger1">[39]</xref>. Phylogenomic data obtained so far consistently provide evidence for at least six major eukaryotic lineages <xref ref-type="bibr" rid="ppat.1000999-Hampl1">[24]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Minge1">[27]</xref> that do not match those suggested by the distribution of the mRNA capping enzymes <xref ref-type="bibr" rid="ppat.1000999-Shuman1">[1]</xref>. Our phylogenetic analyses do not support LGT as the source of TvCE in <italic>T. vaginalis</italic> reducing the conflict between the distribution of the mRNA capping machinery and the six major lineages recovered by phylogenomics <xref ref-type="bibr" rid="ppat.1000999-Roger1">[39]</xref>. In contrast, relationships among microbial eukaryote GTases functioning with TPaseMDP seem more complex and in conflict with phylogenomic data in terms of major group relationships. For instance non-monophyly is observed for microsporidial GTases, which do not cluster with the Fungi, and the red algae cluster with the Amoebozoa. 
Could these be explained by LGT events as described for the acquisition of a trifunctional capping enzyme in a mimivirus, thought to be derived from its amoeba host <xref ref-type="bibr" rid="ppat.1000999-Benarroch1">[40]</xref>? Likewise the iridovirus TPasePL-GTase may have been acquired from an animal host. The evolution of capping enzymes in eukaryotes appears to have proceeded via multiple events that led to the independent loss and/or gain of genes in different lineages. Polarization of such events will require denser sampling of mRNA capping genes and additional robust independent phylogenetic analyses.</p>
</sec><sec id="s4" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec id="s4a">
<title>Expression and purification of recombinant TvCE</title>
<p>The ORF encoding TvCE (TVAG_187730) was cloned into <italic>Escherichia coli</italic> expression vector pET 200D- (Invitrogen), which adds a histidine tag at the N-terminus. TvCE was cloned either as a full-length protein or as two separate domains. The full-length protein has a predicted molecular weight of 69 kDa (not including the histidine tag) and a pI of 6.9. A MUSCLE alignment of the protein predicted the TPase domain to reside between amino acids 1 and 254 and the GTase domain between amino acids 255 and 561. pET-200D constructs containing either the full-length or the TPase or GTase domain were transformed into <italic>E. coli</italic> strain BL21 as recommended by the manufacturer (Invitrogen). Bacterial cultures (250 ml) were grown to OD<sub>600</sub> 0.4–0.5 and 3% (v/v) ethanol and 0.2 mM IPTG were added to induce protein expression. Incubation was continued for 18–20 h at 18°C, shaking at 180 rpm. Induced cells were centrifuged and resuspended in 50 mM NaPO<sub>4</sub> pH 6.0, 1 mg/ml of lysozyme and 1× Halt protease inhibitor cocktail (ThermoScientific). Lysis was achieved by sonication on ice and cell debris was removed by spinning samples at 16,000× g for 30 min. The presence of soluble expressed recombinant protein was evaluated by SDS-PAGE before loading it on a pre-equilibrated 5 ml Mono-S column (GE Healthcare). The column was washed with 10 volumes of loading buffer minus lysozyme. Proteins were eluted with a 5 ml step gradient of NaCl (100, 200, 300, 400 and 500 mM) in the same buffer. Eluted fractions shown to contain rTvCE by SDS-PAGE analysis were pooled together and imidazole was added to 40 mM. The proteins were further fractionated using Ni chromatography (HisTrap; GE Healthcare), as recommended by the manufacturer. After the two-step purification, the final purified rTvCE (50–150 µg of protein) was dialyzed against 50 mM Tris pH 7.4, 100 mM NaCl, 2 mM DTT and 10% glycerol. 
PCR mutagenesis was carried out as described <xref ref-type="bibr" rid="ppat.1000999-Zheng1">[41]</xref> and proteins were isolated using the 2-step purification scheme described above.</p>
</sec><sec id="s4b">
<title>TPase activity assays</title>
<p>To determine the optimal pH of TvCE, 20 µl reactions containing 25 nM of full-length recombinant TvCE (flr-TvCE), 5 mM DTT, 160 nM [γ-<sup>32</sup>P]ATP were adjusted to 50 mM Tris-acetate (pH 7.0 and below) or 50 mM Tris-HCl (pH 7.5 and above). To evaluate metal dependence of TvCE TPase, 20 µl reactions containing 50 mM Tris-acetate pH 5.5, 350 nM of flr-TvCE, 5 mM DTT and 15 nM [γ-<sup>32</sup>P]ATP were performed in the presence of 0, 1, 2.5 and 5 mM MgCl<sub>2</sub>. These reactions were assayed in the presence of either 0 or 20 mM EDTA for comparison. TvCE TPase nucleotide dependence was measured in 20 µl reactions containing 50 mM Tris-acetate pH 5.5, 5 mM DTT, a range of 2.5–160 nM [γ-<sup>32</sup>P]ATP or [γ-<sup>32</sup>P]GTP and 25 nM of flr-TvCE. For detection of phosphatase activity of TvCE, reactions were incubated at 37°C for 30 min, and <sup>32</sup>Pi was detected on TLC plates after autoradiography as described <xref ref-type="bibr" rid="ppat.1000999-Changela1">[17]</xref>. Products were sliced from the TLC plastic plates for quantification by liquid scintillation. A mock reaction (minus enzyme) was done in parallel to account for spontaneous radiolysis of the substrate. TvCE TPase was tested for the formation of a covalent protein-cysteinyl-S-phosphate intermediate <xref ref-type="bibr" rid="ppat.1000999-Changela1">[17]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Takagi2">[25]</xref>. To test the hypothesis that such an intermediate can be formed under acidic pH, a 10 µl reaction containing 50 mM Tris-acetate pH 3.0 to 5.0, 5 mM DTT, 160 nM [γ-<sup>32</sup>P]ATP and 25 nM of flr-TvCE was performed at 25°C for 15 seconds. One µl of this reaction was loaded on SDS-PAGE for Coomassie Blue staining and autoradiography. 
This analysis was carried out for flr-TvCE mutants, except that molar concentration of [γ-<sup>32</sup>P]ATP was increased to 330 nM, unincorporated nucleotides were removed by G-50 microcolumns (Amersham) and all sample contents were analyzed by SDS-PAGE. To confirm that TvCE is phosphor-labeled through formation of a thiophosphate linkage, chemical stability analysis was performed as previously described <xref ref-type="bibr" rid="ppat.1000999-Changela1">[17]</xref>. For this analysis, we compared thiophosphate linkage stability when phosphor-TvCE was treated with H<sub>2</sub>O, 100 mM NH<sub>2</sub>OH or 10 mM iodine.</p>
</sec><sec id="s4c">
<title>GTase activity assays</title>
<p>To evaluate metal specificity of TvCE GTase, 10 µl reactions containing 50 mM Tris-acetate pH 7.0, 5 mM DTT, 20 nM [α-<sup>32</sup>P]GTP and 40 nM of full-length recombinant TvCE (flr-TvCE) were performed, varying the concentrations of either MgCl<sub>2</sub>, MnCl<sub>2</sub> or CaCl<sub>2</sub> from 0–10 mM. To evaluate substrate specificity of TvCE GTase, a cold competition experiment was designed. 10 µl reactions containing 50 mM Tris-acetate pH 7.0, 5 mM DTT, 2.5 mM MgCl<sub>2</sub>, 100 nM [α-<sup>32</sup>P]GTP and 40 nM of recombinant TvCE were performed in the presence of 0–1 µM cold nucleotide competitor ATP, CTP, GTP or UTP. TvCE GTase dependence on nucleotide concentration was measured in 20 µl reactions containing 50 mM Tris-acetate pH 7.0, 5 mM DTT, a range of 0–16 nM [α-<sup>32</sup>P]GTP and 3 nM flr-TvCE. All reactions were incubated at 37°C for 30 min, and GTase activity was detected by the formation of the covalent enzyme-GMP intermediate <xref ref-type="bibr" rid="ppat.1000999-Sawaya1">[18]</xref>. The reaction product was detected by SDS-PAGE and autoradiography. The phosphor-labeled enzyme was sliced from gels and quantified by liquid scintillation.</p>
</sec><sec id="s4d">
<title>Measuring TvCE activity using an RNA substrate</title>
<p>A fragment containing 352 bp of <italic>T. vaginalis</italic> β-tubulin was <italic>in vitro</italic> transcribed by T7 RNA polymerase and quantified as previously described <xref ref-type="bibr" rid="ppat.1000999-SimoesBarbosa2">[36]</xref>, and used as a substrate for the full-length recombinant TvCE in a two-step reaction. As a negative control, one RNA sample was dephosphorylated by alkaline phosphatase treatment (Apex, Epicentre) and purified by phenol/chloroform extraction and ethanol precipitation prior to the TvCE RNA capping assay. The first step of the reaction, the removal of the γ-phosphate, was tested by incubating 250 ng of the RNA in a 50 µl reaction containing 50 mM Tris-acetate pH 5.5, 5 mM DTT, and 200 nM recombinant TvCE at 37°C for 30 min. The RNA was then purified by phenol/chloroform extraction and ethanol precipitation. Next, the RNA was resuspended in a 50 µl reaction containing 50 mM Tris-acetate pH 7.0, 5 mM DTT, 2.5 mM MgCl<sub>2</sub>, 90 nM [α-<sup>32</sup>P]GTP and 200 nM recombinant TvCE and incubated at 37°C for 30 min. The RNA was then purified as described above, split into equal parts and analyzed in 7% polyacrylamide gels under denaturing conditions (Tris-Borate EDTA buffer and 8 M urea). One sample was stained with ethidium bromide and the other was dried and exposed to X-ray film.</p>
</sec><sec id="s4e">
<title>Cap structure analysis</title>
<p><italic>T. vaginalis</italic> strain T1, grown in TYM complete media <xref ref-type="bibr" rid="ppat.1000999-Clark1">[42]</xref>, was subjected to <italic>in vivo</italic> labeling of RNAs. To achieve 12–18% total <sup>32</sup>P incorporation, 1–5×10<sup>8</sup> parasites were starved in phosphate-free, serum-free DMEM containing 1 mCi of phosphorus-32 radionuclide for 8–9 hours at 37°C. Cultures were mixed by inversion every hour during incubation. RNA was extracted and size-fractionated from <sup>32</sup>P-labeled and unlabeled <italic>T. vaginalis</italic> cultures using the <italic>mir</italic>Vana PARIS Kit (Ambion). To determine the nucleotide structure of the cap of <italic>T. vaginalis</italic> mRNAs, a protocol to obtain a population of mRNAs free of the hypermodified ribosomal and transfer RNAs was developed. The high relative abundance of hypermodified nucleotides present in these RNA species interfered with detection of modified nucleotides found specifically on mRNAs, even in the presence of only minor contamination by rRNA or tRNA. First, the large-size RNA fraction (&gt;200 nt) was isolated from cells to minimize contamination with small rRNA and tRNAs. The RNA was then passed through two consecutive rounds of poly dT purification (Promega). The eluted RNA was then concentrated by ethanol precipitation, followed by Terminator Exonuclease (Epicentre) digestion to degrade RNAs that contain a 5′-monophosphate (e.g., rRNAs). The large-size RNAs were again isolated, and purified from partially degraded rRNAs and nucleotides. As a final step, the mRNAs were purified by immunoprecipitation using the mouse monoclonal anti-TMG (anti-trimethylguanosine agarose conjugate, CalBiochem) as described <xref ref-type="bibr" rid="ppat.1000999-SimoesBarbosa1">[30]</xref>, taking advantage of the cross-reactivity of this antibody with the 7-methyl guanosine cap of mRNAs. 
The purity of the mRNA preparation was assessed by electrophoresis and two-dimensional thin layer chromatography (2D-TLC) analysis of <italic>in vivo</italic> labeled mRNAs after autoradiography. 2D-TLC was carried out using both combinations of organic solvents A, B and C, as previously described <xref ref-type="bibr" rid="ppat.1000999-SimoesBarbosa2">[36]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Grosjean1">[43]</xref>. To evaluate the presence of a 5′-end guanosine cap linked by a triphosphate bridge to the RNA, <italic>in vivo</italic> labeled mRNAs (∼10<sup>5</sup> cpm) were digested with TAP (Epicentre) and analyzed by 2D-TLC. To confirm the identity of a possible m<sup>7</sup>G cap structure in this highly purified fraction of <italic>T. vaginalis</italic> mRNA, the sample was treated with <italic>S. pombe</italic> Tgs prior to TAP treatment. This enzyme converts m<sup>7</sup>G to m<sup>2,2,7</sup>G, which can be resolved by 2D-TLC <xref ref-type="bibr" rid="ppat.1000999-SimoesBarbosa2">[36]</xref>. To identify a possible nucleotide modification at position +1, mRNA was 5′-end labeled. The mRNA was digested or mock-digested with TAP, dephosphorylated by Alkaline Phosphatase (APex, Epicentre) and heat-inactivated, and labeled by PNK with [γ-<sup>32</sup>P]ATP. Between enzymatic treatments, RNA was purified by phenol/chloroform extraction and ethanol precipitation. TAP-digested or mock-digested samples were comparatively analyzed by 2D-TLC. Spots were quantified by liquid scintillation. To identify a possible nucleotide modification at position +2, we utilized <italic>in vivo</italic> labeled mRNAs. The protocol necessary to obtain mRNA free of detectable hypermodified nucleotides from ribosomal and transfer RNA, as described above, was followed except that before anti-TMG immunoprecipitation the RNA sample was concentrated by ethanol precipitation and digested with RNase T2 in a 20 µl reaction volume. 
Six µl of this reaction was then anti-TMG immunoprecipitated in a 0.6 ml end volume. After washes, the RNase T2-digested m<sup>7</sup>G capped mRNAs bound to the anti-TMG agarose beads were mildly digested with nuclease P1 in a 25 µl reaction volume for 2 h at 37°C under agitation. The anti-TMG beads were then recentrifuged, the supernatant was recovered and P1-digestion continued at 50°C for 16 h to completion. Five µl of this reaction was loaded on 2D-TLC plates for analysis. For 2D-TLC comparative maps, radiolabeled m<sup>7</sup>G, m<sup>2,7</sup>G and m<sup>2,2,7</sup>G standards were produced as previously reported <xref ref-type="bibr" rid="ppat.1000999-SimoesBarbosa2">[36]</xref>. 2′-<italic>O</italic>-ribose methylated nucleotide standards were produced by P1-digestion of an AmCmGmUm oligomer (Sigma-Proligo), and all were compared to previously reported maps <xref ref-type="bibr" rid="ppat.1000999-SimoesBarbosa2">[36]</xref>, <xref ref-type="bibr" rid="ppat.1000999-Grosjean1">[43]</xref>.</p>
</sec><sec id="s4f">
<title>General bioinformatics and phylogenetics</title>
<p>Sequences were extracted from protein databases following BLASTp searches at NCBI using default settings, except for the sequence of the red alga <italic>Cyanidioschyzon merolae</italic> GTase, which was obtained from the KEGG database. Selected sequences aligned for comparison and phylogenetic inferences are described in <xref ref-type="supplementary-material" rid="ppat.1000999.s001">Data S1</xref>. Protein structural organizations were investigated with SMART, searching both SM and Pfam profiles <xref ref-type="bibr" rid="ppat.1000999-Letunic1">[44]</xref>. For phylogenetics, reference protein sequences were chosen from BLAST hit lists to maximize taxa diversity, and sequences were aligned with ClustalW <xref ref-type="bibr" rid="ppat.1000999-Larkin1">[45]</xref>. SEAVIEW 4.0 <xref ref-type="bibr" rid="ppat.1000999-Galtier1">[46]</xref> was used to visually check the alignment features, and sites used for phylogenetic analyses were selected with the mask option. Sites with more than five indels were deleted, as were divergent sites. The best-fitting model for each protein alignment was identified with Prottest 2.2 <xref ref-type="bibr" rid="ppat.1000999-Abascal1">[47]</xref>, and was invariably the LG rate matrix <xref ref-type="bibr" rid="ppat.1000999-Le1">[48]</xref> with a gamma rate model “G” for across-site rate variation. PhyML <xref ref-type="bibr" rid="ppat.1000999-Guindon1">[49]</xref> was used within SEAVIEW to perform maximum likelihood phylogenetic inferences. The alpha shape parameter of the gamma rate model (four categories) was estimated using the BioNJ distance tree as the starting tree, in conjunction with both NNI and TPR branch swapping moves for further optimization. The optimal alpha shape parameter for site rate variation was then fixed and used for bootstrap analyses (100 replicates). 
In order to mitigate potential issues of composition bias and long branch attraction, the TPasePL-GTase protein alignment was recoded as described <xref ref-type="bibr" rid="ppat.1000999-Hampl1">[24]</xref>, with the 20 amino acids reduced to four categories implied by the JTT rate matrix ([A,N,G,T,P,S], [R,D,E,Q,K], [I,L,M,F,V] and [H,W,Y,C]). Following removal of invariant sites in the recoded alignment (410 sites were reduced to 316 sites), PhyML maximum likelihood analyses (GTR rate matrix with gamma model, both estimated) were performed and recovered trees similar to those based on the protein alignments. All alignments are available upon request.</p>
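<p>The recoding and invariant-site removal described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used for the analyses: the names GROUPS, recode and drop_invariant_sites are hypothetical, isoleucine is assumed to belong to the third (hydrophobic) category, gaps and unknown residues are kept as "-", and a site is treated as invariant when all of its non-gap states are identical.</p>
<preformat><![CDATA[
```python
# Four amino acid categories; each residue is replaced by the index of its category.
# Assumption: isoleucine (I) sits in the third category with L, M, F and V.
GROUPS = ("ANGTPS", "RDEQK", "ILMFV", "HWYC")
RECODE = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}


def recode(seq):
    """Map each residue to its category digit; gaps and unknown symbols become '-'."""
    return "".join(RECODE.get(aa, "-") for aa in seq.upper())


def drop_invariant_sites(aligned):
    """Keep only the columns that show at least two different non-gap states."""
    kept = [col for col in zip(*aligned) if len(set(col) - {"-"}) > 1]
    if not kept:
        return ["" for _ in aligned]
    return ["".join(chars) for chars in zip(*kept)]


# Toy alignment of three sequences (illustrative only, not data from this study).
toy = ["ARIH", "DRLW", "GKVA"]
recoded = [recode(s) for s in toy]
reduced = drop_invariant_sites(recoded)
print(recoded)
print(reduced)
```
]]></preformat>
<p>In this toy example the two middle columns collapse to a single category after recoding and are therefore dropped, which mirrors how recoding reduces the number of informative sites in a real alignment.</p>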
</sec></sec><sec id="s5">
<title>Supporting Information</title>
<supplementary-material id="ppat.1000999.s001" mimetype="application/msword" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.s001" xlink:type="simple"><label>Data S1</label><caption>
<p>Bioinformatic data.</p>
<p>(0.03 MB DOC)</p>
</caption></supplementary-material><supplementary-material id="ppat.1000999.s002" mimetype="image/tiff" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.s002" xlink:type="simple"><label>Figure S1</label><caption>
<p>Structural organization of TvCE. <bold>A.</bold> SMART (<ext-link ext-link-type="uri" xlink:href="http://smart.embl-heidelberg.de/" xlink:type="simple">http://smart.embl-heidelberg.de/</ext-link>) analyses identified the same three domains in TvCE found in other TPasePL-GTase proteins. The corresponding structural organization for the human sequence is shown for comparison. In the human sequence the pink polygon corresponds to a segment of low compositional complexity and vertical bars correspond to intron positions. Values below the bars indicate intron phase and values above the bars indicate the corresponding amino acid positions. Overlapping domains identified by SMART are not indicated. <bold>B.</bold> Position and e-values for the three identified domains (TPase: PTP_DSPc; GTase: mRNA cap enzyme; C: cap C domain) in the parasite and human sequences with the most significant scores are tabulated for comparison.</p>
<p>(0.35 MB TIF)</p>
</caption></supplementary-material><supplementary-material id="ppat.1000999.s003" mimetype="image/tiff" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.s003" xlink:type="simple"><label>Figure S2</label><caption>
<p>Determination of the molecular weight of the active form of TvCE. The recombinant TvCE was subjected to size exclusion chromatography in a pre-calibrated 90–10 cm Superdex™ 200 column (GE). Molecular weight (MW) markers (Sigma): soybean trypsin inhibitor (20 kDa), chicken egg albumin (∼44 kDa), bovine serum albumin (66 kDa), alcohol dehydrogenase (∼150 kDa), apoferritin (443 kDa) and thyroglobulin (∼669 kDa). For MW curve standardization, excluded volume (Vo) was estimated by dextran blue, and elution volume (Ve) for each protein was collected and calculated. A 2 ml mix containing 0.5 mg of each protein, 50 mM Tris pH 6.0, 200 mM NaCl, 5 mM DTT and 10% glycerol was loaded into the column. Ninety-five fractions (2 ml each) were collected at 0.2 ml/min. <bold>A.</bold> MW curve and TvCE activity. The linear regression of the MW curve (dotted line) and its equation are shown along with GTase (filled circle) and TPase (open circle) activity curves. Each MW marker is indicated by size (filled square). The activity peak and the first and last fractions tested for TvCE activity are indicated by a circled number which corresponds to the fraction number (fractions 30–50). The y axis indicates the molecular weight values (left) and GTase and TPase activities (right), and the x axis indicates Ve/Vo. <bold>B.</bold> Western blot analysis of TvCE after size fractionation. Five microliters of fractions 2 to 90, as indicated, were analyzed by western blot with HisProbe™-HRP (Pierce). The peak fraction 42, and every other fraction from 30 (∼440 kDa) to 50 (∼50 kDa), denoted by circles, were examined for TvCE activity as described in <xref ref-type="sec" rid="s4">Methods</xref>.</p>
<p>(4.26 MB TIF)</p>
</caption></supplementary-material><supplementary-material id="ppat.1000999.s004" mimetype="image/tiff" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.s004" xlink:type="simple"><label>Figure S3</label><caption>
<p>2D-TLC resolution of P1-digested mRNA purified from <italic>T. vaginalis</italic>. Left panel, <italic>in vivo</italic>-labeled mRNA was prepared as described in <xref ref-type="sec" rid="s4">Methods</xref> except that three rounds of oligo(dT) chromatography were performed and the final step of anti-TMG precipitation was omitted. Right panel, mRNA was prepared exactly as described in <xref ref-type="sec" rid="s4">Methods</xref>. The final mRNA sample from each approach was then precipitated with ethanol and digested with nuclease P1 to completion. A total of 10,000 cpm of the digested RNA was resolved by 2D-TLC, as described in <xref ref-type="sec" rid="s4">Methods</xref>, with solvents A and B (43) as indicated. Cold unmodified nucleotides (A, C, G and U) were loaded on each plate as denoted. As observed, even with one additional round of purification by oligo(dT) chromatography but without anti-TMG immunoprecipitation (left panel), the complete digestion of this mRNA sample produces the four unmodified nucleotides plus six additional detectable spots, which seem to correspond to hypermodified nucleotides found in tRNAs when compared to 2D-TLC maps previously reported (43). However, when a final step of anti-TMG immunoprecipitation was included for the purification of mRNAs (right panel), the complete digestion of this sample resulted in the appearance of the four unmodified ribonucleotides only, the typical composition of mRNA molecules in a cell. Additionally, the relative percentage of A/T (∼70%) is in accordance with the A/T content of genes in this organism (29).</p>
<p>(1.17 MB TIF)</p>
</caption></supplementary-material><supplementary-material id="ppat.1000999.s005" mimetype="application/pdf" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.s005" xlink:type="simple"><label>Figure S4</label><caption>
<p>Alternative protein phylogenetic analyses of TvCE. <bold>A.</bold> Global phylogeny of GTases with the same alignment used to generate the phylogeny depicted in <xref ref-type="fig" rid="ppat-1000999-g008">Figure 8</xref> with the iridovirus sequence removed. The arrow indicates the branch leading to all TPasePL-GTase configured sequences. <bold>B.</bold> Global phylogeny of GTases with the same alignment used to generate the phylogeny depicted in <xref ref-type="fig" rid="ppat-1000999-g008">Figure 8</xref> with the <italic>Giardia</italic> sequence removed. The arrow indicates the branch leading to all TPasePL-GTase configured sequences. <bold>C.</bold> Same as in A with the <italic>Giardia</italic> sequence further removed. <bold>D.</bold> Phylogeny of TPasePL-GTase configured sequences with the same alignment used to generate the phylogeny depicted in <xref ref-type="fig" rid="ppat-1000999-g009">Figure 9</xref> with the iridovirus removed. In all trees the species names are abbreviated with the first three letters of the genus and species name (full names are listed in supplementary <xref ref-type="supplementary-material" rid="ppat.1000999.s006">Table S1</xref>) and the LG model with G was used. Values shown are bootstrap proportions (%, 100 replicates); only values &gt;50% are shown. The alpha shape parameter was optimized first and fixed for the bootstrap analyses, with NNI and TBR branch swapping for further optimizations. Scale bars represent the inferred number of changes per site.</p>
<p>(0.20 MB PDF)</p>
</caption></supplementary-material><supplementary-material id="ppat.1000999.s006" mimetype="application/msword" position="float" xlink:href="info:doi/10.1371/journal.ppat.1000999.s006" xlink:type="simple"><label>Table S1</label><caption>
<p>Full names of species used in phylogenetic analyses and accession numbers for all gene sequences used in the work described here.</p>
<p>(0.05 MB DOC)</p>
</caption></supplementary-material></sec></body>
<back>
<ack>
<p>We thank Maria Delgadillo-Correa for technical assistance, Mary Anne Rubio for advice, Christopher Ryan for critical comments on the manuscript, our colleagues in the lab for helpful discussions, and two anonymous referees for constructive comments.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="ppat.1000999-Shuman1"><label>1</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Shuman</surname><given-names>S</given-names></name>
</person-group>             <year>2002</year>             <article-title>What messenger RNA capping tells us about eukaryotic evolution.</article-title>             <source>Nat Rev Mol Cell Biol</source>             <volume>3</volume>             <fpage>619</fpage>             <lpage>625</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Cougot1"><label>2</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Cougot</surname><given-names>N</given-names></name>
<name name-style="western"><surname>van Dijk</surname><given-names>E</given-names></name>
<name name-style="western"><surname>Babajko</surname><given-names>S</given-names></name>
<name name-style="western"><surname>Seraphin</surname><given-names>B</given-names></name>
</person-group>             <year>2004</year>             <article-title>‘Cap-tabolism’.</article-title>             <source>Trends Biochem Sci</source>             <volume>29</volume>             <fpage>436</fpage>             <lpage>444</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Varani1"><label>3</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Varani</surname><given-names>G</given-names></name>
</person-group>             <year>1997</year>             <article-title>A cap for all occasions.</article-title>             <source>Structure</source>             <volume>5</volume>             <fpage>855</fpage>             <lpage>858</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Reddy1"><label>4</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Reddy</surname><given-names>R</given-names></name>
<name name-style="western"><surname>Singh</surname><given-names>R</given-names></name>
<name name-style="western"><surname>Shimba</surname><given-names>S</given-names></name>
</person-group>             <year>1992</year>             <article-title>Methylated cap structures in eukaryotic RNAs: structure, synthesis and functions.</article-title>             <source>Pharmacol Ther</source>             <volume>54</volume>             <fpage>249</fpage>             <lpage>267</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Banerjee1"><label>5</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Banerjee</surname><given-names>AK</given-names></name>
</person-group>             <year>1980</year>             <article-title>5′-terminal cap structure in eucaryotic messenger ribonucleic acids.</article-title>             <source>Microbiol Rev</source>             <volume>44</volume>             <fpage>175</fpage>             <lpage>205</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Furuichi1"><label>6</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Furuichi</surname><given-names>Y</given-names></name>
<name name-style="western"><surname>Shatkin</surname><given-names>AJ</given-names></name>
</person-group>             <year>1989</year>             <article-title>Characterization of cap structures.</article-title>             <source>Methods Enzymol</source>             <volume>180</volume>             <fpage>164</fpage>             <lpage>176</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Bangs1"><label>7</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Bangs</surname><given-names>JD</given-names></name>
<name name-style="western"><surname>Crain</surname><given-names>PF</given-names></name>
<name name-style="western"><surname>Hashizume</surname><given-names>T</given-names></name>
<name name-style="western"><surname>McCloskey</surname><given-names>JA</given-names></name>
<name name-style="western"><surname>Boothroyd</surname><given-names>JC</given-names></name>
</person-group>             <year>1992</year>             <article-title>Mass spectrometry of mRNA cap 4 from trypanosomatids reveals two novel nucleosides.</article-title>             <source>J Biol Chem</source>             <volume>267</volume>             <fpage>9805</fpage>             <lpage>9815</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Mair1"><label>8</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Mair</surname><given-names>G</given-names></name>
<name name-style="western"><surname>Ullu</surname><given-names>E</given-names></name>
<name name-style="western"><surname>Tschudi</surname><given-names>C</given-names></name>
</person-group>             <year>2000</year>             <article-title>Cotranscriptional cap 4 formation on the <italic>Trypanosoma brucei</italic> spliced leader RNA.</article-title>             <source>J Biol Chem</source>             <volume>275</volume>             <fpage>28994</fpage>             <lpage>28999</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Tschudi1"><label>9</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Tschudi</surname><given-names>C</given-names></name>
<name name-style="western"><surname>Ullu</surname><given-names>E</given-names></name>
</person-group>             <year>2002</year>             <article-title>Unconventional rules of small nuclear RNA transcription and cap modification in trypanosomatids.</article-title>             <source>Gene Expr</source>             <volume>10</volume>             <fpage>3</fpage>             <lpage>16</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Arhin1"><label>10</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Arhin</surname><given-names>GK</given-names></name>
<name name-style="western"><surname>Li</surname><given-names>H</given-names></name>
<name name-style="western"><surname>Ullu</surname><given-names>E</given-names></name>
<name name-style="western"><surname>Tschudi</surname><given-names>C</given-names></name>
</person-group>             <year>2006</year>             <article-title>A protein related to the vaccinia virus cap-specific methyltransferase VP39 is involved in cap 4 modification in <italic>Trypanosoma brucei</italic>.</article-title>             <source>RNA</source>             <volume>12</volume>             <fpage>53</fpage>             <lpage>62</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Arhin2"><label>11</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Arhin</surname><given-names>GK</given-names></name>
<name name-style="western"><surname>Ullu</surname><given-names>E</given-names></name>
<name name-style="western"><surname>Tschudi</surname><given-names>C</given-names></name>
</person-group>             <year>2006</year>             <article-title>2′-O-methylation of position 2 of the trypanosome spliced leader cap 4 is mediated by a 48 kDa protein related to vaccinia virus VP39.</article-title>             <source>Mol Biochem Parasitol</source>             <volume>147</volume>             <fpage>137</fpage>             <lpage>139</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Mittra1"><label>12</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Mittra</surname><given-names>B</given-names></name>
<name name-style="western"><surname>Zamudio</surname><given-names>JR</given-names></name>
<name name-style="western"><surname>Bujnicki</surname><given-names>JM</given-names></name>
<name name-style="western"><surname>Stepinski</surname><given-names>J</given-names></name>
<name name-style="western"><surname>Darzynkiewicz</surname><given-names>E</given-names></name>
<etal/></person-group>             <year>2008</year>             <article-title>The TbMTr1 spliced leader RNA cap 1 2′-O-ribose methyltransferase from <italic>Trypanosoma brucei</italic> acts with substrate specificity.</article-title>             <source>J Biol Chem</source>             <volume>283</volume>             <fpage>3161</fpage>             <lpage>3172</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Zamudio1"><label>13</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Zamudio</surname><given-names>JR</given-names></name>
<name name-style="western"><surname>Mittra</surname><given-names>B</given-names></name>
<name name-style="western"><surname>Campbell</surname><given-names>DA</given-names></name>
<name name-style="western"><surname>Sturm</surname><given-names>NR</given-names></name>
</person-group>             <year>2009</year>             <article-title>Hypermethylated cap 4 maximizes <italic>Trypanosoma brucei</italic> translation.</article-title>             <source>Mol Microbiol</source>             <volume>72</volume>             <fpage>1100</fpage>             <lpage>1110</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Hausmann1"><label>14</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Hausmann</surname><given-names>S</given-names></name>
<name name-style="western"><surname>Vivares</surname><given-names>CP</given-names></name>
<name name-style="western"><surname>Shuman</surname><given-names>S</given-names></name>
</person-group>             <year>2002</year>             <article-title>Characterization of the mRNA capping apparatus of the microsporidian parasite <italic>Encephalitozoon cuniculi</italic>.</article-title>             <source>J Biol Chem</source>             <volume>277</volume>             <fpage>96</fpage>             <lpage>103</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Pei1"><label>15</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Pei</surname><given-names>Y</given-names></name>
<name name-style="western"><surname>Schwer</surname><given-names>B</given-names></name>
<name name-style="western"><surname>Saiz</surname><given-names>J</given-names></name>
<name name-style="western"><surname>Fisher</surname><given-names>RP</given-names></name>
<name name-style="western"><surname>Shuman</surname><given-names>S</given-names></name>
</person-group>             <year>2001</year>             <article-title>RNA triphosphatase is essential in <italic>Schizosaccharomyces pombe</italic> and <italic>Candida albicans</italic>.</article-title>             <source>BMC Microbiol</source>             <volume>1</volume>             <fpage>29</fpage>          </element-citation></ref>
<ref id="ppat.1000999-Ho1"><label>16</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Ho</surname><given-names>CK</given-names></name>
<name name-style="western"><surname>Shuman</surname><given-names>S</given-names></name>
</person-group>             <year>2001</year>             <article-title>A yeast-like mRNA capping apparatus in <italic>Plasmodium falciparum</italic>.</article-title>             <source>Proc Natl Acad Sci U S A</source>             <volume>98</volume>             <fpage>3050</fpage>             <lpage>3055</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Changela1"><label>17</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Changela</surname><given-names>A</given-names></name>
<name name-style="western"><surname>Ho</surname><given-names>CK</given-names></name>
<name name-style="western"><surname>Martins</surname><given-names>A</given-names></name>
<name name-style="western"><surname>Shuman</surname><given-names>S</given-names></name>
<name name-style="western"><surname>Mondragon</surname><given-names>A</given-names></name>
</person-group>             <year>2001</year>             <article-title>Structure and mechanism of the RNA triphosphatase component of mammalian mRNA capping enzyme.</article-title>             <source>EMBO J</source>             <volume>20</volume>             <fpage>2575</fpage>             <lpage>2586</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Sawaya1"><label>18</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Sawaya</surname><given-names>R</given-names></name>
<name name-style="western"><surname>Shuman</surname><given-names>S</given-names></name>
</person-group>             <year>2003</year>             <article-title>Mutational analysis of the guanylyltransferase component of Mammalian mRNA capping enzyme.</article-title>             <source>Biochemistry</source>             <volume>42</volume>             <fpage>8240</fpage>             <lpage>8249</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Hausmann2"><label>19</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Hausmann</surname><given-names>S</given-names></name>
<name name-style="western"><surname>Altura</surname><given-names>MA</given-names></name>
<name name-style="western"><surname>Witmer</surname><given-names>M</given-names></name>
<name name-style="western"><surname>Singer</surname><given-names>SM</given-names></name>
<name name-style="western"><surname>Elmendorf</surname><given-names>HG</given-names></name>
<etal/></person-group>             <year>2005</year>             <article-title>Yeast-like mRNA capping apparatus in <italic>Giardia lamblia</italic>.</article-title>             <source>J Biol Chem</source>             <volume>280</volume>             <fpage>12077</fpage>             <lpage>12086</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Silva1"><label>20</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Silva</surname><given-names>E</given-names></name>
<name name-style="western"><surname>Ullu</surname><given-names>E</given-names></name>
<name name-style="western"><surname>Kobayashi</surname><given-names>R</given-names></name>
<name name-style="western"><surname>Tschudi</surname><given-names>C</given-names></name>
</person-group>             <year>1998</year>             <article-title>Trypanosome capping enzymes display a novel two-domain structure.</article-title>             <source>Mol Cell Biol</source>             <volume>18</volume>             <fpage>4612</fpage>             <lpage>4619</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Ruan1"><label>21</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Ruan</surname><given-names>JP</given-names></name>
<name name-style="western"><surname>Shen</surname><given-names>S</given-names></name>
<name name-style="western"><surname>Ullu</surname><given-names>E</given-names></name>
<name name-style="western"><surname>Tschudi</surname><given-names>C</given-names></name>
</person-group>             <year>2007</year>             <article-title>Evidence for a capping enzyme with specificity for the trypanosome spliced leader RNA.</article-title>             <source>Mol Biochem Parasitol</source>             <volume>156</volume>             <fpage>246</fpage>             <lpage>254</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Takagi1"><label>22</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Takagi</surname><given-names>Y</given-names></name>
<name name-style="western"><surname>Sindkar</surname><given-names>S</given-names></name>
<name name-style="western"><surname>Ekonomidis</surname><given-names>D</given-names></name>
<name name-style="western"><surname>Hall</surname><given-names>MP</given-names></name>
<name name-style="western"><surname>Ho</surname><given-names>CK</given-names></name>
</person-group>             <year>2007</year>             <article-title><italic>Trypanosoma brucei</italic> encodes a bifunctional capping enzyme essential for cap 4 formation on the spliced leader RNA.</article-title>             <source>J Biol Chem</source>             <volume>282</volume>             <fpage>15995</fpage>             <lpage>16005</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Stiller1"><label>23</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Stiller</surname><given-names>JW</given-names></name>
</person-group>             <year>2004</year>             <article-title>Emerging genomic and proteomic evidence on relationships among the animal, plant and fungal kingdoms.</article-title>             <source>Genomics Proteomics Bioinformatics</source>             <volume>2</volume>             <fpage>70</fpage>             <lpage>76</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Hampl1"><label>24</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Hampl</surname><given-names>V</given-names></name>
<name name-style="western"><surname>Hug</surname><given-names>L</given-names></name>
<name name-style="western"><surname>Leigh</surname><given-names>JW</given-names></name>
<name name-style="western"><surname>Dacks</surname><given-names>JB</given-names></name>
<name name-style="western"><surname>Lang</surname><given-names>BF</given-names></name>
<etal/></person-group>             <year>2009</year>             <article-title>Phylogenomic analyses support the monophyly of Excavata and resolve relationships among eukaryotic “supergroups”.</article-title>             <source>Proc Natl Acad Sci U S A</source>             <volume>106</volume>             <fpage>3859</fpage>             <lpage>3864</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Takagi2"><label>25</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Takagi</surname><given-names>T</given-names></name>
<name name-style="western"><surname>Moore</surname><given-names>CR</given-names></name>
<name name-style="western"><surname>Diehn</surname><given-names>F</given-names></name>
<name name-style="western"><surname>Buratowski</surname><given-names>S</given-names></name>
</person-group>             <year>1997</year>             <article-title>An RNA 5′-triphosphatase related to the protein tyrosine phosphatases.</article-title>             <source>Cell</source>             <volume>89</volume>             <fpage>867</fpage>             <lpage>873</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Lima1"><label>26</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Lima</surname><given-names>CD</given-names></name>
<name name-style="western"><surname>Wang</surname><given-names>LK</given-names></name>
<name name-style="western"><surname>Shuman</surname><given-names>S</given-names></name>
</person-group>             <year>1999</year>             <article-title>Structure and mechanism of yeast RNA triphosphatase: an essential component of the mRNA capping apparatus.</article-title>             <source>Cell</source>             <volume>99</volume>             <fpage>533</fpage>             <lpage>543</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Minge1"><label>27</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Minge</surname><given-names>MA</given-names></name>
<name name-style="western"><surname>Silberman</surname><given-names>JD</given-names></name>
<name name-style="western"><surname>Orr</surname><given-names>RJ</given-names></name>
<name name-style="western"><surname>Cavalier-Smith</surname><given-names>T</given-names></name>
<name name-style="western"><surname>Shalchian-Tabrizi</surname><given-names>K</given-names></name>
<etal/></person-group>             <year>2009</year>             <article-title>Evolutionary position of breviate amoebae and the primary eukaryote divergence.</article-title>             <source>Proc Biol Sci</source>             <volume>276</volume>             <fpage>597</fpage>             <lpage>604</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Adl1"><label>28</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Adl</surname><given-names>SM</given-names></name>
<name name-style="western"><surname>Simpson</surname><given-names>AG</given-names></name>
<name name-style="western"><surname>Farmer</surname><given-names>MA</given-names></name>
<name name-style="western"><surname>Andersen</surname><given-names>RA</given-names></name>
<name name-style="western"><surname>Anderson</surname><given-names>OR</given-names></name>
<etal/></person-group>             <year>2005</year>             <article-title>The new higher level classification of eukaryotes with emphasis on the taxonomy of protists.</article-title>             <source>J Eukaryot Microbiol</source>             <volume>52</volume>             <fpage>399</fpage>             <lpage>451</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Carlton1"><label>29</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Carlton</surname><given-names>JM</given-names></name>
<name name-style="western"><surname>Hirt</surname><given-names>RP</given-names></name>
<name name-style="western"><surname>Silva</surname><given-names>JC</given-names></name>
<name name-style="western"><surname>Delcher</surname><given-names>AL</given-names></name>
<name name-style="western"><surname>Schatz</surname><given-names>M</given-names></name>
<etal/></person-group>             <year>2007</year>             <article-title>Draft genome sequence of the sexually transmitted pathogen <italic>Trichomonas vaginalis</italic>.</article-title>             <source>Science</source>             <volume>315</volume>             <fpage>207</fpage>             <lpage>212</lpage>          </element-citation></ref>
<ref id="ppat.1000999-SimoesBarbosa1"><label>30</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Simoes-Barbosa</surname><given-names>A</given-names></name>
<name name-style="western"><surname>Meloni</surname><given-names>D</given-names></name>
<name name-style="western"><surname>Wohlschlegel</surname><given-names>JA</given-names></name>
<name name-style="western"><surname>Konarska</surname><given-names>MM</given-names></name>
<name name-style="western"><surname>Johnson</surname><given-names>PJ</given-names></name>
</person-group>             <year>2008</year>             <article-title>Spliceosomal snRNAs in the unicellular eukaryote <italic>Trichomonas vaginalis</italic> are structurally conserved but lack a 5′-cap structure.</article-title>             <source>RNA</source>             <volume>14</volume>             <fpage>1617</fpage>             <lpage>1631</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Aurrecoechea1"><label>31</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Aurrecoechea</surname><given-names>C</given-names></name>
<name name-style="western"><surname>Brestelli</surname><given-names>J</given-names></name>
<name name-style="western"><surname>Brunk</surname><given-names>BP</given-names></name>
<name name-style="western"><surname>Carlton</surname><given-names>JM</given-names></name>
<name name-style="western"><surname>Dommer</surname><given-names>J</given-names></name>
<etal/></person-group>             <year>2009</year>             <article-title>GiardiaDB and TrichDB: integrated genomic resources for the eukaryotic protist pathogens <italic>Giardia lamblia</italic> and <italic>Trichomonas vaginalis</italic>.</article-title>             <source>Nucleic Acids Res</source>             <volume>37</volume>             <fpage>D526</fpage>             <lpage>530</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Wang1"><label>32</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Wang</surname><given-names>SP</given-names></name>
<name name-style="western"><surname>Deng</surname><given-names>L</given-names></name>
<name name-style="western"><surname>Ho</surname><given-names>CK</given-names></name>
<name name-style="western"><surname>Shuman</surname><given-names>S</given-names></name>
</person-group>             <year>1997</year>             <article-title>Phylogeny of mRNA capping enzymes.</article-title>             <source>Proc Natl Acad Sci U S A</source>             <volume>94</volume>             <fpage>9573</fpage>             <lpage>9578</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Shuman2"><label>33</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Shuman</surname><given-names>S</given-names></name>
</person-group>             <year>2001</year>             <article-title>Structure, mechanism, and evolution of the mRNA capping apparatus.</article-title>             <source>Prog Nucleic Acid Res Mol Biol</source>             <volume>66</volume>             <fpage>1</fpage>             <lpage>40</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Schwer1"><label>34</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Schwer</surname><given-names>B</given-names></name>
<name name-style="western"><surname>Lehman</surname><given-names>K</given-names></name>
<name name-style="western"><surname>Saha</surname><given-names>N</given-names></name>
<name name-style="western"><surname>Shuman</surname><given-names>S</given-names></name>
</person-group>             <year>2001</year>             <article-title>Characterization of the mRNA capping apparatus of <italic>Candida albicans</italic>.</article-title>             <source>J Biol Chem</source>             <volume>276</volume>             <fpage>1857</fpage>             <lpage>1864</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Hausmann3"><label>35</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Hausmann</surname><given-names>S</given-names></name>
<name name-style="western"><surname>Shuman</surname><given-names>S</given-names></name>
</person-group>             <year>2005</year>             <article-title>Specificity and mechanism of RNA cap guanine-N2 methyltransferase (Tgs1).</article-title>             <source>J Biol Chem</source>             <volume>280</volume>             <fpage>4021</fpage>             <lpage>4024</lpage>          </element-citation></ref>
<ref id="ppat.1000999-SimoesBarbosa2"><label>36</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Simoes-Barbosa</surname><given-names>A</given-names></name>
<name name-style="western"><surname>Louly</surname><given-names>C</given-names></name>
<name name-style="western"><surname>Franco</surname><given-names>OL</given-names></name>
<name name-style="western"><surname>Rubio</surname><given-names>MA</given-names></name>
<name name-style="western"><surname>Alfonzo</surname><given-names>JD</given-names></name>
<etal/></person-group>             <year>2008</year>             <article-title>The divergent eukaryote <italic>Trichomonas vaginalis</italic> has an m7G cap methyltransferase capable of a single N2 methylation.</article-title>             <source>Nucleic Acids Res</source>             <volume>36</volume>             <fpage>6848</fpage>             <lpage>6858</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Liston1"><label>37</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Liston</surname><given-names>DR</given-names></name>
<name name-style="western"><surname>Carrero</surname><given-names>JC</given-names></name>
<name name-style="western"><surname>Johnson</surname><given-names>PJ</given-names></name>
</person-group>             <year>1999</year>             <article-title>Upstream regulatory sequences required for expression of the <italic>Trichomonas vaginalis</italic> alpha-succinyl CoA synthetase gene.</article-title>             <source>Mol Biochem Parasitol</source>             <volume>104</volume>             <fpage>323</fpage>             <lpage>329</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Liston2"><label>38</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Liston</surname><given-names>DR</given-names></name>
<name name-style="western"><surname>Lau</surname><given-names>AO</given-names></name>
<name name-style="western"><surname>Ortiz</surname><given-names>D</given-names></name>
<name name-style="western"><surname>Smale</surname><given-names>ST</given-names></name>
<name name-style="western"><surname>Johnson</surname><given-names>PJ</given-names></name>
</person-group>             <year>2001</year>             <article-title>Initiator recognition in a primitive eukaryote: IBP39, an initiator-binding protein from <italic>Trichomonas vaginalis</italic>.</article-title>             <source>Mol Cell Biol</source>             <volume>21</volume>             <fpage>7872</fpage>             <lpage>7882</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Roger1"><label>39</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Roger</surname><given-names>AJ</given-names></name>
<name name-style="western"><surname>Simpson</surname><given-names>AG</given-names></name>
</person-group>             <year>2009</year>             <article-title>Evolution: revisiting the root of the eukaryote tree.</article-title>             <source>Curr Biol</source>             <volume>19</volume>             <fpage>R165</fpage>             <lpage>167</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Benarroch1"><label>40</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Benarroch</surname><given-names>D</given-names></name>
<name name-style="western"><surname>Smith</surname><given-names>P</given-names></name>
<name name-style="western"><surname>Shuman</surname><given-names>S</given-names></name>
</person-group>             <year>2008</year>             <article-title>Characterization of a trifunctional mimivirus mRNA capping enzyme and crystal structure of the RNA triphosphatase domain.</article-title>             <source>Structure</source>             <volume>16</volume>             <fpage>501</fpage>             <lpage>512</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Zheng1"><label>41</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Zheng</surname><given-names>L</given-names></name>
<name name-style="western"><surname>Baumann</surname><given-names>U</given-names></name>
<name name-style="western"><surname>Reymond</surname><given-names>JL</given-names></name>
</person-group>             <year>2004</year>             <article-title>An efficient one-step site-directed and site-saturation mutagenesis protocol.</article-title>             <source>Nucleic Acids Res</source>             <volume>32</volume>             <fpage>e115</fpage>          </element-citation></ref>
<ref id="ppat.1000999-Clark1"><label>42</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Clark</surname><given-names>CG</given-names></name>
<name name-style="western"><surname>Diamond</surname><given-names>LS</given-names></name>
</person-group>             <year>2002</year>             <article-title>Methods for cultivation of luminal parasitic protists of clinical importance.</article-title>             <source>Clin Microbiol Rev</source>             <volume>15</volume>             <fpage>329</fpage>             <lpage>341</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Grosjean1"><label>43</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Grosjean</surname><given-names>H</given-names></name>
<name name-style="western"><surname>Droogmans</surname><given-names>L</given-names></name>
<name name-style="western"><surname>Roovers</surname><given-names>M</given-names></name>
<name name-style="western"><surname>Keith</surname><given-names>G</given-names></name>
</person-group>             <year>2007</year>             <article-title>Detection of enzymatic activity of transfer RNA modification enzymes using radiolabeled tRNA substrates.</article-title>             <source>Methods Enzymol</source>             <volume>425</volume>             <fpage>55</fpage>             <lpage>101</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Letunic1"><label>44</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Letunic</surname><given-names>I</given-names></name>
<name name-style="western"><surname>Doerks</surname><given-names>T</given-names></name>
<name name-style="western"><surname>Bork</surname><given-names>P</given-names></name>
</person-group>             <year>2009</year>             <article-title>SMART 6: recent updates and new developments.</article-title>             <source>Nucleic Acids Res</source>             <volume>37</volume>             <fpage>D229</fpage>             <lpage>232</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Larkin1"><label>45</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Larkin</surname><given-names>MA</given-names></name>
<name name-style="western"><surname>Blackshields</surname><given-names>G</given-names></name>
<name name-style="western"><surname>Brown</surname><given-names>NP</given-names></name>
<name name-style="western"><surname>Chenna</surname><given-names>R</given-names></name>
<name name-style="western"><surname>McGettigan</surname><given-names>PA</given-names></name>
<etal/></person-group>             <year>2007</year>             <article-title>Clustal W and Clustal X version 2.0.</article-title>             <source>Bioinformatics</source>             <volume>23</volume>             <fpage>2947</fpage>             <lpage>2948</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Galtier1"><label>46</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Galtier</surname><given-names>N</given-names></name>
<name name-style="western"><surname>Gouy</surname><given-names>M</given-names></name>
<name name-style="western"><surname>Gautier</surname><given-names>C</given-names></name>
</person-group>             <year>1996</year>             <article-title>SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny.</article-title>             <source>Comput Appl Biosci</source>             <volume>12</volume>             <fpage>543</fpage>             <lpage>548</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Abascal1"><label>47</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Abascal</surname><given-names>F</given-names></name>
<name name-style="western"><surname>Zardoya</surname><given-names>R</given-names></name>
<name name-style="western"><surname>Posada</surname><given-names>D</given-names></name>
</person-group>             <year>2005</year>             <article-title>ProtTest: selection of best-fit models of protein evolution.</article-title>             <source>Bioinformatics</source>             <volume>21</volume>             <fpage>2104</fpage>             <lpage>2105</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Le1"><label>48</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Le</surname><given-names>SQ</given-names></name>
<name name-style="western"><surname>Gascuel</surname><given-names>O</given-names></name>
</person-group>             <year>2008</year>             <article-title>An improved general amino acid replacement matrix.</article-title>             <source>Mol Biol Evol</source>             <volume>25</volume>             <fpage>1307</fpage>             <lpage>1320</lpage>          </element-citation></ref>
<ref id="ppat.1000999-Guindon1"><label>49</label><element-citation publication-type="journal" xlink:type="simple">             <person-group person-group-type="author">
<name name-style="western"><surname>Guindon</surname><given-names>S</given-names></name>
<name name-style="western"><surname>Delsuc</surname><given-names>F</given-names></name>
<name name-style="western"><surname>Dufayard</surname><given-names>JF</given-names></name>
<name name-style="western"><surname>Gascuel</surname><given-names>O</given-names></name>
</person-group>             <year>2009</year>             <article-title>Estimating maximum likelihood phylogenies with PhyML.</article-title>             <source>Methods Mol Biol</source>             <volume>537</volume>             <fpage>113</fpage>             <lpage>137</lpage>          </element-citation></ref>
</ref-list>

</back>
</article>