<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1475-2875-7-90</ui>
   <ji>1475-2875</ji>
   <fm>
      <dochead>Methodology</dochead>
      <bibl>
         <title>
            <p>A structural annotation resource for the selection of putative target proteins in the malaria parasite</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Joubert</snm>
               <fnm>Yolandi</fnm>
               <insr iid="I1"/>
               <email>yolandi.joubert@gmail.com</email>
            </au>
            <au id="A2" ca="yes">
               <snm>Joubert</snm>
               <fnm>Fourie</fnm>
               <insr iid="I1"/>
               <email>fourie.joubert@up.ac.za</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Bioinformatics and Computational Biology Unit, Department of Biochemistry, University of Pretoria, Pretoria, 0002, South Africa</p>
            </ins>
         </insg>
         <source>Malaria Journal</source>
         <issn>1475-2875</issn>
         <pubdate>2008</pubdate>
         <volume>7</volume>
         <issue>1</issue>
         <fpage>90</fpage>
         <url>http://www.malariajournal.com/content/7/1/90</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18500983</pubid>
               <pubid idtype="doi">10.1186/1475-2875-7-90</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>09</day>
               <month>1</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>23</day>
               <month>5</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>23</day>
               <month>5</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Joubert and Joubert; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Protein structure plays a pivotal role in elucidating mechanisms of parasite functioning and drug resistance. Moreover, protein structure aids the determination of protein function, which can together with the structure be used to identify novel drug targets in the parasite. However, various structural features in <it>Plasmodium falciparum </it>proteins complicate the experimental determination of protein structures. Limited similarity to proteins in the Protein Data Bank and the shortage of solved protein structures in the malaria parasite necessitate genome-scale structural annotation of <it>P. falciparum </it>proteins. Additionally, the annotation of a range of structural features facilitates the identification of suitable targets for experimental and computational studies.</p>
            </sec>
            <sec>
               <st>
                  <p>Methods</p>
               </st>
               <p>An integrated structural annotation system was developed and applied to <it>P. falciparum</it>, <it>Plasmodium vivax </it>and <it>Plasmodium yoelii</it>. The annotation included searches for sequence similarity, patterns and domains in addition to the following predictions: secondary structure, transmembrane helices, protein disorder, low complexity, coiled-coils and small molecule interactions. Subsequently, candidate proteins for further structural studies were identified based on the annotated structural features.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>The annotation results are accessible through a web interface, enabling users to select groups of proteins which fulfil multiple criteria pertaining to structural and functional features <abbrgrp><abbr bid="B1">1</abbr></abbrgrp>. Analysis of features in the <it>P. falciparum </it>proteome showed that protein-interacting proteins contained a higher percentage of predicted disordered residues than non-interacting proteins. Proteins interacting with 10 or more proteins had a disorder content concentrated in the range of 60&#8211;100%, while the disorder distribution for proteins with only one interacting partner was more evenly spread.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>A series of <it>P. falciparum </it>protein targets for experimental structure determination, comparative modelling and <it>in silico </it>docking studies were putatively identified. The system is available for public use, where researchers may identify proteins by querying with multiple physico-chemical, sequence similarity and interaction features.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="refman"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Malaria parasite resistance to therapeutic drugs such as sulfadoxine and pyrimethamine has increased significantly during the past two decades <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>. Following the rise in resistance, there has been a pressing need to understand the mechanism of drug resistance and develop novel anti-malarial drugs. Protein structure has previously been used to elucidate the mechanism of resistance in <it>Plasmodium falciparum </it><abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp>. Furthermore, inhibitors can be designed from structure <abbrgrp><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>. As resistance to existing drugs occurs globally, new information on the structure and function of proteins, especially those encoded by the <it>P. falciparum </it>genome, is important. However, various features of the parasite genome and proteome complicate functional and structural characterization studies, including a high AT-content and the presence of low complexity regions and inserts <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp>.</p>
         <p>Structural and functional information is limited for <it>P. falciparum </it>proteins. To illustrate, a search of the PDB using "falciparum" as keyword retrieved 210 structures at the time of writing. Once sequences with more than 90% sequence identity were removed, 103 structures remained <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. Regarding functional annotation, Gene Ontology terms <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> have been assigned manually to around 40% of all <it>P. falciparum </it>gene products <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>. Almost 60% of the proteins do not have sufficient similarity to known proteins, and therefore no function can be assigned to them. In short, 4% of <it>P. falciparum </it>proteins have experimental three-dimensional structures assigned, and 60% of the proteome is described as hypothetical. Furthermore, the number of redundant <it>P. falciparum </it>proteins in the PDB is significant.</p>
         <p>Generating experimental data to provide evidence of protein structure and function is expensive, difficult and slow. Conversely, predictive computational methods are fast and applicable to whole proteomes. Although they are less reliable than experimental results, predictions can identify proteins of interest and determine their suitability for experimental studies <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>. Moreover, knowledge of the structural features of proteins guides experiment design <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>.</p>
         <p>Methods used for three-dimensional protein structure prediction are primarily based on homology transfer. Structure is more conserved than sequence, and therefore distantly related sequences often have the same or very similar structures <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Computational methods for structure feature prediction make use of machine learning, statistical methods and physical properties of amino acid sequences. Typical computer-based methods for structural annotation include the prediction of secondary structure, transmembrane helices, low complexity, disorder, coiled-coils, and 3D structure.</p>
         <p>Integrating these annotations is important for three major reasons: different databases cover different sets of proteins; prediction methods have different strengths and weaknesses; and biological conclusions about function and structure can be derived more accurately when as much information as possible about a sequence is considered. Therefore, many meta-servers and integrated databases for genome-scale protein structural and functional annotation have been generated. Proteome annotation with regard to structure and function is important for comparative studies <abbrgrp><abbr bid="B16">16</abbr></abbrgrp> and for selecting sets of proteins of particular interest from an organism <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>.</p>
         <p>This study entailed the development of an automated structural annotation pipeline for the malaria parasite <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> and the semi-automated annotation of additional features in the <it>P. falciparum</it>, <it>P. vivax </it>and <it>P. yoelii </it>genomes. In addition, the number of proteins with specific predicted features was calculated. Finally, lists of putative candidates for further experimental and <it>in silico </it>structural studies were compiled. The system is not intended to compete with the established PlasmoDB database <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, but to provide a supplementary specialized environment for performing complex queries based on structural and other properties, enabling researchers to select molecules with specific properties for further investigation. It makes use of information from, and provides links to, the PlasmoDB site.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <p>Development was done in Python, utilizing the Zope web application framework with a PostgreSQL database. Protein sequences were obtained from PlasmoDB release 5 <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Data sources were the <it>Plasmodium falciparum </it><abbrgrp><abbr bid="B8">8</abbr></abbrgrp>, <it>Plasmodium vivax </it><abbrgrp><abbr bid="B18">18</abbr></abbrgrp> and <it>Plasmodium yoelii </it><abbrgrp><abbr bid="B19">19</abbr></abbrgrp> genome sequencing projects. All annotated proteins were used. Analyses were performed on a 64&#215; CPU Linux cluster. Protein statistics were gathered using Pepstats from EMBOSS <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>. BLAST searches were done using NCBI BLAST 2.2.10 <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> against the PDB <abbrgrp><abbr bid="B22">22</abbr></abbrgrp> with an E-value cut-off of 20. HMMPfam from the HMMER package <abbrgrp><abbr bid="B23">23</abbr></abbrgrp> was run against the Superfamily database <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> with an e-value cut-off of 1e-<sup>1 </sup>for protein structural family classification. Threading was done with Threader 3 <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, using secondary structure predictions from PsiPred <abbrgrp><abbr bid="B26">26</abbr></abbrgrp>. Only sequences shorter than 400 residues were submitted to threading. Transmembrane helix predictions were done using TMHMM2 <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>. The EMBOSS program SigCleave was used to predict signal peptides, and Paircoil2 <abbrgrp><abbr bid="B28">28</abbr></abbrgrp> was employed to predict coiled-coil regions. Secondary structure predictions were performed with three iterations of PsiPred 2.5. Protein disorder was predicted with the Disprot/VSL2 predictors <abbrgrp><abbr bid="B29">29</abbr></abbrgrp>. SMID-BLAST 1.02 (Unleashed Informatics) was used to analyse possible protein-ligand interactions; no e-value cut-offs were applied. 
Motifs were analysed with pscan and patmatmotifs from EMBOSS. For protein-protein interactions, data from high-throughput yeast two-hybrid experiments <abbrgrp><abbr bid="B30">30</abbr></abbrgrp> were mapped onto the malaria sequences. Proteins previously predicted to be exported out of the red blood cell <abbrgrp><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr></abbrgrp> were annotated.</p>
         <p>For the identification of candidate proteins for homology modelling, sequence similarity with a protein in the PDB was required. In contrast, candidate selection for X-ray crystallography required that proteins did not have sequence similarity with proteins in the PDB. Protein sequences with more than 30% predicted coiled-coils, disorder, transmembrane regions and signal peptides were eliminated. The SMID-BLAST predictions were used to identify proteins to which small molecules bind and which might be suitable for <it>in silico </it>docking studies. In addition, these proteins had to have a crystal structure or good sequence similarity to an entry in the PDB.</p>
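         <p>The selection rules described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the <it>Annotation </it>record, its field names and the helper functions are hypothetical, and two points the text leaves open are treated as assumptions, namely that a protein is eliminated when any single feature exceeds 30% of its residues, and that this elimination step applies to the crystallography candidates only.</p>

```python
from dataclasses import dataclass

# Hypothetical per-protein summary; field names are illustrative only
# and do not reflect the database schema of the annotation resource.
@dataclass
class Annotation:
    protein_id: str
    has_pdb_similarity: bool    # an accepted BLAST hit against the PDB exists
    coiled_coil_frac: float     # fraction of residues predicted as coiled-coil
    disorder_frac: float        # fraction of residues predicted as disordered
    transmembrane_frac: float   # fraction of residues in predicted TM helices
    signal_peptide_frac: float  # fraction of residues in a predicted signal peptide

MAX_FEATURE_FRAC = 0.30  # the "more than 30%" elimination threshold


def passes_feature_filter(a: Annotation) -> bool:
    # Assumption: exceeding the threshold for ANY one feature eliminates the protein.
    fracs = (a.coiled_coil_frac, a.disorder_frac,
             a.transmembrane_frac, a.signal_peptide_frac)
    return all(MAX_FEATURE_FRAC >= f for f in fracs)


def modelling_candidates(annotations):
    # Homology modelling requires sequence similarity to a PDB entry.
    return [a.protein_id for a in annotations if a.has_pdb_similarity]


def crystallography_candidates(annotations):
    # Crystallography targets must lack PDB similarity and pass the feature filter.
    return [a.protein_id for a in annotations
            if not a.has_pdb_similarity and passes_feature_filter(a)]
```

         <p>Keeping the threshold in a single constant makes it straightforward to test how sensitive the candidate lists are to the 30% cut-off.</p>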
      </sec>
      <sec>
         <st>
            <p>Results and Discussion</p>
         </st>
         <sec>
            <st>
               <p>The structural annotation system</p>
            </st>
            <p>Using the web interface, proteins can be searched by keywords, by browsing per chromosome and by designing complex inclusion and exclusion queries using an intuitive check-box and form interface. Following selection and filtering, an individual protein's result page starts with sequence, followed by statistics as calculated by pepstats. The next section provides the user with a summary image displaying database coverage, motifs, disordered regions, coiled-coils, low complexity and transmembrane helices (Figure <figr fid="F1">1</figr>).</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>An example of a part of the protein information view for the <it>P. falciparum </it>protein, bifunctional dihydrofolate-reductase/thymidylate synthase</p>
               </caption>
               <text>
                  <p><b>An example of a part of the protein information view for the <it>P. falciparum </it>protein, bifunctional dihydrofolate-reductase/thymidylate synthase.</b> The sequence is colored by physical-chemical properties. Protein statistics shown include molecular weight (MW), average residue weight (ARW), charge, iso-electric point (IP), molar extinction coefficient (MEC), extinction coefficient at 1 mg/ml (EC) and improbability of expression in inclusion bodies (IEIB). SMID (red triangles), Pfam (green bars) and Prosite (yellow bars) hits are graphically indicated along the length of the protein.</p>
               </text>
               <graphic file="1475-2875-7-90-1"/>
            </fig>
            <p>Each of the subsequent sections lists the results of a specific analysis. The results include start and end positions on the query sequence, scores, e-values, links to other databases and descriptions. A graph constructed by Matplotlib then displays the confidence values for helix, strand and disorder predictions over the length of the protein sequence. BLAST PDB hits as well as Threader results are displayed in a tabular format, with links to the relevant protein structures. Similarly, protein-protein and small molecule interactions are reported in tabular form, together with the relevant links. Pfam domains and Superfamily results are represented graphically together with links to the relevant entries. Subsequently, patterns and disordered regions are graphically displayed. Metabolic pathway information is summarized, and sequence similarity between <it>Plasmodium </it>species is presented.</p>
         </sec>
         <sec>
            <st>
               <p>Analysis of the <it>P. falciparum </it>proteome</p>
            </st>
            <p>A short summary of the results discussed here is provided in Table <tblr tid="T1">1</tblr>. Twenty-seven percent of proteins had BLAST/PDB hits with at least 25% identity to the hit. One third of these proteins (10% of the proteome) had at least two-thirds of their sequence covered by a PDB match. Almost 20% of the sequences had at least one-third of the length covered by a PDB hit. An additional 413 sequences had Superfamily hits with a score of 100 or better. Therefore, an estimated 1,224 proteins (23%) could be assigned to an existing fold. Finally, out of 2,462 proteins subjected to threading, 423 had alignments with Z-scores better than 3.95. Out of these, about 100 did not have BLAST-PDB matches with e-values smaller than 0.5 that covered more than 30% of the query sequence.</p>
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>A summary of selected features calculated for annotated proteins in <it>Plasmodium falciparum </it>(a total of 5,411 proteins were analysed).</p>
               </caption>
               <tblbdy cols="2">
                  <r>
                     <c ca="center">
                        <p>
                           <b>Feature</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Occurrence</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="2">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>BLAST vs. PDB hits with at least 25% identity</p>
                     </c>
                     <c ca="center">
                        <p>27%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Hits vs. Pfam with E-value &lt; 1 &#215; 10<sup>-15</sup></p>
                     </c>
                     <c ca="center">
                        <p>32%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>No Pfam hits</p>
                     </c>
                     <c ca="center">
                        <p>43%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Threading Z score > 3.95</p>
                     </c>
                     <c ca="center">
                        <p>8%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>One or more predicted coiled-coil regions</p>
                     </c>
                     <c ca="center">
                        <p>10%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Predicted to be transported out of the RBC (Pexel)</p>
                     </c>
                     <c ca="center">
                        <p>5%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Predicted to contain at least one transmembrane helix</p>
                     </c>
                     <c ca="center">
                        <p>30%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Predicted small molecule binding by SMID</p>
                     </c>
                     <c ca="center">
                        <p>22%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Protein-protein interaction by yeast-two hybrid results</p>
                     </c>
                     <c ca="center">
                        <p>15%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Predicted &#8805; 40% disorder or no regular secondary structure</p>
                     </c>
                     <c ca="center">
                        <p>60%</p>
                     </c>
                  </r>
                  <r>
                     <c ca="left">
                        <p>Mean percentage low complexity per sequence</p>
                     </c>
                     <c ca="center">
                        <p>16%</p>
                     </c>
                  </r>
               </tblbdy>
            </tbl>
            <p>Thirty-two percent of proteins had hits in Pfam below an e-value of 1 &#215; 10<sup>-15</sup>. At least 43% of all sequences had no hits with families in Pfam. There were 1,987 sequences with hits in Pfam with an e-value smaller than 1 &#215; 10<sup>-10</sup>. An additional 332 sequences had PRINTS hits. Thus, a total of 2,319 sequences or 43% of the annotated proteome could be assigned to functional families making use of the Pfam and PRINTS databases. Almost 200 proteins had Superfamily hits with an e-value smaller than 1 &#215; 10<sup>-3</sup>. Of these, 650 sequences did not have Pfam or PRINTS matches.</p>
            <p>Ten percent of proteins had one or more predicted coiled-coil regions, 5% were predicted to be transported out of the red blood cell based on the presence of Pexel motifs, and about 30% of proteins were predicted to contain at least one transmembrane helix. At least 22% of proteins were predicted to bind to small molecules by SMID-BLAST. Almost 15% of the proteins interact with other proteins according to the high-throughput yeast two-hybrid experiments. Sixty percent of the proteins were predicted to contain at least 40% intrinsic disorder or no regular secondary structure.</p>
            <p>As with other genomes, the most abundant transmembrane proteins contain only one transmembrane helix <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. The number of transmembrane proteins decreases as the number of membrane-spanning helices increases, with the exception of 6-tm and 11-tm proteins, which are slightly more numerous than 5-tm and 10-tm proteins, respectively. The correlation between intrinsic disorder and interacting proteins was investigated. The mean percentage disorder in interacting sequences is 61%, while the mean percentage disorder in non-interacting proteins is 44% and the overall mean percentage disorder for all sequences is 48%. In agreement with previous studies of interacting proteins in human and their disorder content, <it>P. falciparum </it>interacting proteins contain higher intrinsic disorder content than non-interacting proteins. Because disorder in a protein makes it more flexible, it was expected that the disorder content would increase with the number of interacting partners. For proteins interacting with only one other protein, the predicted disorder varies from 4% to 100%. The majority of interacting proteins interact with fewer than 10 other proteins. As the number of proteins decreases with an increasing number of interacting partners, the range of variation in disorder in the proteins also decreases, as expected. The ranges tend to span higher percentages of disorder as the number of interacting partners increases.</p>
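            <p>The comparison of disorder content with interaction status can be sketched as follows. This is an illustration rather than the analysis code of this study: the record layout (percentage disorder, number of interacting partners) and the function names are hypothetical, and proteins absent from the interaction data are assumed to have zero partners.</p>

```python
from statistics import mean


def mean_disorder_by_interaction(records):
    """Mean percentage disorder for interacting, non-interacting and all proteins.

    records: list of (percent_disorder, n_partners) tuples (hypothetical layout).
    Each group must be non-empty, otherwise statistics.mean raises an error.
    """
    interacting = [d for d, n in records if n > 0]
    non_interacting = [d for d, n in records if n == 0]
    return {
        "interacting": mean(interacting),
        "non_interacting": mean(non_interacting),
        "all": mean(d for d, _ in records),
    }


def disorder_range_by_partners(records):
    """Map each partner count (one or more) to the (min, max) percentage disorder."""
    ranges = {}
    for disorder, n_partners in records:
        if n_partners == 0:
            continue
        low, high = ranges.get(n_partners, (disorder, disorder))
        ranges[n_partners] = (min(low, disorder), max(high, disorder))
    return ranges
```

            <p>Applied to the full proteome, the first function would yield the three means reported above, and the second the per-partner-count ranges whose narrowing is described in the text.</p>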
         </sec>
         <sec>
            <st>
               <p>Inter-species comparisons</p>
            </st>
            <p>The length distribution of <it>P. falciparum </it>proteins has a longer tail than those of the other two species, and <it>P. yoelii </it>has a more symmetrical length distribution than the other species. The mean length for <it>P. vivax </it>is 630 with a standard deviation of 576 amino acids and the mean length for <it>P. yoelii </it>is 420 with a standard deviation of 450. The proteins in <it>P. yoelii </it>vary less in length than in the other two species, with <it>P. falciparum </it>showing the most variation. Asparagine is the most abundant amino acid in <it>P. falciparum </it>and <it>P. yoelii</it>, and lysine in <it>P. vivax</it>. Although lysine is the most abundant amino acid in <it>P. vivax</it>, it should be noted that lysine is less abundant in <it>P. vivax </it>(9%) than in the other two species (11.5%). <it>Plasmodium vivax </it>contains on average twice as many alanine and glycine residues as the other two species. Overall, 26% of residues in <it>P. vivax </it>are tiny (A, C, G, S, T), in comparison to the 18% and 19% tiny residues contained within <it>P. falciparum </it>and <it>P. yoelii</it>, respectively.</p>
            <p><it>Plasmodium vivax </it>contains more proteins with small percentages of low complexity. Although <it>P. yoelii </it>and <it>P. falciparum </it>contain the same number of proteins with predicted low complexity regions, the proportion of <it>P. yoelii </it>proteins is much lower than for <it>P. falciparum</it>. <it>P. yoelii </it>and <it>P. falciparum </it>have similar proportions of disorder- and order-promoting amino acids, whereas <it>P. vivax </it>has proportionally more disorder-favouring amino acids and fewer order-promoting amino acids. The average percentage low complexity per sequence is 16% in <it>P. falciparum</it>, 10% in <it>P. vivax</it>, and 12% in <it>P. yoelii</it>. No low complexity is predicted for 27%, 19% and 13% of the sequences in <it>P. yoelii</it>, <it>P. vivax </it>and <it>P. falciparum</it>, respectively. <it>P. yoelii </it>and <it>P. falciparum </it>have an equal proportion of transmembrane proteins, while <it>P. vivax </it>has fewer predicted transmembrane proteins. <it>P. falciparum </it>has more 2-tm, 3-tm, 4-tm, 6-tm and 9-tm proteins than the other two species. <it>P. vivax </it>has slightly more 8-tm proteins than <it>P. yoelii </it>and <it>P. falciparum</it>, and <it>P. yoelii </it>has the most 1-tm proteins.</p>
         </sec>
         <sec>
            <st>
               <p>Identification of potential molecules for further study</p>
            </st>
            <p>Tables containing putative candidates possibly suitable for homology modelling can be viewed through the web interface <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. These proteins have PDB matches with e-values better than 1 &#215; 10<sup>-20 </sup>that cover more than 70% of their sequence. The cut-off sequence identity was set to 25%. Therefore, these tables contain proteins for which high quality models could possibly be obtained through automatic model building. Separate tables contain interacting proteins, proteins with Pfam domains and uncharacterized proteins <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>.</p>
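            <p>The three thresholds above can be combined into a single predicate, sketched here in Python. The function name and argument layout are hypothetical, coverage and identity are expressed as fractions, and the 25% identity cut-off is assumed to be inclusive, which the text does not state.</p>

```python
EVALUE_CUTOFF = 1e-20   # e-value must be better (smaller) than this
MIN_COVERAGE = 0.70     # more than 70% of the query covered by the PDB match
MIN_IDENTITY = 0.25     # identity cut-off; treated as inclusive (assumption)


def is_modelling_candidate(evalue, coverage, identity):
    """Return True if a BLAST hit against the PDB meets all three thresholds.

    evalue:   BLAST e-value of the hit
    coverage: fraction of the query sequence covered by the hit (0 to 1)
    identity: fraction of identical residues in the alignment (0 to 1)
    """
    return (EVALUE_CUTOFF > evalue
            and coverage > MIN_COVERAGE
            and identity >= MIN_IDENTITY)
```

            <p>A protein would then be listed as a candidate if at least one of its PDB hits satisfies the predicate.</p>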
            <p>Proteins possibly suitable for <it>in silico </it>docking studies can also be accessed through the web interface <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. These proteins were selected based on the presence of predicted small molecule binding sites and the availability of a 3D structure. Interacting proteins were separated from non-interacting proteins.</p>
            <p>Possible targets for experimental structure determination are available for proteins with a Pfam domain <abbrgrp><abbr bid="B36">36</abbr></abbrgrp> and for proteins without a Pfam domain <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. A lack of significant BLAST hits to entries in the PDB formed part of the basis for the putative identification of possible new targets for X-ray crystallography. For these, priority categories were determined, which are explained in Table <tblr tid="T2">2</tblr>.</p>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>The number of targets suggested for further study using experimental structural elucidation techniques in each priority category (PC, ranked 1 &#8211; 6) after the relevant elimination step.</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="center">
                        <p>
                           <b>Priority Class (PC)</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>PDB E-value range</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Nr of proteins</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>Tm + disorder</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>CC+LC+SP</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>a</b>
                        </p>
                     </c>
                     <c ca="center">
                        <p>
                           <b>b</b>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>PC1</p>
                     </c>
                     <c ca="center">
                        <p>No PDB matches</p>
                     </c>
                     <c ca="center">
                        <p>139</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>PC2</p>
                     </c>
                     <c ca="center">
                        <p>E-value > 10</p>
                     </c>
                     <c ca="center">
                        <p>174</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>PC3</p>
                     </c>
                     <c ca="center">
                        <p>10 >= E-value > 5</p>
                     </c>
                     <c ca="center">
                        <p>352</p>
                     </c>
                     <c ca="center">
                        <p>34</p>
                     </c>
                     <c ca="center">
                        <p>31</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>22</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>PC4</p>
                     </c>
                     <c ca="center">
                        <p>5 >= E-value > 3</p>
                     </c>
                     <c ca="center">
                        <p>332</p>
                     </c>
                     <c ca="center">
                        <p>35</p>
                     </c>
                     <c ca="center">
                        <p>19</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>PC5</p>
                     </c>
                     <c ca="center">
                        <p>3 >= E-value > 1</p>
                     </c>
                     <c ca="center">
                        <p>810</p>
                     </c>
                     <c ca="center">
                        <p>88</p>
                     </c>
                     <c ca="center">
                        <p>51</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>39</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>PC6</p>
                     </c>
                     <c ca="center">
                        <p>1 >= E-value > 0.5</p>
                     </c>
                     <c ca="center">
                        <p>529</p>
                     </c>
                     <c ca="center">
                        <p>60</p>
                     </c>
                     <c ca="center">
                        <p>58</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>48</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Tm+disorder refers to the number of proteins in each group after the transmembrane and disorder filtering step. CC+LC+SP refers to the number of proteins in each group after the coiled-coil, low-complexity and signal peptide filtering step. Group a indicates proteins containing a Pfam functional domain. Group b indicates proteins without a Pfam functional domain.</p>
               </tblfn>
            </tbl>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>To allow researchers to select groups of proteins that fulfil certain criteria with regard to structural and functional features, a semi-automated structural annotation of selected species of the malaria parasite was performed, and a web-based resource with query functionality was developed. This tool was used to gather statistics regarding a series of structural and functional characteristics. Furthermore, a series of putative candidate proteins for homology modelling, crystallization and docking studies was generated.</p>
         <p>It is important to realize that the results presented here are dependent on the genome data and gene predictions available at the time of analysis. In the case of <it>Plasmodium falciparum</it>, a recent article has highlighted the shortcomings in the current state of gene prediction for malaria parasites, based on cDNA analysis <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>. Furthermore, this study is based on <it>P. falciparum </it>data from PlasmoDB 5.0, and a draft re-annotation of this genome has recently taken place. It is planned to incorporate the relevant results as soon as possible. Also, the <it>P. vivax </it>data should be regarded as preliminary as the genome is still unfinished, with a publication expected soon <abbrgrp><abbr bid="B39">39</abbr></abbrgrp>. It is hoped that this web-based resource may be valuable for researchers aiming to identify malaria proteins with specific combinations of sequence, structural and interaction features for further studies.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>FJ conceived the project, obtained funding and supervised the study. YJ performed the software, database and interface development, investigated the occurrence of the different features described, performed the inter-species comparison and compiled the lists of possible targets for further studies. Both authors prepared the manuscript.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The project was supported by a grant from the South African National Research Foundation (NRF). Yolandi Joubert received a bursary from the NRF. We are grateful to Ayton Meintjes, Tjaart de Beer, Charles Hefer, Gordon Wells and Hamilton Ganesan for their technical and intellectual contributions.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>MalPort Web Site</p>
            </title>
            <pubdate>2008</pubdate>
            <url>http://malport.bi.up.ac.za:7070</url>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Sequence variation of the hydroxymethyldihydropterin pyrophosphokinase: dihydropteroate synthase gene in lines of the human malaria parasite, Plasmodium falciparum, with differing resistance to sulfadoxine</p>
            </title>
            <aug>
               <au>
                  <snm>Brooks</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Read</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Watkins</snm>
                  <fnm>WM</fnm>
               </au>
               <au>
                  <snm>Sims</snm>
                  <fnm>PF</fnm>
               </au>
               <au>
                  <snm>Hyde</snm>
                  <fnm>JE</fnm>
               </au>
            </aug>
            <source>Eur J Biochem</source>
            <pubdate>1994</pubdate>
            <volume>224</volume>
            <fpage>397</fpage>
            <lpage>405</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1432-1033.1994.00397.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">7925353</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Evidence that a point mutation in dihydrofolate reductase-thymidylate synthase confers resistance to pyrimethamine in falciparum malaria</p>
            </title>
            <aug>
               <au>
                  <snm>Peterson</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>Walliker</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Wellems</snm>
                  <fnm>TE</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1988</pubdate>
            <volume>85</volume>
            <fpage>9114</fpage>
            <lpage>9118</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">282674</pubid>
                  <pubid idtype="pmpid" link="fulltext">2904149</pubid>
                  <pubid idtype="doi">10.1073/pnas.85.23.9114</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Elucidation of sulfadoxine resistance with structural models of the bifunctional Plasmodium falciparum dihydropterin pyrophosphokinase-dihydropteroate synthase</p>
            </title>
            <aug>
               <au>
                  <snm>de Beer</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Louw</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Joubert</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Bioorg Med Chem</source>
            <pubdate>2006</pubdate>
            <volume>14</volume>
            <fpage>4433</fpage>
            <lpage>4443</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.bmc.2006.02.035</pubid>
                  <pubid idtype="pmpid" link="fulltext">16517168</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Malarial (Plasmodium falciparum) dihydrofolate reductase-thymidylate synthase: structural basis for antifolate resistance and development of effective inhibitors</p>
            </title>
            <aug>
               <au>
                  <snm>Yuthavong</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yuvaniyama</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chitnumsub</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Vanichtanankul</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chusacultanachai</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tarnchompoo</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Vilaivan</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Kamchonwongpaisan</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Parasitology</source>
            <pubdate>2005</pubdate>
            <volume>130</volume>
            <fpage>249</fpage>
            <lpage>259</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1017/S003118200400664X</pubid>
                  <pubid idtype="pmpid">15796007</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Glutathione reductase of the malarial parasite Plasmodium falciparum: crystal structure and inhibitor development</p>
            </title>
            <aug>
               <au>
                  <snm>Sarma</snm>
                  <fnm>GN</fnm>
               </au>
               <au>
                  <snm>Savvides</snm>
                  <fnm>SN</fnm>
               </au>
               <au>
                  <snm>Becker</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Schirmer</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Schirmer</snm>
                  <fnm>RH</fnm>
               </au>
               <au>
                  <snm>Karplus</snm>
                  <fnm>PA</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2003</pubdate>
            <volume>328</volume>
            <fpage>893</fpage>
            <lpage>907</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0022-2836(03)00347-4</pubid>
                  <pubid idtype="pmpid" link="fulltext">12729762</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Triosephosphate isomerase from Plasmodium falciparum: the crystal structure provides insights into antimalarial drug design</p>
            </title>
            <aug>
               <au>
                  <snm>Velanker</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Ray</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Gokhale</snm>
                  <fnm>RS</fnm>
               </au>
               <au>
                  <snm>Suma</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Balaram</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Balaram</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Murthy</snm>
                  <fnm>MR</fnm>
               </au>
            </aug>
            <source>Structure</source>
            <pubdate>1997</pubdate>
            <volume>5</volume>
            <fpage>751</fpage>
            <lpage>761</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0969-2126(97)00230-X</pubid>
                  <pubid idtype="pmpid">9261072</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Genome sequence of the human malaria parasite Plasmodium falciparum</p>
            </title>
            <aug>
               <au>
                  <snm>Gardner</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Hall</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Fung</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Berriman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hyman</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Carlton</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Pain</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nelson</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Bowman</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Paulsen</snm>
                  <fnm>IT</fnm>
               </au>
               <au>
                  <snm>James</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Eisen</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Rutherford</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Salzberg</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Craig</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kyes</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Chan</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Nene</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Shallom</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Suh</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Angiuoli</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Pertea</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Allen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Selengut</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Haft</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Mather</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Vaidya</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Fairlamb</snm>
                  <fnm>AH</fnm>
               </au>
               <au>
                  <snm>Fraunholz</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Roos</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>Ralph</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>McFadden</snm>
                  <fnm>GI</fnm>
               </au>
               <au>
                  <snm>Cummings</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Subramanian</snm>
                  <fnm>GM</fnm>
               </au>
               <au>
                  <snm>Mungall</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Venter</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Carucci</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Hoffman</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Newbold</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Davis</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Fraser</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Barrell</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>419</volume>
            <fpage>498</fpage>
            <lpage>511</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01097</pubid>
                  <pubid idtype="pmpid" link="fulltext">12368864</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Characterization of P-type ATPase 3 in Plasmodium falciparum</p>
            </title>
            <aug>
               <au>
                  <snm>Rozmajzl</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Kimura</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Woodrow</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Krishna</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Meade</snm>
                  <fnm>JC</fnm>
               </au>
            </aug>
            <source>Mol Biochem Parasitol</source>
            <pubdate>2001</pubdate>
            <volume>116</volume>
            <fpage>117</fpage>
            <lpage>126</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0166-6851(01)00319-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">11522345</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>PDB Web Site</p>
            </title>
            <pubdate>2008</pubdate>
            <url>http://www.rcsb.org</url>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Parasites are GO</p>
            </title>
            <aug>
               <au>
                  <snm>Berriman</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Aslett</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hall</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Ivens</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Trends Parasitol</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <fpage>463</fpage>
            <lpage>464</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1471-4922(01)02083-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">11642257</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Automatic target selection for structural genomics on eukaryotes</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hegyi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Acton</snm>
                  <fnm>TB</fnm>
               </au>
               <au>
                  <snm>Montelione</snm>
                  <fnm>GT</fnm>
               </au>
               <au>
                  <snm>Rost</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>2004</pubdate>
            <volume>56</volume>
            <fpage>188</fpage>
            <lpage>200</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.20012</pubid>
                  <pubid idtype="pmpid" link="fulltext">15211504</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Knowledge-based selection of targets for structural genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Frishman</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Protein Eng</source>
            <pubdate>2002</pubdate>
            <volume>15</volume>
            <fpage>169</fpage>
            <lpage>183</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/protein/15.3.169</pubid>
                  <pubid idtype="pmpid" link="fulltext">11932488</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>On the oligomeric state of DJ-1 protein and its mutants associated with Parkinson Disease. A combined computational and in vitro study</p>
            </title>
            <aug>
               <au>
                  <snm>Herrera</snm>
                  <fnm>FE</fnm>
               </au>
               <au>
                  <snm>Zucchelli</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Jezierska</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lavina</snm>
                  <fnm>ZS</fnm>
               </au>
               <au>
                  <snm>Gustincich</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Carloni</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2007</pubdate>
            <volume>282</volume>
            <fpage>24905</fpage>
            <lpage>24914</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M701013200</pubid>
                  <pubid idtype="pmpid" link="fulltext">17504761</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>The relation between the divergence of sequence and structure in proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Chothia</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Lesk</snm>
                  <fnm>AM</fnm>
               </au>
            </aug>
            <source>EMBO J</source>
            <pubdate>1986</pubdate>
            <volume>5</volume>
            <fpage>823</fpage>
            <lpage>826</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1166865</pubid>
                  <pubid idtype="pmpid">3709526</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Comparing function and structure between entire proteomes</p>
            </title>
            <aug>
               <au>
                  <snm>Liu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Rost</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>2001</pubdate>
            <volume>10</volume>
            <fpage>1970</fpage>
            <lpage>1979</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2374214</pubid>
                  <pubid idtype="pmpid" link="fulltext">11567088</pubid>
                  <pubid idtype="doi">10.1110/ps.10101</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>PlasmoDB v5: new looks, new genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Stoeckert</snm>
                  <fnm>CJ</fnm>
                  <suf>Jr.</suf>
               </au>
               <au>
                  <snm>Fischer</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kissinger</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Heiges</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Aurrecoechea</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Gajria</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Roos</snm>
                  <fnm>DS</fnm>
               </au>
            </aug>
            <source>Trends Parasitol</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <fpage>543</fpage>
            <lpage>546</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.pt.2006.09.005</pubid>
                  <pubid idtype="pmpid" link="fulltext">17029963</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Plasmodium vivax Genome Web Site</p>
            </title>
            <pubdate>2008</pubdate>
            <url>http://www.tigr.org/tdb/e2k1/pva1/</url>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Genome sequence and comparative analysis of the model rodent malaria parasite Plasmodium yoelii yoelii</p>
            </title>
            <aug>
               <au>
                  <snm>Carlton</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Angiuoli</snm>
                  <fnm>SV</fnm>
               </au>
               <au>
                  <snm>Suh</snm>
                  <fnm>BB</fnm>
               </au>
               <au>
                  <snm>Kooij</snm>
                  <fnm>TW</fnm>
               </au>
               <au>
                  <snm>Pertea</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Silva</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Ermolaeva</snm>
                  <fnm>MD</fnm>
               </au>
               <au>
                  <snm>Allen</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Selengut</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Koo</snm>
                  <fnm>HL</fnm>
               </au>
               <au>
                  <snm>Peterson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Pop</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kosack</snm>
                  <fnm>DS</fnm>
               </au>
               <au>
                  <snm>Shumway</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Bidwell</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Shallom</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>van Aken</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Riedmuller</snm>
                  <fnm>SB</fnm>
               </au>
               <au>
                  <snm>Feldblyum</snm>
                  <fnm>TV</fnm>
               </au>
               <au>
                  <snm>Cho</snm>
                  <fnm>JK</fnm>
               </au>
               <au>
                  <snm>Quackenbush</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Sedegah</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Shoaibi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Cummings</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Florens</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Yates</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Raine</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Sinden</snm>
                  <fnm>RE</fnm>
               </au>
               <au>
                  <snm>Harris</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Cunningham</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Preiser</snm>
                  <fnm>PR</fnm>
               </au>
               <au>
                  <snm>Bergman</snm>
                  <fnm>LW</fnm>
               </au>
               <au>
                  <snm>Vaidya</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>van Lin</snm>
                  <fnm>LH</fnm>
               </au>
               <au>
                  <snm>Janse</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Waters</snm>
                  <fnm>AP</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>HO</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>OR</fnm>
               </au>
               <au>
                  <snm>Salzberg</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Venter</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Fraser</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Hoffman</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Gardner</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Carucci</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>419</volume>
            <fpage>512</fpage>
            <lpage>519</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01099</pubid>
                  <pubid idtype="pmpid" link="fulltext">12368865</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>EMBOSS: the European Molecular Biology Open Software Suite</p>
            </title>
            <aug>
               <au>
                  <snm>Rice</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Longden</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Bleasby</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>276</fpage>
            <lpage>277</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(00)02024-2</pubid>
                  <pubid idtype="pmpid" link="fulltext">10827456</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Gapped BLAST and PSI-BLAST: a new generation of protein database search programs</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Madden</snm>
                  <fnm>TL</fnm>
               </au>
               <au>
                  <snm>Schaffer</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1997</pubdate>
            <volume>25</volume>
            <fpage>3389</fpage>
            <lpage>3402</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">146917</pubid>
                  <pubid idtype="pmpid" link="fulltext">9254694</pubid>
                  <pubid idtype="doi">10.1093/nar/25.17.3389</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Protein Data Bank</p>
            </title>
            <aug>
               <au>
                  <snm>Abola</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Bernstein</snm>
                  <fnm>FC</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>SH</fnm>
               </au>
               <au>
                  <snm>Koetzle</snm>
                  <fnm>TF</fnm>
               </au>
               <au>
                  <snm>Weng</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Crystallographic Databases - Information Content, Software Systems, Scientific Applications</source>
            <publisher>Bonn/Cambridge/Chester, Data Commission of the International Union of Crystallography</publisher>
            <editor>Allen FH, Bergerhoff G and Sievers R</editor>
            <pubdate>1987</pubdate>
            <fpage>107</fpage>
            <lpage>132</lpage>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Profile hidden Markov models</p>
            </title>
            <aug>
               <au>
                  <snm>Eddy</snm>
                  <fnm>SR</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>1998</pubdate>
            <volume>14</volume>
            <fpage>755</fpage>
            <lpage>763</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/14.9.755</pubid>
                  <pubid idtype="pmpid" link="fulltext">9918945</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure</p>
            </title>
            <aug>
               <au>
                  <snm>Gough</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Karplus</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Hughey</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Chothia</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2001</pubdate>
            <volume>313</volume>
            <fpage>903</fpage>
            <lpage>919</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.2001.5080</pubid>
                  <pubid idtype="pmpid" link="fulltext">11697912</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>THREADER: Protein Sequence Threading by Double Dynamic Programming</p>
            </title>
            <aug>
               <au>
                  <snm>Jones</snm>
                  <fnm>DT</fnm>
               </au>
            </aug>
            <source>Computational Methods in Molecular Biology</source>
            <publisher>Elsevier</publisher>
            <editor>Salzberg S, Searls D and Kasif S</editor>
            <pubdate>1998</pubdate>
         </bibl>
         <bibl id="B26">
            <title>
               <p>The PSIPRED protein structure prediction server</p>
            </title>
            <aug>
               <au>
                  <snm>McGuffin</snm>
                  <fnm>LJ</fnm>
               </au>
               <au>
                  <snm>Bryson</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>DT</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2000</pubdate>
            <volume>16</volume>
            <fpage>404</fpage>
            <lpage>405</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/16.4.404</pubid>
                  <pubid idtype="pmpid" link="fulltext">10869041</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Krogh</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Larsson</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>von Heijne</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Sonnhammer</snm>
                  <fnm>EL</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2001</pubdate>
            <volume>305</volume>
            <fpage>567</fpage>
            <lpage>580</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.2000.4315</pubid>
                  <pubid idtype="pmpid" link="fulltext">11152613</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Paircoil2: improved prediction of coiled coils from sequence</p>
            </title>
            <aug>
               <au>
                  <snm>McDonnell</snm>
                  <fnm>AV</fnm>
               </au>
               <au>
                  <snm>Jiang</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Keating</snm>
                  <fnm>AE</fnm>
               </au>
               <au>
                  <snm>Berger</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>22</volume>
            <fpage>356</fpage>
            <lpage>358</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti797</pubid>
                  <pubid idtype="pmpid" link="fulltext">16317077</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>DisProt: the Database of Disordered Proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Sickmeier</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hamilton</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>LeGall</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Vacic</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Cortese</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Tantos</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Szabo</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Tompa</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Uversky</snm>
                  <fnm>VN</fnm>
               </au>
               <au>
                  <snm>Obradovic</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Dunker</snm>
                  <fnm>AK</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <fpage>D786</fpage>
            <lpage>D793</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1751543</pubid>
                  <pubid idtype="pmpid" link="fulltext">17145717</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl893</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>A protein interaction network of the malaria parasite Plasmodium falciparum</p>
            </title>
            <aug>
               <au>
                  <snm>LaCount</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Vignali</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chettier</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Phansalkar</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bell</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hesselberth</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Schoenfeld</snm>
                  <fnm>LW</fnm>
               </au>
               <au>
                  <snm>Ota</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Sahasrabudhe</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Kurschner</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Fields</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hughes</snm>
                  <fnm>RE</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2005</pubdate>
            <volume>438</volume>
            <fpage>103</fpage>
            <lpage>107</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature04104</pubid>
                  <pubid idtype="pmpid" link="fulltext">16267556</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>The pathogenic basis of malaria</p>
            </title>
            <aug>
               <au>
                  <snm>Miller</snm>
                  <fnm>LH</fnm>
               </au>
               <au>
                  <snm>Baruch</snm>
                  <fnm>DI</fnm>
               </au>
               <au>
                  <snm>Marsh</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Doumbo</snm>
                  <fnm>OK</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>415</volume>
            <fpage>673</fpage>
            <lpage>679</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/415673a</pubid>
                  <pubid idtype="pmpid" link="fulltext">11832955</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Targeting malaria virulence and remodeling proteins to the host erythrocyte</p>
            </title>
            <aug>
               <au>
                  <snm>Marti</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Good</snm>
                  <fnm>RT</fnm>
               </au>
               <au>
                  <snm>Rug</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Knuepfer</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Cowman</snm>
                  <fnm>AF</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>2004</pubdate>
            <volume>306</volume>
            <fpage>1930</fpage>
            <lpage>1933</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1102452</pubid>
                  <pubid idtype="pmpid" link="fulltext">15591202</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Genome-wide analysis of integral membrane proteins from eubacterial, archaean, and eukaryotic organisms</p>
            </title>
            <aug>
               <au>
                  <snm>Wallin</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>von Heijne</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1998</pubdate>
            <volume>7</volume>
            <fpage>1029</fpage>
            <lpage>1038</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">2143985</pubid>
                  <pubid idtype="pmpid" link="fulltext">9568909</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>MalPort Homology Modeling Candidates Set 1 Web Site</p>
            </title>
            <pubdate>2008</pubdate>
            <url>http://malport.bi.up.ac.za:8080/Annotation/modeling</url>
         </bibl>
         <bibl id="B35">
            <title>
               <p>MalPort Homology Modeling Candidates Set 2 Web Site</p>
            </title>
            <pubdate>2008</pubdate>
            <url>http://malport.bi.up.ac.za:8080/Annotation/modeling2</url>
         </bibl>
         <bibl id="B36">
            <title>
               <p>MalPort Experimental Candidates Set 1 Web Site</p>
            </title>
            <pubdate>2008</pubdate>
            <url>http://malport.bi.up.ac.za:8080/Annotation/experimental1</url>
         </bibl>
         <bibl id="B37">
            <title>
               <p>MalPort Experimental Candidates Set 2 Web Site</p>
            </title>
            <pubdate>2008</pubdate>
            <url>http://malport.bi.up.ac.za:8080/Annotation/experimental2</url>
         </bibl>
         <bibl id="B38">
            <title>
               <p>cDNA sequences reveal considerable gene prediction inaccuracy in the Plasmodium falciparum genome</p>
            </title>
            <aug>
               <au>
                  <snm>Lu</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Jiang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Ding</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Mu</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Valenzuela</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Ribeiro</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Su</snm>
                  <fnm>XZ</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2007</pubdate>
            <volume>8</volume>
            <fpage>255</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1978503</pubid>
                  <pubid idtype="pmpid" link="fulltext">17662120</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-8-255</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>The Plasmodium vivax genome sequencing project</p>
            </title>
            <aug>
               <au>
                  <snm>Carlton</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>Trends Parasitol</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <fpage>227</fpage>
            <lpage>231</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1471-4922(03)00066-7</pubid>
                  <pubid idtype="pmpid" link="fulltext">12763429</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
