TY - JOUR
T1 - Reconstructing phylogenetic trees of prokaryote genomes by randomly sampling oligopeptides
AU - Maruyama, Osamu
AU - Matsuda, Akiko
AU - Kuhara, Satoru
N1 - Funding Information: This work was supported in part by Research for the Future Program of JSPS, and Grant-in-Aid for Scientific Research on Priority Areas (C) and Young Scientists (B) of MEXT.
PY - 2005
Y1 - 2005
N2 - In this paper, we propose a method for reconstructing phylogenetic trees of a given set of prokaryote organisms by randomly sampling relatively small oligopeptides of a fixed length from their complete proteomes. For each of the organisms, a vector of frequencies of those sampled oligopeptides is generated and used as a building block in reconstructing phylogenetic trees. By this procedure, phylogenetic trees are generated independently, and a consensus tree of the resulting trees is obtained. We have applied our method to a set of 109 organisms, including 16 Archaea, 87 Bacteria, and 6 Eukarya, using less than 10% of all the 3,200,000 oligopeptides of length 5. Our consensus tree agrees with the tree of Bergey's Manual in most of the basic taxa. In addition, it has almost the same quality as the trees of the same organisms reconstructed using all the 20^K oligopeptides of length K = 5 and 6 given by Qi et al. Thus we can conclude that the frequencies of a relatively small number of oligopeptides of length 5, even if those oligopeptides are chosen at random, have phylogenetic information almost equivalent to the frequencies of all the oligopeptides of length 5 or 6.
AB - In this paper, we propose a method for reconstructing phylogenetic trees of a given set of prokaryote organisms by randomly sampling relatively small oligopeptides of a fixed length from their complete proteomes. For each of the organisms, a vector of frequencies of those sampled oligopeptides is generated and used as a building block in reconstructing phylogenetic trees. By this procedure, phylogenetic trees are generated independently, and a consensus tree of the resulting trees is obtained. We have applied our method to a set of 109 organisms, including 16 Archaea, 87 Bacteria, and 6 Eukarya, using less than 10% of all the 3,200,000 oligopeptides of length 5. Our consensus tree agrees with the tree of Bergey's Manual in most of the basic taxa. In addition, it has almost the same quality as the trees of the same organisms reconstructed using all the 20^K oligopeptides of length K = 5 and 6 given by Qi et al. Thus we can conclude that the frequencies of a relatively small number of oligopeptides of length 5, even if those oligopeptides are chosen at random, have phylogenetic information almost equivalent to the frequencies of all the oligopeptides of length 5 or 6.
UR - http://www.scopus.com/inward/record.url?scp=25144453345&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=25144453345&partnerID=8YFLogxK
U2 - 10.1007/11428848_116
DO - 10.1007/11428848_116
M3 - Conference article
AN - SCOPUS:25144453345
SN - 0302-9743
VL - 3515
SP - 911
EP - 918
JO - Lecture Notes in Computer Science
JF - Lecture Notes in Computer Science
IS - II
T2 - 5th International Conference on Computational Science - ICCS 2005
Y2 - 22 May 2005 through 25 May 2005
ER -