TY - JOUR
T1 - PPSampler2
T2 - Predicting protein complexes more accurately and efficiently by sampling
AU - Widita, Chasanah Kusumastuti
AU - Maruyama, Osamu
N1 - Funding Information:
The publication fee was funded by the Institute of Mathematics for Industry at Kyushu University. This article has been published as part of BMC Systems Biology Volume 7 Supplement 6, 2013: Selected articles from the 24th International Conference on Genome Informatics (GIW2013). The full contents of the supplement are available online at http://www.biomedcentral.com/bmcsystbiol/supplements/7/S6.
Publisher Copyright:
© 2013 Widita and Maruyama.
PY - 2013
Y1 - 2013
N2 - Predicting the sets of components of heteromeric protein complexes is a challenging problem in systems biology. Many tools have been proposed to predict those complexes. Among them, PPSampler, a protein complex prediction algorithm based on the Metropolis-Hastings algorithm, is reported to outperform other tools. In this work, we improve PPSampler by refining the scoring functions and the proposal distribution used inside the algorithm, so that the predicted clusters are more accurate and the resulting algorithm runs faster. The new version is called PPSampler2. In computational experiments, PPSampler2 is shown to outperform other tools, including PPSampler. The F-measure score of PPSampler2 is 0.67, which is at least 26% higher than those of the other tools. In addition, about 82% of the predicted clusters that do not match any known complex are statistically significant with respect to the biological process aspect of Gene Ontology. Furthermore, the running time is reduced to twenty minutes, which is 1/24 of that of PPSampler.
AB - Predicting the sets of components of heteromeric protein complexes is a challenging problem in systems biology. Many tools have been proposed to predict those complexes. Among them, PPSampler, a protein complex prediction algorithm based on the Metropolis-Hastings algorithm, is reported to outperform other tools. In this work, we improve PPSampler by refining the scoring functions and the proposal distribution used inside the algorithm, so that the predicted clusters are more accurate and the resulting algorithm runs faster. The new version is called PPSampler2. In computational experiments, PPSampler2 is shown to outperform other tools, including PPSampler. The F-measure score of PPSampler2 is 0.67, which is at least 26% higher than those of the other tools. In addition, about 82% of the predicted clusters that do not match any known complex are statistically significant with respect to the biological process aspect of Gene Ontology. Furthermore, the running time is reduced to twenty minutes, which is 1/24 of that of PPSampler.
UR - http://www.scopus.com/inward/record.url?scp=84908544846&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84908544846&partnerID=8YFLogxK
U2 - 10.1186/1752-0509-7-S6-S14
DO - 10.1186/1752-0509-7-S6-S14
M3 - Article
C2 - 24565288
AN - SCOPUS:84908544846
SN - 1752-0509
VL - 7
JO - BMC Systems Biology
JF - BMC Systems Biology
M1 - S14
ER -