TY - JOUR
T1 - Supervised enzyme network inference from the integration of genomic data and chemical information
AU - Yamanishi, Yoshihiro
AU - Vert, Jean Philippe
AU - Kanehisa, Minoru
N1 - Funding Information: Y.Y. and M.K. are supported by grants from the Ministry of Education, Culture, Sports, Science and Technology of Japan, the Japan Society for the Promotion of Science and the Japan Science and Technology Corporation. J.P.V. acknowledges the support of NIH grant R33HG003070-01. The computational resource was provided by the Bioinformatics Center, Institute for Chemical Research, Kyoto University. This collaboration was also supported by the French–Japanese Sakura grant.
PY - 2005/6
Y1 - 2005/6
AB - Motivation: The metabolic network is an important biological network that relates enzyme proteins and chemical compounds. Many metabolic pathways remain unknown, and many enzymes are missing even in known metabolic pathways. There is, therefore, an incentive to develop methods to reconstruct the unknown parts of the metabolic network and to identify genes coding for missing enzymes. Results: This paper presents new methods to infer enzyme networks from the integration of multiple genomic data and chemical information, in the framework of supervised graph inference. The originality of the methods lies in the introduction of chemical compatibility as a constraint for refining the network predicted by the network inference engine. The chemical compatibility between two enzymes is obtained automatically from the information encoded by their Enzyme Commission (EC) numbers. The proposed methods are tested and compared on their ability to infer the enzyme network of the yeast Saccharomyces cerevisiae from four datasets for enzymes with assigned EC numbers: gene expression data, protein localization data, phylogenetic profiles and chemical compatibility information. It is shown that the prediction accuracy of the network reconstruction consistently improves owing to the introduction of chemical constraints, the use of a supervised approach and the weighted integration of multiple datasets. Finally, we conduct a comprehensive prediction of a global enzyme network consisting of all enzyme candidate proteins of the yeast to obtain new biological findings.
UR - http://www.scopus.com/inward/record.url?scp=29144446142&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=29144446142&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bti1012
DO - 10.1093/bioinformatics/bti1012
M3 - Article
C2 - 15961492
AN - SCOPUS:29144446142
SN - 1367-4803
VL - 21
SP - i468-i477
JO - Bioinformatics
JF - Bioinformatics
IS - SUPPL. 1
ER -