TY - GEN

T1 - Discovery of tree structured patterns using Markov chain Monte Carlo method

AU - Okamoto, Yasuhiro

AU - Koyanagi, Kensuke

AU - Shoudai, Takayoshi

AU - Maruyama, Osamu

N1 - Publisher Copyright:
© 2014 IADIS.

PY - 2014

Y1 - 2014

N2 - A tree contraction pattern (TC-pattern) is an unordered tree-structured pattern which can express a tree-structure common to given unordered trees. A TC-pattern has some special vertices, called contractible vertex, into which every uncommon connected substructure is merged by edge contractions. In this paper, we propose a probabilistic method for computing a binary classification problem on tree-structured data. Given a positive set P and a negative set N of unordered trees with vertex labels on a finite alphabet, the problem is to find meaningful and optimal TC-patterns that classify P and N with high statistical measures. We formalize this problem as a multiple optimization problem, and propose a probabilistic method for computing it by employing enumeration algorithms for TC-patterns and Markov chain Monte Carlo method. In addition, as a theoretical aspect of this problem, we show the hardness of approximability of it. Finally, we show the experimental results of our method on glycan structure data.

AB - A tree contraction pattern (TC-pattern) is an unordered tree-structured pattern which can express a tree-structure common to given unordered trees. A TC-pattern has some special vertices, called contractible vertex, into which every uncommon connected substructure is merged by edge contractions. In this paper, we propose a probabilistic method for computing a binary classification problem on tree-structured data. Given a positive set P and a negative set N of unordered trees with vertex labels on a finite alphabet, the problem is to find meaningful and optimal TC-patterns that classify P and N with high statistical measures. We formalize this problem as a multiple optimization problem, and propose a probabilistic method for computing it by employing enumeration algorithms for TC-patterns and Markov chain Monte Carlo method. In addition, as a theoretical aspect of this problem, we show the hardness of approximability of it. Finally, we show the experimental results of our method on glycan structure data.

UR - http://www.scopus.com/inward/record.url?scp=84944051162&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84944051162&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84944051162

T3 - Proceedings of the 7th IADIS International Conference Information Systems 2014, IS 2014

SP - 95

EP - 102

BT - Proceedings of the 7th IADIS International Conference Information Systems 2014, IS 2014

A2 - Nunes, Miguel Baptista

A2 - Rodrigues, Luis

A2 - Powell, Philip

A2 - Isaias, Pedro

PB - IADIS

T2 - 7th IADIS International Conference on Information Systems, IS 2014

Y2 - 28 February 2014 through 2 March 2014

ER -