TY - GEN
T1 - A dynamic-static approach of model fusion for document similarity computation
AU - Li, Jiyi
AU - Asano, Yasuhito
AU - Shimizu, Toshiyuki
AU - Yoshikawa, Masatoshi
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2015.
PY - 2015
Y1 - 2015
N2 - The semantic similarity of text document pairs can be used for valuable applications. Various basic models have been proposed for representing document content and computing document similarity. Each basic model performs differently in different scenarios. Existing model selection or fusion approaches generate improved models based on these basic models at the granularity of the document collection. These improved models are static for all document pairs and may be suitable for only some of them. We propose a dynamic approach to model fusion, based on a Dynamic-Static Fusion Model (DSFM) that operates at the granularity of document pairs and is therefore dynamic for each document pair. The dynamic module in DSFM learns to rank the basic models to predict the best basic model for a given document pair. We propose a model categorization method to construct ideal model labels of document pairs for learning in this dynamic module. The static module in DSFM is based on linear regression. We also propose a model selection method to select appropriate candidate basic models for fusion and improve performance. Experiments on public document collections that contain paragraph pairs and sentence pairs with human-rated similarity illustrate the effectiveness of our approach.
UR - http://www.scopus.com/inward/record.url?scp=84949984497&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84949984497&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-26190-4_24
DO - 10.1007/978-3-319-26190-4_24
M3 - Conference contribution
AN - SCOPUS:84949984497
SN - 9783319261898
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 353
EP - 368
BT - Web Information Systems Engineering – WISE 2015 - 16th International Conference, Proceedings
A2 - Chen, Shu-Ching
A2 - Li, Tao
A2 - Wang, Hua
A2 - Zhang, Yanchun
A2 - Cellary, Wojciech
A2 - Wang, Dingding
A2 - Wang, Jianyong
PB - Springer Verlag
T2 - 16th International Conference on Web Information Systems Engineering, WISE 2015
Y2 - 1 November 2015 through 3 November 2015
ER -