TY - GEN
T1 - Research Paper Search Using a Topic-Based Boolean Query Search and a General Query-Based Ranking Model
AU - Fukuda, Satoshi
AU - Tomiura, Yoichi
AU - Ishita, Emi
N1 - Funding Information:
Acknowledgements. This work was supported by JSPS KAKENHI Grant Number JP15H01721. We thank Stuart Jenkinson, PhD, from Edanz Group (www.edanzediting.com/ac) for editing a draft of this manuscript.
PY - 2019
Y1 - 2019
AB - When conducting a search for research papers, the search should return comprehensive results related to the user’s query. In general, a user inputs a Boolean query that reflects the information need, and the search engine ranks the research papers based on the query. However, it is difficult to anticipate all possible terms that authors of relevant papers might have used. Moreover, general query-based ranking methods emphasize how to rank the relevant documents at the top of the results, but require some means of guaranteeing the comprehensiveness of the results. Therefore, two ranking methods that consider the comprehensiveness of relevant papers are proposed. The first uses a topic-based Boolean query search. This search converts every word in the abstract set and query into a topic via topic analysis by Latent Dirichlet Allocation (LDA) and conducts a search at the topic level. The topic assigned to synonyms of a search term is expected to be the same as that assigned to the search term. Each paper is ranked based on the number of times it is matched with each topic-based Boolean query search executed for various LDA parameter settings. The second is a hybrid method that emphasizes better results from our topic-based ranking result and a general query-based ranking result. This method is based on the observation that the paper sets retrieved by our method and by a general ranking method will be different. Through experiments using the NTCIR-1 and -2 datasets, the effectiveness of our topic-based and hybrid methods is demonstrated.
UR - http://www.scopus.com/inward/record.url?scp=85077130439&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85077130439&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-27618-8_5
DO - 10.1007/978-3-030-27618-8_5
M3 - Conference contribution
AN - SCOPUS:85077130439
SN - 9783030276171
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 65
EP - 75
BT - Database and Expert Systems Applications - 30th International Conference, DEXA 2019, Proceedings
A2 - Hartmann, Sven
A2 - Küng, Josef
A2 - Anderst-Kotsis, Gabriele
A2 - Khalil, Ismail
A2 - Chakravarthy, Sharma
A2 - Tjoa, A Min
PB - Springer
T2 - 30th International Conference on Database and Expert Systems Applications, DEXA 2019
Y2 - 26 August 2019 through 29 August 2019
ER -