TY - GEN
T1 - A keyword recommendation method using CorKeD words and its application to earth science data
AU - Ishida, Youichi
AU - Shimizu, Toshiyuki
AU - Yoshikawa, Masatoshi
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2015.
PY - 2015
Y1 - 2015
AB - In various research domains, data providers themselves annotate their own data with keywords from a controlled vocabulary. However, since selecting keywords requires extensive knowledge of the domain and the controlled vocabulary, even data providers have difficulty in selecting appropriate keywords from the vocabulary. Therefore, we propose a method for recommending relevant keywords in a controlled vocabulary to data providers. We focus on a keyword definition, and calculate the similarity between an abstract text of data and the keyword definition. Moreover, considering that there are unnecessary words in the calculation, we extract CorKeD (Corpus-based Keyword Decisive) words from a target domain corpus so that we can measure the similarity appropriately. We conduct an experiment on earth science data, and verify the effectiveness of extracting the CorKeD words, which are the terms that better characterize the domain.
UR - http://www.scopus.com/inward/record.url?scp=84958035378&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84958035378&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-28940-3_8
DO - 10.1007/978-3-319-28940-3_8
M3 - Conference contribution
AN - SCOPUS:84958035378
SN - 9783319289397
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 96
EP - 108
BT - Information Retrieval Technology - 11th Asia Information Retrieval Societies Conference, AIRS 2015, Proceedings
A2 - Scholer, Falk
A2 - Zuccon, Guido
A2 - Geva, Shlomo
A2 - Sun, Aixin
A2 - Joho, Hideo
A2 - Zhang, Peng
PB - Springer Verlag
T2 - 11th Asia Information Retrieval Societies Conference, AIRS 2015
Y2 - 2 December 2015 through 4 December 2015
ER -