TY - GEN
T1 - Time-based sampling methods for detecting helpful reviews
AU - Saptono, Ristu
AU - Mine, Tsunenori
N1 - Funding Information:
This work is partly supported by JSPS KAKENHI Grant Numbers JP18K18656, JP19KK0257, JP20H04300 and JP20H01728.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/12
Y1 - 2020/12
AB - Product reviews describe customers' opinions of and experiences with products. Reviews that convey these opinions and experiences well attract and help people who want to buy the products; such reviews are called helpful reviews. Many studies have sought to detect helpful reviews and have proposed many useful factors, such as review-related, product-related, and reviewer-related factors. The elapsed time of a review has been used as a factor in detecting helpful reviews but has never been considered as a basis for sampling, even though it is essential for determining the freshness of a review, which influences people's interest in the product. In this paper, we propose time-based sampling methods that keep the sample size as small as possible while detecting helpful reviews with high accuracy. To investigate the effect of the time-based sampling methods, we conducted extensive experiments comparing them with total sampling and simple random sampling, using two machine learning methods, XGBoost and CNN, which involve text and numerical factors. Experimental results illustrate the validity of the proposed methods. Notably, on large datasets, the proposed sampling methods outperform the other sampling methods.
UR - http://www.scopus.com/inward/record.url?scp=85114401501&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85114401501&partnerID=8YFLogxK
U2 - 10.1109/WIIAT50758.2020.00076
DO - 10.1109/WIIAT50758.2020.00076
M3 - Conference contribution
AN - SCOPUS:85114401501
T3 - Proceedings - 2020 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, WI-IAT 2020
SP - 508
EP - 513
BT - Proceedings - 2020 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, WI-IAT 2020
A2 - He, Jing
A2 - Purohit, Hemant
A2 - Huang, Guangyan
A2 - Gao, Xiaoying
A2 - Deng, Ke
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, WI-IAT 2020
Y2 - 14 December 2020 through 17 December 2020
ER -