TY - GEN
T1 - Estimation of Mental Health Quality of Life using Visual Information during Interaction with a Communication Agent
AU - Nakagawa, Satoshi
AU - Yonekura, Shogo
AU - Kanazawa, Hoshinori
AU - Nishikawa, Satoshi
AU - Kuniyoshi, Yasuo
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/8
Y1 - 2020/8
N2 - It is essential for a monitoring system or a communication robot that interacts with an elderly person to accurately understand the user's state and generate actions based on their condition. To ensure elderly welfare, quality of life (QOL) is a useful indicator for comprehensively assessing a person's physical suffering and mental and social activities. In this study, we hypothesize that visual information is useful for extracting high-dimensional information on QOL from data collected by an agent while interacting with a person. We propose a QOL estimation method that integrates facial expressions, head fluctuations, and eye movements, all of which can be extracted as visual information during interaction with a communication agent. Our goal is to implement a multiple-feature-vector learning estimator that incorporates a convolutional 3D (C3D) network to learn spatiotemporal features. However, no database suitable for QOL estimation exists. Therefore, we implement a free communication agent and construct our own database from information collected through interpersonal experiments using the agent. To verify the proposed method, we focus on estimating the mental health QOL scale, which a previous study found to be the most difficult to estimate among the eight scales that compose QOL. We compare four estimation approaches: single-modal learning using each of the three features (facial expressions, head fluctuations, and eye movements) and multiple-feature-vector learning integrating all three. The experimental results show that multiple-feature-vector learning yields smaller estimation errors than any of the single-modal approaches, each of which uses one feature separately. A comparison between QOL scores estimated by the proposed method and actual QOL scores calculated by the conventional method further shows an average error of less than 10 points, demonstrating that the proposed system can estimate QOL scores. This new approach to estimating human conditions can thus improve the quality of human-robot interaction and personalized monitoring.
AB - It is essential for a monitoring system or a communication robot that interacts with an elderly person to accurately understand the user's state and generate actions based on their condition. To ensure elderly welfare, quality of life (QOL) is a useful indicator for comprehensively assessing a person's physical suffering and mental and social activities. In this study, we hypothesize that visual information is useful for extracting high-dimensional information on QOL from data collected by an agent while interacting with a person. We propose a QOL estimation method that integrates facial expressions, head fluctuations, and eye movements, all of which can be extracted as visual information during interaction with a communication agent. Our goal is to implement a multiple-feature-vector learning estimator that incorporates a convolutional 3D (C3D) network to learn spatiotemporal features. However, no database suitable for QOL estimation exists. Therefore, we implement a free communication agent and construct our own database from information collected through interpersonal experiments using the agent. To verify the proposed method, we focus on estimating the mental health QOL scale, which a previous study found to be the most difficult to estimate among the eight scales that compose QOL. We compare four estimation approaches: single-modal learning using each of the three features (facial expressions, head fluctuations, and eye movements) and multiple-feature-vector learning integrating all three. The experimental results show that multiple-feature-vector learning yields smaller estimation errors than any of the single-modal approaches, each of which uses one feature separately. A comparison between QOL scores estimated by the proposed method and actual QOL scores calculated by the conventional method further shows an average error of less than 10 points, demonstrating that the proposed system can estimate QOL scores. This new approach to estimating human conditions can thus improve the quality of human-robot interaction and personalized monitoring.
UR - http://www.scopus.com/inward/record.url?scp=85095745538&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85095745538&partnerID=8YFLogxK
U2 - 10.1109/RO-MAN47096.2020.9223606
DO - 10.1109/RO-MAN47096.2020.9223606
M3 - Conference contribution
AN - SCOPUS:85095745538
T3 - 29th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2020
SP - 1321
EP - 1327
BT - 29th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 29th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2020
Y2 - 31 August 2020 through 4 September 2020
ER -