TY - GEN
T1 - Learning Cross-Modal Factors from Multimodal Physiological Signals for Emotion Recognition
AU - Ishikawa, Yuichi
AU - Kobayashi, Nao
AU - Naruse, Yasushi
AU - Nakamura, Yugo
AU - Ishida, Shigemi
AU - Mine, Tsunenori
AU - Arakawa, Yutaka
N1 - Publisher Copyright:
© 2024, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
PY - 2024
Y1 - 2024
N2 - Understanding user emotion is essential for Human-AI Interaction (HAI). Thus far, many approaches have been studied to recognize emotion from signals of various physiological modalities such as cardiac activity and skin conductance. However, little attention has been paid to the fact that physiological signals are influenced by and reflect various factors that have little or no association with emotion. While emotion is a cross-modal factor that triggers responses across multiple physiological modalities, features used in existing approaches also reflect modality-specific factors that affect only a single modality and have little association with emotion. To address this, we propose an approach to extract features that exclusively reflect cross-modal factors from multimodal physiological signals. Our approach introduces a multilayer RNN with two types of layers: multiple Modality-Specific Layers (MSLs) for modeling physiological activity in individual modalities and a single Cross-Modal Layer (CML) for modeling the process by which emotion affects physiological activity. By having all MSLs update their hidden states using the CML hidden states, our RNN causes the CML to learn cross-modal factors. Using real physiological signals, we confirmed that the features extracted by our RNN reflected emotions to a significantly greater extent than the features of existing approaches.
AB - Understanding user emotion is essential for Human-AI Interaction (HAI). Thus far, many approaches have been studied to recognize emotion from signals of various physiological modalities such as cardiac activity and skin conductance. However, little attention has been paid to the fact that physiological signals are influenced by and reflect various factors that have little or no association with emotion. While emotion is a cross-modal factor that triggers responses across multiple physiological modalities, features used in existing approaches also reflect modality-specific factors that affect only a single modality and have little association with emotion. To address this, we propose an approach to extract features that exclusively reflect cross-modal factors from multimodal physiological signals. Our approach introduces a multilayer RNN with two types of layers: multiple Modality-Specific Layers (MSLs) for modeling physiological activity in individual modalities and a single Cross-Modal Layer (CML) for modeling the process by which emotion affects physiological activity. By having all MSLs update their hidden states using the CML hidden states, our RNN causes the CML to learn cross-modal factors. Using real physiological signals, we confirmed that the features extracted by our RNN reflected emotions to a significantly greater extent than the features of existing approaches.
UR - http://www.scopus.com/inward/record.url?scp=85177225052&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85177225052&partnerID=8YFLogxK
U2 - 10.1007/978-981-99-7019-3_40
DO - 10.1007/978-981-99-7019-3_40
M3 - Conference contribution
AN - SCOPUS:85177225052
SN - 9789819970186
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 438
EP - 450
BT - PRICAI 2023: Trends in Artificial Intelligence
A2 - Liu, Fenrong
A2 - Sadanandan, Arun Anand
A2 - Pham, Duc Nghia
A2 - Mursanto, Petrus
A2 - Lukose, Dickson
PB - Springer Science and Business Media Deutschland GmbH
T2 - 20th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2023
Y2 - 15 November 2023 through 19 November 2023
ER -
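
The abstract above describes a multilayer RNN in which Modality-Specific Layers (MSLs) model individual physiological modalities and a single Cross-Modal Layer (CML) captures factors shared across modalities, with every MSL conditioning its hidden-state update on the CML state. The sketch below is a minimal PyTorch illustration of that coupling, not the authors' implementation; the class name CrossModalRNN, the choice of GRU cells, and all dimensions are assumptions made for this example.

# Hypothetical sketch (assumed details, not the paper's code): each MSL updates
# from its own modality's signal concatenated with the shared CML hidden state,
# and the CML updates from the concatenated MSL hidden states.
import torch
import torch.nn as nn

class CrossModalRNN(nn.Module):
    def __init__(self, input_dims, msl_dim=32, cml_dim=32):
        super().__init__()
        # One GRU cell per physiological modality (e.g. cardiac activity, skin conductance).
        self.msls = nn.ModuleList(
            [nn.GRUCell(d + cml_dim, msl_dim) for d in input_dims]
        )
        # A single cross-modal cell that aggregates all MSL hidden states.
        self.cml = nn.GRUCell(msl_dim * len(input_dims), cml_dim)
        self.msl_dim, self.cml_dim = msl_dim, cml_dim

    def forward(self, signals):
        # signals: list of tensors, one per modality, each of shape (batch, time, dim)
        batch, steps = signals[0].shape[0], signals[0].shape[1]
        h_msl = [s.new_zeros(batch, self.msl_dim) for s in signals]
        h_cml = signals[0].new_zeros(batch, self.cml_dim)
        for t in range(steps):
            # Every MSL update is conditioned on the shared CML state.
            h_msl = [
                cell(torch.cat([sig[:, t], h_cml], dim=-1), h)
                for cell, sig, h in zip(self.msls, signals, h_msl)
            ]
            # The CML sees only the MSL states, so it is pushed toward factors
            # common to all modalities.
            h_cml = self.cml(torch.cat(h_msl, dim=-1), h_cml)
        # The final CML hidden state serves as the cross-modal feature vector.
        return h_cml

# Example usage with two synthetic modalities (dimensions are illustrative):
#   model = CrossModalRNN(input_dims=[4, 2])
#   x = [torch.randn(8, 100, 4), torch.randn(8, 100, 2)]
#   features = model(x)   # shape (8, 32)

Feeding the shared CML state back into every MSL update, while constructing the CML state only from the MSL states, is what encourages the CML to retain factors common to all modalities (such as emotion) and leaves modality-specific variation to the MSLs, which is the mechanism the abstract attributes to the proposed architecture.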