TY - GEN
T1 - Competitive physical interaction by reinforcement learning agents using intention estimation
AU - Noda, Hiroki
AU - Nishikawa, Satoshi
AU - Niiyama, Ryuma
AU - Kuniyoshi, Yasuo
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/8/8
Y1 - 2021/8/8
AB - The physical human-robot interaction (pHRI) research field is expected to contribute to competitive and cooperative human-robot tasks that involve force interactions. However, compared with human-human interactions, current pHRI approaches lack tactical considerations: they neither estimate intentions from human behavior nor select policies suited to an opponent's changing policy. We therefore propose a reinforcement learning model that estimates the opponent's changing policy from time-series observations and expresses the agent's policy in a common latent space, drawing on descriptions of tactics in open-skill sports. We verify the performance of the reinforcement learning agent in two novel physical, competitive environments: a push-hand game and air hockey. We confirm that the latent space properly encodes policy information, because the latent variables representing the machine agent's own policy and the opponent's policy each affect the agent's behavior. Together, the two latent variables clearly express how the agent estimates the opponent's policy and decides its own.
UR - http://www.scopus.com/inward/record.url?scp=85115060714&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115060714&partnerID=8YFLogxK
U2 - 10.1109/RO-MAN50785.2021.9515411
DO - 10.1109/RO-MAN50785.2021.9515411
M3 - Conference contribution
AN - SCOPUS:85115060714
T3 - 2021 30th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2021
SP - 649
EP - 656
BT - 2021 30th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 30th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2021
Y2 - 8 August 2021 through 12 August 2021
ER -