TY - JOUR
T1 - A policy representation using weighted multiple normal distribution: Real-time reinforcement learning feasible for varying optimal actions
AU - Kimura, Hajime
AU - Aramaki, Takeshi
AU - Kobayashi, Shigenobu
PY - 2003
Y1 - 2003
N2 - In this paper, we attempt to solve a reinforcement learning problem for a 5-linked ring robot in real time, so that the real robot can withstand the trial and error involved. On this robot, incomplete perception problems are caused by noisy sensors and cheap position-control motor systems. This incomplete perception also causes the optimal actions to vary as learning progresses. To cope with this problem, we adopt an actor-critic method and propose a new hierarchical policy representation scheme that consists of discrete action selection at the top level and continuous action selection at the lower level of the hierarchy. The proposed hierarchical scheme accelerates learning in a continuous action space, and it can track the optimal actions as they vary with the progress of learning in our robotics problem. This paper compares and discusses several learning algorithms through simulations and demonstrates the proposed method by applying it to the real robot.
AB - In this paper, we attempt to solve a reinforcement learning problem for a 5-linked ring robot in real time, so that the real robot can withstand the trial and error involved. On this robot, incomplete perception problems are caused by noisy sensors and cheap position-control motor systems. This incomplete perception also causes the optimal actions to vary as learning progresses. To cope with this problem, we adopt an actor-critic method and propose a new hierarchical policy representation scheme that consists of discrete action selection at the top level and continuous action selection at the lower level of the hierarchy. The proposed hierarchical scheme accelerates learning in a continuous action space, and it can track the optimal actions as they vary with the progress of learning in our robotics problem. This paper compares and discusses several learning algorithms through simulations and demonstrates the proposed method by applying it to the real robot.
UR - http://www.scopus.com/inward/record.url?scp=18444390265&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=18444390265&partnerID=8YFLogxK
U2 - 10.1527/tjsai.18.316
DO - 10.1527/tjsai.18.316
M3 - Article
AN - SCOPUS:18444390265
SN - 1346-0714
VL - 18
SP - 316
EP - 324
JO - Transactions of the Japanese Society for Artificial Intelligence
JF - Transactions of the Japanese Society for Artificial Intelligence
IS - 6
ER -