Introducing communication to joint policy search algorithm for networked distributed POMDPs

Makoto Tasaki, Yuichi Yabu, Makoto Yokoo, Pradeep Varakantham, Janusz Marecki, Milind Tambe

Research output: Contribution to journal › Article › peer-review


The Multiagent Partially Observable Markov Decision Process (Multiagent POMDP) is a popular approach for modeling multi-agent systems acting in uncertain domains. An existing approach (Search for Policies In Distributed EnviRonments, SPIDER) is guaranteed to obtain an optimal joint plan by exploiting the agent interaction structure. Using SPIDER, we can obtain an optimal joint policy for large-scale problems if the interaction among agents is sparse. However, the size of a local policy is still too large to obtain a policy whose length is more than 4. To overcome this problem, we extend SPIDER so that agents can communicate their observation and action histories to each other. After communication, agents restart from a new synchronized belief state, so the combinatorial explosion of local policies is avoided. Our experimental results show that we can obtain much longer policies, as long as the interval between communications is small.
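The combinatorial explosion mentioned in the abstract can be made concrete with a back-of-the-envelope count of deterministic policy trees. The sketch below is illustrative only, not a formula from the paper: it assumes a standard policy-tree representation in which a horizon-T tree has one decision node per observation history, and it shows how re-synchronizing beliefs through communication every k steps shrinks the per-segment search space.

```python
def policy_tree_count(num_actions: int, num_obs: int, horizon: int) -> int:
    """Number of deterministic policy trees of the given horizon.

    A horizon-T tree has (|O|^T - 1) / (|O| - 1) decision nodes
    (one per observation history shorter than T), and each node is
    assigned one of |A| actions.
    """
    nodes = (num_obs**horizon - 1) // (num_obs - 1)
    return num_actions**nodes

# Without communication: a single horizon-6 tree with 2 actions and
# 2 observations has 63 decision nodes, i.e. 2^63 candidate trees.
full = policy_tree_count(2, 2, 6)

# With communication every 3 steps, agents restart from a synchronized
# belief state, so each segment only needs horizon-3 trees:
# 7 decision nodes, i.e. 2^7 = 128 candidates per segment.
segment = policy_tree_count(2, 2, 3)

print(full, segment)
```

This ignores the cost of enumerating the belief states reachable at each communication point, so it overstates the savings, but it conveys why short communication intervals make much longer horizons feasible.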

Journal: Computer Software
Publication status: Published - 2008

All Science Journal Classification (ASJC) codes

  • Software

