The Multiagent Partially Observable Markov Decision Process (Multiagent POMDP) is a popular approach for modeling multi-agent systems acting in uncertain domains. An existing approach, SPIDER (Search for Policies In Distributed EnviRonments), is guaranteed to find an optimal joint plan by exploiting the agent interaction structure. Using SPIDER, we can obtain an optimal joint policy for large-scale problems if the interaction among agents is sparse. However, the size of a local policy is still too large to obtain a policy whose length is more than 4. To overcome this problem, we extend SPIDER so that agents can communicate their observation and action histories with each other. After communication, agents can start from a new synchronized belief state, so the combinatorial explosion of local policies is avoided. Our experimental results show that we can obtain much longer policies as long as the interval between communications is small.
Published - 2008