TY - JOUR
T1 - Threshold probability of non-terminal type in finite horizon Markov decision processes
AU - Kira, Akifumi
AU - Ueno, Takayuki
AU - Fujita, Toshiharu
N1 - Funding Information: The authors wish to thank Professor Hidefumi Kawasaki for his valuable advice regarding this investigation. We are also grateful to Professor Seiichi Iwamoto for his constant support. He introduced us to the study of the theory of dynamic programming. This research was supported in part by a Grant-in-aid for JSPS Fellows. We would like to thank the anonymous reviewer for useful comments and suggestions.
PY - 2012/2/1
Y1 - 2012/2/1
N2 - We consider a class of problems concerned with maximizing probabilities, given stage-wise targets, which generalizes the standard threshold probability problem in Markov decision processes. The objective function is the probability that, at all stages, the associatively combined accumulation of rewards earned up to that point takes its value in a specified stage-wise interval. It is shown that this class reduces to the case of the nonnegative-valued multiplicative criterion through an invariant imbedding technique. We derive a recursive formula for the optimal value function and an effective method for obtaining the optimal policies.
UR - http://www.scopus.com/inward/record.url?scp=80052842288&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80052842288&partnerID=8YFLogxK
U2 - 10.1016/j.jmaa.2011.08.006
DO - 10.1016/j.jmaa.2011.08.006
M3 - Article
AN - SCOPUS:80052842288
SN - 0022-247X
VL - 386
SP - 461
EP - 472
JO - Journal of Mathematical Analysis and Applications
JF - Journal of Mathematical Analysis and Applications
IS - 1
ER -