Markov decision processes associated with two threshold probability criteria

Authors: Masahiko SAKAGUCHI, Yoshio OHTSUBO

Affiliation: Department of Mathematics, Faculty of Science, Kochi University

Publication: Journal of Control Theory and Applications (《控制理论与应用(英文版)》)

Year, volume, issue: 2013, Vol. 11, No. 4

Pages: 548-557

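Subject classification: 02 [Economics]; 0202 [Economics - Applied Economics]; 020208 [Economics - Statistics]; 0808 [Engineering - Electrical Engineering]; 07 [Science]; 0802 [Engineering - Mechanical Engineering]; 0835 [Engineering - Software Engineering]; 0714 [Science - Statistics (degree may be conferred in science or economics)]; 070103 [Science - Probability Theory and Mathematical Statistics]; 0811 [Engineering - Control Science and Engineering]; 0701 [Science - Mathematics]; 0812 [Engineering - Computer Science and Technology (degree may be conferred in engineering or science)]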

Keywords: Markov decision process; Minimizing risk model; Threshold probability; Policy space iteration

Abstract: This paper deals with Markov decision processes with a target set and nonpositive rewards. Two types of threshold probability criteria are discussed. The first criterion is the probability that the total reward is not greater than a given initial threshold value, and the second is the probability that the total reward is less than that value. Our first (resp. second) optimization problem is to minimize the first (resp. second) threshold probability. In these problems, the threshold value is interpreted as a permissible level of the total reward for reaching a goal (the target set); that is, we would like to reach this set with a total reward above the level, if possible. For both problems, we 1) show that the optimal threshold probability is a unique solution to an optimality equation, 2) show that there exists an optimal deterministic stationary policy, and 3) give a value iteration and a policy space iteration. In addition, we prove that the first (resp. second) optimal threshold probability is a monotone increasing and right (resp. left) continuous function of the initial threshold value, and we propose a method to obtain an optimal policy and the optimal threshold probability in the first problem by using those of the second problem.
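
To make the two criteria concrete, here is a minimal Python sketch for a toy model. It is not the paper's algorithm: this record does not reproduce the optimality equation, the value iteration, or the policy space iteration, so the recursion below is only an illustration under extra assumptions. Those assumptions are that rewards are integers, that every transition out of a non-target state has reward at most -1, and that the total reward is accumulated until the target set is reached. Under them, the minimal threshold probability can be computed by a terminating recursion on the pair (state, remaining threshold). The names `min_threshold_probability`, `MODEL`, `TARGETS`, and the actions `"safe"` and `"risky"` are hypothetical.

```python
from functools import lru_cache


def min_threshold_probability(transitions, targets, state, threshold, strict=False):
    """Return (minimal probability, minimizing first action) for the event
    total reward <= threshold (strict=False) or total reward < threshold (strict=True).

    transitions[s][a] is a list of (probability, reward, next_state) triples.
    Assumption of this sketch: rewards are integers <= -1 in non-target states.
    """
    for actions in transitions.values():
        for outcomes in actions.values():
            assert all(isinstance(r, int) and r <= -1 for _, r, _ in outcomes)

    def certain(x):
        # The remaining total reward is never positive, so the event is sure
        # once the remaining threshold x satisfies 0 <= x (or 0 < x if strict).
        return 0 < x if strict else 0 <= x

    @lru_cache(maxsize=None)
    def value(s, x):
        if certain(x):
            return 1.0, None
        if s in targets:
            # The goal is reached with remaining reward 0, which misses the event.
            return 0.0, None
        best = None
        for action, outcomes in transitions[s].items():
            # Receiving reward r moves the remaining threshold from x to x - r.
            prob = sum(p * value(nxt, x - r)[0] for p, r, nxt in outcomes)
            if best is None or prob < best[0]:
                best = (prob, action)
        return best

    return value(state, threshold)


# Toy model: from "s", action "safe" reaches the goal "g" at reward -2;
# action "risky" costs -1 and reaches "g" only with probability 0.5.
MODEL = {
    "s": {
        "safe": [(1.0, -2, "g")],
        "risky": [(0.5, -1, "g"), (0.5, -1, "s")],
    }
}
TARGETS = frozenset({"g"})

for x in (-3, -2, -1):
    print(x, min_threshold_probability(MODEL, TARGETS, "s", x))
```

In this toy model the minimal probability for the first criterion is 0.0, 0.5, and 1.0 at thresholds -3, -2, and -1, which is consistent with the monotonicity stated in the abstract. The minimizing action also changes with the threshold (`"safe"` at -3, `"risky"` at -2), which is why the sketch tracks the remaining threshold alongside the state.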
