Convergence analysis of an incremental approach to online inverse reinforcement learning
Author affiliation: School of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Publication: Journal of Zhejiang University-Science C (Computers and Electronics)
Year/Volume/Issue: 2011, Vol. 12, No. 1
Pages: 17-24
Subject classification: 0810 [Engineering - Information and Communication Engineering]; 12 [Management]; 1201 [Management - Management Science and Engineering]; 081104 [Engineering - Pattern Recognition and Intelligent Systems]; 08 [Engineering]; 0805 [Engineering - Materials Science and Engineering]; 0835 [Engineering - Software Engineering]; 0811 [Engineering - Control Science and Engineering]; 0812 [Engineering - Computer Science and Technology]
Funding: Project (No. 90820306) supported by the National Natural Science Foundation of China
Keywords: Incremental approach; Reward recovering; Online learning; Inverse reinforcement learning; Markov decision process
Abstract: Interest in inverse reinforcement learning (IRL) has recently increased, that is, interest in the problem of recovering the reward function underlying a Markov decision process (MDP) given the dynamics of the system and the behavior of an expert. This paper deals with an incremental approach to online IRL. First, the convergence property of the incremental method for the IRL problem was investigated, and bounds on both the number of mistakes made during the learning process and the regret were established through a detailed proof. Then an online algorithm based on incremental error correction was derived to deal with the IRL problem. The key idea is to add an increment to the current reward estimate each time an action mismatch occurs, which drives the estimate toward a target optimal reward. The proposed method was tested in a driving simulation experiment and was found to efficiently recover an adequate reward function.
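The key idea stated in the abstract, adding an increment to the reward estimate whenever the greedy action under the current estimate mismatches the expert's action, can be sketched as a perceptron-style update. The sketch below is an illustrative toy, not the authors' exact formulation: the linear feature map `phi(s, a)`, the synthetic expert generated from hidden "true" weights, and all parameter values are assumptions introduced for the example.

```python
import numpy as np

# Toy setup (assumed, for illustration): reward modeled as r(s, a) = w . phi(s, a);
# the expert acts greedily with respect to hidden "true" weights w_true.
rng = np.random.default_rng(0)
N_STATES, N_ACTIONS, N_FEATS = 20, 4, 6
phi = rng.normal(size=(N_STATES, N_ACTIONS, N_FEATS))  # feature map phi(s, a)
w_true = rng.normal(size=N_FEATS)                      # hidden reward weights
expert_action = (phi @ w_true).argmax(axis=1)          # expert's greedy policy


def incremental_irl(phi, expert_action, eta=0.5, epochs=1000):
    """Incremental error correction of the reward estimate (perceptron-style).

    Each time the greedy action under the current estimate w mismatches the
    expert's action, add an increment that moves w toward the expert's choice.
    Returns the final weight estimate and the total mistake count.
    """
    w = np.zeros(phi.shape[2])
    mistakes = 0
    for _ in range(epochs):
        errors_this_pass = 0
        for s in range(phi.shape[0]):
            a_hat = (phi[s] @ w).argmax()   # greedy action under current estimate
            a_star = expert_action[s]       # expert's demonstrated action
            if a_hat != a_star:             # action mismatch observed
                # incremental correction toward the expert's action features
                w += eta * (phi[s, a_star] - phi[s, a_hat])
                errors_this_pass += 1
                mistakes += 1
        if errors_this_pass == 0:           # estimate now consistent with expert
            break
    return w, mistakes


w_est, total_mistakes = incremental_irl(phi, expert_action)
learned_action = (phi @ w_est).argmax(axis=1)
print("total mistakes:", total_mistakes)
print("policy matches expert:", bool((learned_action == expert_action).all()))
```

On linearly realizable demonstrations like this toy, the classical perceptron mistake bound caps the total number of corrections, which mirrors the mistake-number bound the paper proves for its incremental method.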