Derivative-free reinforcement learning: a review

Authors: Hong QIAN; Yang YU

Affiliation: National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China

Publication: Frontiers of Computer Science (English edition)

Year, volume, issue: 2021, Vol. 15, No. 6

Pages: 75-93


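Subject classification: 0810 [Engineering: Information and Communication Engineering]; 12 [Management]; 1201 [Management: Management Science and Engineering (degrees conferrable in Management or Engineering)]; 0808 [Engineering: Electrical Engineering]; 081104 [Engineering: Pattern Recognition and Intelligent Systems]; 08 [Engineering]; 0835 [Engineering: Software Engineering]; 0701 [Science: Mathematics]; 0811 [Engineering: Control Science and Engineering]; 0812 [Engineering: Computer Science and Technology (degrees conferrable in Engineering or Science)]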

Funding: This work was supported by the Program A for Outstanding PhD Candidate of Nanjing University, the National Science Foundation of China (61876077), the Jiangsu Science Foundation (BK20170013), and the Collaborative Innovation Center of Novel Software Technology and Industrialization.

Keywords: reinforcement learning; derivative-free optimization; neuroevolution reinforcement learning; neural architecture search

Abstract: Reinforcement learning is about learning agent models that make the best sequential decisions in unknown *** an unknown environment, the agent needs to explore the environment while exploiting the collected information, which usually forms a sophisticated problem to ***-free optimization, meanwhile, is capable of solving sophisticated *** commonly uses a sampling-and-updating framework to iteratively improve the solution, where exploration and exploitation are also needed to be well ***, derivative-free optimization deals with a similar core issue as reinforcement learning, and has been introduced in reinforcement learning approaches, under the names of learning classifier systems and neuroevolution/evolutionary reinforcement *** such methods have been developed for decades, recently, derivative-free reinforcement learning exhibits attracting increasing ***, recent survey on this topic is still *** this article, we summarize methods of derivative-free reinforcement learning to date, and organize the methods in aspects including parameter updating, model selection, exploration, and parallel/distributed ***, we discuss some current limitations and possible future directions, hoping that this article could bring more attentions to this topic and serve as a catalyst for developing novel and efficient approaches.
