A Proposal of Adaptive PID Controller Based on Reinforcement Learning

Authors: WANG Xue-song, CHENG Yu-hu, SUN Wei

Affiliation: School of Information and Electrical Engineering, China University of Mining & Technology, Xuzhou, Jiangsu 221008, China

Publication: Journal of China University of Mining and Technology (中国矿业大学学报(英文版), English edition)

Year, volume, issue: 2007, Vol. 17, No. 1

Pages: 40-44

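Subject classification: 12 [Management]; 1201 [Management: Management Science and Engineering (degrees in management or engineering may be conferred)]; 081104 [Engineering: Pattern Recognition and Intelligent Systems]; 08 [Engineering]; 0835 [Engineering: Software Engineering]; 0811 [Engineering: Control Science and Engineering]; 0812 [Engineering: Computer Science and Technology (degrees in engineering or science may be conferred)]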

Funding: Project 0601033B supported by the Science Foundation for Post-doctoral Scientists of Jiangsu Province; Projects 0C4466 and 0C060093 supported by the Scientific and Technological Foundation for Youth of China University of Mining & Technology

Keywords: reinforcement learning; Actor-Critic learning; adaptive PID control; RBF network

Abstract: To address the lack of parameter self-tuning in conventional PID controllers, the structure and learning algorithm of an adaptive PID controller based on reinforcement learning were proposed. Actor-Critic learning was used to tune the PID parameters adaptively, exploiting the model-free and on-line learning properties of reinforcement learning. To reduce storage requirements and improve learning efficiency, a single RBF neural network was used to approximate both the policy function of the Actor and the value function of the Critic. The inputs of the RBF network are the system error and its first- and second-order differences. The Actor maps the system state to the PID parameters, while the Critic evaluates the outputs of the Actor and produces the TD error. Updating rules for the RBF kernel functions and network weights were derived from a TD-error performance index using the gradient descent method. Simulation results show that the proposed controller is effective for complex nonlinear systems and is more adaptable and more robust than a conventional PID controller.
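The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class and function names, the Gaussian kernel form, the incremental PID law, and the discount factor `gamma` are all assumptions made for illustration, and the gradient-descent updates of the kernels and weights are omitted because the abstract does not give them.

```python
import math


def state_vector(e, e_prev1, e_prev2):
    """Build the network input: error, first difference, second difference."""
    return [e, e - e_prev1, e - 2.0 * e_prev1 + e_prev2]


class SharedRBFActorCritic:
    """One RBF hidden layer shared by the Actor and Critic output heads (hypothetical layout)."""

    def __init__(self, centers, widths, actor_weights, critic_weights):
        self.centers = centers                # one center vector per hidden unit
        self.widths = widths                  # one width per hidden unit
        self.actor_weights = actor_weights    # 3 rows (Kp, Ki, Kd), one weight per hidden unit
        self.critic_weights = critic_weights  # one weight per hidden unit

    def hidden(self, x):
        # Gaussian kernel assumed: exp(-||x - c||^2 / (2 * sigma^2))
        out = []
        for c, s in zip(self.centers, self.widths):
            dist2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
            out.append(math.exp(-dist2 / (2.0 * s * s)))
        return out

    def forward(self, x):
        phi = self.hidden(x)
        # Actor head: PID gains as weighted sums of the shared hidden outputs
        gains = [sum(w * p for w, p in zip(row, phi)) for row in self.actor_weights]
        # Critic head: scalar state value from the same hidden outputs
        value = sum(v * p for v, p in zip(self.critic_weights, phi))
        return gains, value


def td_error(reward, value, next_value, gamma=0.95):
    """Standard one-step TD error; gamma is an assumed discount factor."""
    return reward + gamma * next_value - value


def incremental_pid(u_prev, gains, x):
    """Incremental PID law (assumed form): u = u_prev + Ki*e + Kp*de + Kd*d2e."""
    kp, ki, kd = gains
    e, de, d2e = x
    return u_prev + ki * e + kp * de + kd * d2e
```

Sharing the hidden layer means the kernel centers and widths are stored once and evaluated once per step, which is the storage and efficiency saving the abstract refers to; only the two sets of output weights differ between the Actor and the Critic.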
