An Online Q-Learning Method for Linear-Quadratic Nonzero-Sum Stochastic Differential Games with Completely Unknown Dynamics

Authors: ZHANG Bao-Qiang, WANG Bing-Chang, CAO Ying

Affiliation: School of Control Science and Engineering, Shandong University

Publication: Journal of Systems Science & Complexity (系统科学与复杂性学报, English edition)

Year, volume, issue: 2024, Vol. 37, No. 5

Pages: 1907-1922


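Subject classification: 12 [Management]; 1201 [Management: Management Science and Engineering (degrees in management or engineering may be conferred)]; 07 [Science]; 081104 [Engineering: Pattern Recognition and Intelligent Systems]; 08 [Engineering]; 070105 [Science: Operations Research and Cybernetics]; 0835 [Engineering: Software Engineering]; 0701 [Science: Mathematics]; 0811 [Engineering: Control Science and Engineering]; 0812 [Engineering: Computer Science and Technology (degrees in engineering or science may be conferred)]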

Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 62122043 and 62192753, in part by the Natural Science Foundation of Shandong Province for Distinguished Young Scholars under Grant No. ZR2022JQ31, and in part by the Innovative Research Groups of the National Natural Science Foundation of China under Grant No. 61821004.

Abstract: In this paper, the authors design a reinforcement learning algorithm to solve the adaptive linear-quadratic stochastic n-player nonzero-sum differential game with completely unknown dynamics. For each player, a critic network is used to estimate the Q-function, and an actor network is used to estimate the control input. A model-free online Q-learning algorithm is obtained for solving this kind of problem. It is proved that, under some mild conditions, the system state and the weight estimation errors are uniformly ultimately bounded. A simulation with five players is given to verify the effectiveness of the algorithm.
