Multi-agent differential game based cooperative synchronization control using a data-driven method
Author affiliations: School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China; Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
Published in: Frontiers of Information Technology & Electronic Engineering
Year/Volume/Issue: 2022, Vol. 23, No. 7
Pages: 1043-1056
Subject classification: 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0802 [Engineering - Mechanical Engineering]; 080201 [Engineering - Mechanical Manufacturing and Automation]
Funding: Project supported by the Science and Technology Innovation 2030, China (No. 2020AAA0108200); the National Natural Science Foundation of China (Nos. 61873011, 61973013, 61922008, and 61803014); the Defense Industrial Technology Development Program, China (No. JCKY2019601C106); the Innovation Zone Project, China (No. 18-163-00-TS-001-001-34); the Foundation Strengthening Program Technology Field Fund, China (No. 2019-JCJQ-JJ-243); and the Fund from the Key Laboratory of Dependable Service Computing in Cyber Physical Society, China (No. CPSDSC202001)
Keywords: Multi-agent system; Differential game; Synchronization control; Data-driven; Reinforcement learning
Abstract: This paper studies the multi-agent differential game problem and its application to cooperative synchronization control. A systematized formulation and analysis method for the multi-agent differential game is proposed, and a data-driven methodology based on the reinforcement learning (RL) technique is developed. First, it is pointed out that typical distributed controllers may not necessarily lead to a global Nash equilibrium of the differential game in general cases because of the coupling of networked interactions. To this end, an alternative local Nash solution is derived by defining the best response concept, while the problem is decomposed into local differential games. An off-policy RL algorithm using neighboring interactive data is constructed to update the controller without requiring a system model, while the stability and robustness properties are analyzed. Moreover, to further tackle the dilemma, another differential game configuration is investigated based on modified coupling index terms. This distributed solution can achieve a global Nash equilibrium, in contrast to the previous case, while guaranteeing stability. An equivalent parallel RL method is constructed corresponding to this Nash solution. Finally, the effectiveness of the learning process and the stability of synchronization control are illustrated in simulation results.
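To make the synchronization-control setting concrete, the following is a minimal sketch, not the paper's algorithm: discrete-time consensus of single-integrator agents over a fixed communication graph, where each agent updates using only neighboring interaction data. The adjacency matrix, gain `eps`, and step count are hypothetical choices for illustration.

```python
import numpy as np

# Hypothetical ring graph of 4 agents (adjacency matrix A).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

def synchronize(x0, steps=200, eps=0.2):
    """Apply the distributed protocol u_i = -eps * sum_j a_ij (x_i - x_j).

    Equivalent to x_{k+1} = (I - eps*L) x_k with graph Laplacian L = D - A,
    so agent states converge toward the average of the initial conditions.
    """
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        u = -eps * (A.sum(axis=1) * x - A @ x)  # -eps * L @ x, computed locally
        x = x + u
    return x

final = synchronize([0.0, 1.0, 2.0, 3.0])
# With eps=0.2 all nonzero modes of L are damped, so the states
# synchronize near the initial average (1.5 here).
```

Each agent's update uses only its own state and its neighbors' states, which is the sense in which the controllers discussed in the abstract are "distributed"; the paper's contribution is learning such gains from data via RL rather than fixing them a priori.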