Obstacle Avoidance in Multi-Agent Formation Process Based on Deep Reinforcement Learning
Author affiliations: Guangxi Key Laboratory of Auto Parts and Vehicle Technology, School of Electrical and Information Engineering, Guangxi University of Science and Technology, Liuzhou 545006, Guangxi, China; Technology Center of Dongfeng Liuzhou Automobile Co., Ltd., Liuzhou 545000, Guangxi, China
Publication: Journal of Shanghai Jiaotong University (Science) (上海交通大学学报(英文版))
Year, volume, issue: 2021, Vol. 26, No. 5
Pages: 680-685
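Subject classification: 0711 [Science - Systems Science]; 07 [Science]; 08 [Engineering]; 070105 [Science - Operations Research and Cybernetics]; 081101 [Engineering - Control Theory and Control Engineering]; 071101 [Science - Systems Theory]; 0811 [Engineering - Control Science and Engineering]; 0701 [Science - Mathematics]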
Funding: the National Natural Science Foundation of China (No. 61963006); the Natural Science Foundation of Guangxi Province (Nos. 2020GXNSFDA238011, 2018GXNSFAA050029, and 2018GXNSFAA294085)
Keywords: wheelbarrow; multi-agent; deep reinforcement learning (DRL); formation; obstacle avoidance
Abstract: To solve the problems of difficult control law design, poor portability, and poor stability of traditional multi-agent formation obstacle avoidance algorithms, a multi-agent formation obstacle avoidance method based on deep reinforcement learning (DRL) is *** method combines the perception ability of convolutional neural networks (CNNs) with the decision-making ability of reinforcement learning in a general form and realizes direct output control from the visual perception input of the environment to the action through an end-to-end learning *** multi-agent system (MAS) model of the follow-leader formation method was designed with the wheelbarrow as the control *** improved deep Q network (DQN) algorithm was designed to achieve obstacle avoidance and collision avoidance in the process of multi-agent formation into the desired *** The improvements are to the discount factor and learning efficiency, together with a reward value function that considers the distance relationship between the agent and the obstacle and the coordination factor between the agents. The simulation results show that the proposed method achieves the expected goal of multi-agent formation obstacle avoidance and has stronger portability compared with the traditional algorithm.
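The abstract does not give the reward function or the DQN update in closed form. As a rough illustration only, a reward that combines an agent-to-obstacle distance term with a formation coordination term, together with the standard DQN target, might be sketched as follows. The function names, weights, thresholds, and the use of distance to a desired formation slot as the "coordination factor" are all assumptions made for this example and are not taken from the paper.

```python
import math


def formation_reward(agent_pos, obstacle_pos, desired_pos,
                     safe_dist=1.0, collision_dist=0.2,
                     w_obstacle=1.0, w_formation=0.5,
                     collision_penalty=-10.0):
    """Hypothetical reward: all weights and thresholds are illustrative."""
    d_obs = math.dist(agent_pos, obstacle_pos)
    # Large fixed penalty when the agent is effectively colliding.
    if d_obs < collision_dist:
        return collision_penalty
    # Obstacle term: penalize only when inside the assumed safety radius.
    obstacle_term = -w_obstacle * max(0.0, safe_dist - d_obs)
    # Coordination term (assumed form): penalize distance from the
    # agent's desired slot in the leader-follower formation.
    formation_term = -w_formation * math.dist(agent_pos, desired_pos)
    return obstacle_term + formation_term


def dqn_target(reward, next_q_values, gamma=0.9, done=False):
    """Standard one-step DQN target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


# Example: an agent near an obstacle and away from its formation slot.
r = formation_reward((0.0, 0.0), (0.5, 0.0), (3.0, 4.0))
y = dqn_target(r, next_q_values=[0.5, 2.0], gamma=0.9)
```

In this sketch the discount factor `gamma` is an ordinary hyperparameter; how the paper modifies the discount factor and learning efficiency is not stated in the abstract and is not modeled here.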