A Comparison of PPO, TD3 and SAC Reinforcement Algorithms for Quadruped Walking Gait Generation

Authors: James W. Mock; Suresh S. Muknahallipatna

Affiliation: Department of Electrical Engineering and Computer Science, University of Wyoming, Laramie, Wyoming, USA

Publication: Journal of Intelligent Learning Systems and Applications

Year, volume, issue: 2023, Vol. 15, No. 1

Pages: 36-56

Subject classification: 08 [Engineering]; 081101 [Engineering: Control Theory and Control Engineering]; 0811 [Engineering: Control Science and Engineering]; 081102 [Engineering: Detection Technology and Automation Devices]

Topics: Reinforcement Learning; Machine Learning; Markov Decision Process; Domain Randomization

Abstract: Deep reinforcement learning (deep RL) has the potential to replace classic robotic controllers. State-of-the-art deep RL algorithms such as Proximal Policy Optimization (PPO), Twin Delayed Deep Deterministic Policy Gradient (TD3), and Soft Actor-Critic (SAC), to mention a few, have been investigated for training robots to walk. However, conflicting performance results for these algorithms have been reported in the literature. In this work, we present a performance analysis of these three algorithms on a constant-velocity walking task for a quadruped. Performance is analyzed by simulating the walking task on a quadruped equipped with a range of sensors present on a physical quadruped robot. Simulations of the three algorithms are performed across a range of sensor inputs and with domain randomization. The strengths and weaknesses of each algorithm for the given task are discussed. We also identify the set of sensors that contributes to the best performance of each algorithm.
