Refine search results

Document type

  • 17 journal articles
  • 9 conference papers

Collection scope

  • 26 electronic documents
  • 0 print holdings

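Subject classification

  • 24 papers: Engineering
    • 18 papers: Control Science and Engineering
    • 8 papers: Software Engineering
    • 6 papers: Computer Science and Technology...
    • 4 papers: Mechanical Engineering
    • 2 papers: Information and Communication Engineering
    • 1 paper: Instrument Science and Technology
    • 1 paper: Aeronautical and Astronautical Science and Tech...
  • 15 papers: Science
    • 12 papers: Systems Science
    • 6 papers: Mathematics
    • 1 paper: Statistics (degrees awardable in Science, ...
  • 6 papers: Management
    • 5 papers: Management Science and Engineering (degrees awardable in...
    • 1 paper: Business Administration
  • 1 paper: Economics
    • 1 paper: Applied Economics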

Subject

  • 26 papers: policy iteration
  • 9 papers: reinforcement le...
  • 9 papers: adaptive dynamic...
  • 7 papers: optimal control
  • 3 papers: q-learning
  • 3 papers: approximate dyna...
  • 3 papers: markov decision ...
  • 3 papers: neural networks
  • 3 papers: adaptive critic ...
  • 2 papers: off-policy
  • 2 papers: adp
  • 2 papers: nonlinear system...
  • 2 papers: dynamic programm...
  • 1 paper: convertible bond
  • 1 paper: cooperative hami...
  • 1 paper: dynamic graphica...
  • 1 paper: bolt assembly
  • 1 paper: consensus contro...
  • 1 paper: two-level system
  • 1 paper: integral reinfor...

Institution

  • 2 papers: school of automa...
  • 2 papers: school of automa...
  • 2 papers: state key labora...
  • 1 paper: school of automa...
  • 1 paper: the arizona stat...
  • 1 paper: the state key la...
  • 1 paper: tsinghua nationa...
  • 1 paper: state key labora...
  • 1 paper: state key labora...
  • 1 paper: uta research ins...
  • 1 paper: institute of sys...
  • 1 paper: school of artifi...
  • 1 paper: department of el...
  • 1 paper: state key labora...
  • 1 paper: school of aerosp...
  • 1 paper: school of contro...
  • 1 paper: information scie...
  • 1 paper: beijing research...
  • 1 paper: transport planni...
  • 1 paper: school of system...

Author

  • 3 papers: zhihong peng
  • 3 papers: xinxing li
  • 2 papers: liu derong
  • 2 papers: wenzhong zha
  • 2 papers: lele xi
  • 2 papers: wei qinglai
  • 2 papers: bo zhao
  • 1 paper: li liang
  • 1 paper: wei dong
  • 1 paper: ranran sun
  • 1 paper: yankai xu
  • 1 paper: kyriakos g.vamvo...
  • 1 paper: guangyu zhu
  • 1 paper: dimitri bertseka...
  • 1 paper: hu han xi hongsh...
  • 1 paper: mohammed i.abouh...
  • 1 paper: chunyan wang
  • 1 paper: ximing sun
  • 1 paper: hui-jun gao
  • 1 paper: xirencao junyuzh...

Language

  • 23 papers: English
  • 3 papers: Chinese
Search query: "Subject term = Policy iteration"
26 records; records 1-10 are listed below
Policy Iteration for Optimal Control of Discrete-Time Time-Varying Nonlinear Systems
IEEE/CAA Journal of Automatica Sinica, 2023, Vol. 10, No. 3, pp. 781-791
Authors: Guangyu Zhu, Xiaolu Li, Ranran Sun, Yiyuan Yang, Peng Zhang. Affiliations: Beijing Research Center of Urban Traffic Information Sensing and Service Technologies; Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Beijing Jiaotong University, Beijing 100044, China; Transport Planning and Research Institute, Ministry of Transport, China, Beijing 100028, China
Aimed at infinite horizon optimal control problems of discrete time-varying nonlinear systems, in this paper, a new iterative adaptive dynamic programming algorithm, which is the discrete-time time-varying policy iterati...
Source: VIP Journal Database, Tongfang Journal Database
Policy iteration based Q-learning for linear nonzero-sum quadratic differential games
Science China (Information Sciences), 2019, Vol. 62, No. 5, pp. 195-213
Authors: Xinxing LI, Zhihong PENG, Li LIANG, Wenzhong ZHA. Affiliations: School of Automation, Beijing Institute of Technology; State Key Laboratory of Intelligent Control and Decision of Complex System; Information Science Academy, China Electronics Technology Group Corporation
In this paper, a policy iteration-based Q-learning algorithm is proposed to solve infinite horizon linear nonzero-sum quadratic differential games with completely unknown dynamics. The Q-learning algorithm, which empl...
Source: Tongfang Journal Database
A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems
Science China (Information Sciences), 2015, Vol. 58, No. 12, pp. 147-161
Authors: WEI QingLai, LIU DeRong. Affiliations: State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences; School of Automation and Electrical Engineering, University of Science and Technology Beijing
In this paper, a novel iterative Q-learning algorithm, called "policy iteration based deterministic Q-learning algorithm", is developed to solve the optimal control problems for discrete-time deterministic nonlinear sy...
Source: Tongfang Journal Database
Multiagent Reinforcement Learning: Rollout and Policy Iteration
IEEE/CAA Journal of Automatica Sinica, 2021, Vol. 8, No. 2, pp. 249-272
Author: Dimitri Bertsekas. Affiliation: Arizona State University (ASU), Tempe, AZ 85281, USA, and also with Massachusetts Institute of Technology (MIT), Cambridge, MA 02139
We discuss the solution of complex multistage decision problems using methods that are based on the idea of policy iteration (PI), i.e., start from some base policy and generate an improved *** is the simplest method of ...
Source: VIP Journal Database, Tongfang Journal Database
A policy iteration method for improving robot assembly trajectory efficiency
Chinese Journal of Aeronautics, 2023, Vol. 36, No. 3, pp. 436-448
Authors: Qi ZHANG, Zongwu XIE, Baoshi CAO, Yang LIU. Affiliation: State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin 150001, China
Bolt assembly by robots is a vital and difficult task for replacing astronauts in extravehicular activities (EVA), but the trajectory efficiency still needs to be improved during the wrench insertion into hex hole of **...
Source: VIP Journal Database, Tongfang Journal Database
Policy Iteration Approach to Average Optimal Control Problems for Boolean Control Networks
36th Chinese Control Conference
Authors: Yuhu Wu, Ximing Sun, Wei Wang, Tielong Shen. Affiliations: School of Control Science and Engineering, Dalian University of Technology; Department of Mechanical Engineering, Sophia University
This paper investigates the average infinite horizon optimal control problem for Boolean control networks (BCNs). Based on the semi-tensor product of matrices and Jordan decomposition technique, an optimality equation ...
Source: CNKI Conference Database
Approximate policy iteration: a survey and some new methods
Journal of Control Theory and Applications (控制理论与应用, English edition), 2011, Vol. 9, No. 3, pp. 310-335
Author: Dimitri P. BERTSEKAS. Affiliation: Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology
We consider the classical policy iteration method of dynamic programming (DP), where approximations and simulation are used to deal with the curse of *** survey a number of issues: convergence and rate of convergence of ...
Source: VIP Journal Database, Tongfang Journal Database
Optimal Tracking Control for Reconfigurable Manipulators Based on Critic-only Policy Iteration Algorithm
36th Chinese Control Conference
Authors: Hongbing Xia, Bo Zhao, Yuanchun Li. Affiliations: Department of Control Science and Engineering, Changchun University of Technology; The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences
This paper tackles the optimal tracking control problem for reconfigurable manipulators based on critic-only policy iteration (CoPI) algorithm. By system transformation, the optimal tracking control problem is trans...
Source: CNKI Conference Database
A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems
Science China Chemistry, 2015, Vol. 58, No. 12, pp. 143-157
Authors: WEI QingLai, LIU DeRong. Affiliations: State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences; School of Automation and Electrical Engineering, University of Science and Technology Beijing
In this paper, a novel iterative Q-learning algorithm, called "policy iteration based deterministic Q-learning algorithm", is developed to solve the optimal control problems for discrete-time deterministic nonlinear sy...
Source: VIP Journal Database
Adaptive Optimal Control of Space Tether System for Payload Capture via Policy Iteration
Transactions of Nanjing University of Aeronautics and Astronautics, 2021, Vol. 38, No. 4, pp. 560-570
Authors: FENG Yiting, ZHANG Ming, GUO Wenhao, WANG Changqing. Affiliations: School of Automation, Northwestern Polytechnical University, Xi'an 710129, P.R. China; Beijing Institute of Aerospace Systems Engineering, Beijing 100076, P.R. China
The libration control problem of space tether system (STS) for post-capture of payload is *** process of payload capture will cause tether swing and deviation from the nominal position, resulting in the failure of captur...
Source: VIP Journal Database, Tongfang Journal Database