A stochastic policy search model for matching behavior

Authors: CHENG ZhenBo 1,2, ZHANG Yu 1 & DENG ZhiDong 1

Affiliations: 1 State Key Laboratory on Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science, Tsinghua University, Beijing 100084, China; 2 Department of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310014, China

Publication: Science China (Information Sciences), English edition

Year, volume, issue: 2011, Vol. 54, No. 7

Pages: 1430-1443

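Subject classification: 0810 [Engineering - Information and Communication Engineering]; 0808 [Engineering - Electrical Engineering]; 081203 [Engineering - Computer Application Technology]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (engineering or science degree may be conferred)]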

Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61005085, 60775040, 90820305)

Subjects: policy model; matching law; reinforcement learning; decision-making model

Abstract: The matching law is one of the basic empirical laws in decision theory, and it states that a subject’s preference to optional targets depends on which choices are *** this paper, we study the possible mechanisms that explain why subjects’ decisions often obey this *** the basis of reinforcement learning theory, we put forward a decision-making model in which the policy is updated by a policy parameter, and the model might be implemented in the brain through the prefrontal cortex and the basal ganglia neural *** on this model, an algorithm that satisfies the matching law is derived under some simple *** analysis and simulation results show that the decision behavior achieved by the algorithm obeys the matching *** addition, the matching behaviors in two classical experiments are reproduced using the *** results provide a reasonable strategy for the matching law and a useful computational tool for rewarded decision-making tasks.
