
Feedback LSTM Network Based on Attention for Image Description Generator

Authors: Zhaowei Qu, Bingyu Cao, Xiaoru Wang, Fu Li, Peirong Xu, Luhan Zhang

Affiliations: Beijing Key Laboratory of Network System and Network Culture, Beijing University of Posts and Telecommunications, Beijing 100876, China; Department of Electrical and Computer Engineering, Portland State University, Portland, OR 97207-0751, USA

Published in: Computers, Materials & Continua

Year/Volume/Issue: 2019, Vol. 59, No. 5

Pages: 575-589

Subject classification: 0831 [Engineering - Biomedical Engineering], 0808 [Engineering - Electrical Engineering], 0809 [Engineering - Electronic Science and Technology], 08 [Engineering], 0805 [Engineering - Materials Science and Engineering], 0701 [Science - Mathematics], 0812 [Engineering - Computer Science and Technology], 0801 [Engineering - Mechanics]

Funding: This research is supported by the National Natural Science Foundation of China (No. 61672108).

Keywords: image description generator; feedback LSTM network; attention; CBAM

Abstract: Images are complex multimedia data which contain rich semantic information. Most current image description generator algorithms produce only plain descriptions, without distinguishing between primary and secondary objects, leading to insufficient high-level semantics and accuracy under public evaluation metrics. The major issue is the lack of an effective network for generating high-level semantic sentences that describe in detail the motion and state of the principal object. To address this issue, this paper proposes the Attention-based Feedback Long Short-Term Memory Network (AFLN). Built on the existing encoder-decoder framework, our method comprises two independent subtasks: an attention-based feedback LSTM network in the decoding phase, and the Convolutional Block Attention Module (CBAM) in the encoding phase. First, we propose an attention-based network that feeds back the features corresponding to the word generated by the previous LSTM decoding unit. Feedback guidance is implemented through a related-field mapping algorithm, which quantifies the correlation between the previous word and the latter word, so that the main object can be tracked with highlighted detailed features. Second, we exploit the attention idea and apply CBAM, a lightweight and general module, after the last layer of the pre-trained VGG16 network; it enhances the expression of image coding features by combining channel-dimension and spatial-dimension attention maps at negligible cost. Extensive experiments on the COCO dataset validate the superiority of our network over state-of-the-art methods. Both evaluation scores and actual effects improve: the BLEU-4 score increases from 0.291 to 0.301, while the CIDEr score rises from 0.912 to 0.952.
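As a rough illustration of the CBAM refinement described in the abstract (channel attention followed by spatial attention applied to a convolutional feature map), the following NumPy sketch shows the two stages in sequence. It is not the paper's implementation: the weights `w1` and `w2` are random stand-ins for CBAM's shared MLP, and the 7x7 convolution of the spatial branch is simplified to a fixed equal-weight mix of the two channel-pooled maps.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Channel attention: a shared two-layer MLP over the global
    average- and max-pooled descriptors, merged with a sigmoid.
    feat: (C, H, W); w1: (C//r, C); w2: (C, C//r)."""
    avg = feat.mean(axis=(1, 2))                     # (C,)
    mx = feat.max(axis=(1, 2))                       # (C,)
    att = sigmoid(w2 @ np.maximum(w1 @ avg, 0.0)
                  + w2 @ np.maximum(w1 @ mx, 0.0))   # (C,), values in (0, 1)
    return feat * att[:, None, None]

def spatial_attention(feat):
    """Spatial attention over channel-pooled maps. CBAM's 7x7
    convolution is replaced by a fixed 0.5/0.5 mix of the pooled
    maps to keep the sketch short."""
    avg = feat.mean(axis=0)                          # (H, W)
    mx = feat.max(axis=0)                            # (H, W)
    att = sigmoid(0.5 * avg + 0.5 * mx)              # (H, W), values in (0, 1)
    return feat * att[None, :, :]

def cbam(feat, w1, w2):
    # Sequential channel-then-spatial refinement, as in CBAM.
    return spatial_attention(channel_attention(feat, w1, w2))

# Refine a random feature map; the shape is preserved, which is why
# the module can be dropped in after any conv layer (e.g. the last
# VGG16 block, as the paper does).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((2, 8)) * 0.1   # reduction ratio r = 4
w2 = rng.standard_normal((8, 2)) * 0.1
y = cbam(x, w1, w2)
print(y.shape)  # (8, 4, 4)
```

Because both attention maps take values in (0, 1), the refined features are elementwise no larger in magnitude than the input, so the module re-weights features rather than rescaling the whole map, which is what keeps its overhead negligible.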
