3D Human Pose Estimation Using Two-Stream Architecture with Joint Training

Authors: Jian Kang, Wanshu Fan, Yijing Li, Rui Liu, Dongsheng Zhou

Affiliations: National and Local Joint Engineering Laboratory of Computer Aided Design, School of Software Engineering, Dalian University, Dalian, 116622, China; Dalian Maritime University, Dalian, 116023, China

Publication: Computer Modeling in Engineering & Sciences (English-language journal)

Year, volume, issue: 2023, Vol. 137, No. 10

Pages: 607-629


Subject classification: 08 [Engineering]; 080203 [Engineering: Mechanical Design and Theory]; 0802 [Engineering: Mechanical Engineering]

Funding: Supported by the Key Project of NSFC (Grant No. U1908214); the Special Project of Central Government Guiding Local Science and Technology Development (Grant No. 2021JH6/10500140); the Program for Innovative Research Team in University of Liaoning Province (LT2020015); the Support Plan for Key Field Innovation Team of Dalian (2021RT06); the Support Plan for Leading Innovation Team of Dalian University (XLJ202010); the Science and Technology Innovation Fund of Dalian (Grant No. 2020JJ25CY001); and in part by the National Natural Science Foundation of China under Grant 61906032 and the Fundamental Research Funds for the Central Universities under Grant DUT21TD107.

Keywords: 3D human pose; improved TCN; GELU; kinematic structure

Abstract: With the advancement of image sensing technology, estimating 3D human pose from monocular video has become a hot research topic in computer vision. 3D human pose estimation is an essential prerequisite for subsequent action analysis and understanding. It empowers a wide spectrum of potential applications in various areas, such as intelligent transportation, human-computer interaction, and medical rehabilitation. Currently, some methods for 3D human pose estimation in monocular video employ a temporal convolutional network (TCN) to extract inter-frame feature relationships, but the majority of them suffer from insufficient inter-frame feature relationship extraction. In this paper, we decompose the 3D joint location regression into bone direction and bone length. We propose the TCG, a temporal convolutional network incorporating Gaussian error linear units (GELU), to solve for bone direction. It enables more inter-frame features to be captured and makes the most of the feature relationships between data. Furthermore, we adopt kinematic structural information to solve for bone length, enhancing the use of intra-frame joint features. Finally, we design a loss function for joint training of the bone direction estimation network with the bone length estimation network. The proposed method is extensively evaluated on the public benchmark dataset Human3.6M. Both quantitative and qualitative experimental results show that the proposed method can achieve more accurate 3D human pose estimation.
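
The decomposition of joint locations into bone direction and bone length described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' network or loss function; it only shows the geometric idea. The 5-joint skeleton `PARENTS` and the function names `decompose` and `compose` are hypothetical, and the sketch assumes joint 0 is the root and every parent index is smaller than its child's index.

```python
import numpy as np

# Hypothetical 5-joint skeleton (an assumption, not the Human3.6M layout).
# PARENTS[j] is the parent of joint j; joint 0 is the root and has no parent.
PARENTS = np.array([-1, 0, 1, 2, 1])


def decompose(joints, parents):
    """Split joint positions of shape (J, 3) into bone directions and lengths.

    Bone j-1 points from parents[j] to joint j, for j = 1..J-1.
    """
    bones = joints[1:] - joints[parents[1:]]      # (J-1, 3) bone vectors
    lengths = np.linalg.norm(bones, axis=1)       # (J-1,) bone lengths
    directions = bones / lengths[:, None]         # (J-1, 3) unit vectors
    return directions, lengths


def compose(root, directions, lengths, parents):
    """Rebuild joint positions by walking the kinematic tree from the root."""
    joints = np.zeros((len(parents), 3))
    joints[0] = root
    for j in range(1, len(parents)):
        # Each joint is its parent's position plus length * unit direction.
        joints[j] = joints[parents[j]] + lengths[j - 1] * directions[j - 1]
    return joints


rng = np.random.default_rng(0)
pose = rng.normal(size=(5, 3))                    # random stand-in for a 3D pose
directions, lengths = decompose(pose, PARENTS)
rebuilt = compose(pose[0], directions, lengths, PARENTS)
print(np.allclose(rebuilt, pose))
```

In a two-stream setup of the kind the abstract outlines, one network would predict `directions` and another `lengths`; composing them back into joint positions is what allows a single joint-level loss to train both streams together.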
