Audio-visual keyword transformer for unconstrained sentence-level keyword spotting

Authors: Yidi Li, Jiale Ren, Yawei Wang, Guoquan Wang, Xia Li, Hong Liu

Affiliations: Key Laboratory of Machine Perception, Peking University, Shenzhen Graduate School, Shenzhen, China; College of Electronics and Information Engineering, Sichuan University, Chengdu, China; Department of Computer Science, ETH Zurich, Zurich, Switzerland

Publication: CAAI Transactions on Intelligence Technology (Chinese title: 智能技术学报(英文))

Year, volume, issue: 2024, Volume 9, Issue 1

Pages: 142-152

Subject classification: 0812 [Engineering: Computer Science and Technology (Engineering or Science degrees may be conferred)]

Funding: Science and Technology Plan of Shenzhen, Grant/Award Number: JCYJ20200109140410340; National Natural Science Foundation of China, Grant/Award Number: 62073004

Subject terms: artificial intelligence; multimodal approaches; natural language processing; neural network; speech processing

Abstract: As one of the most effective methods to improve the accuracy and robustness of speech tasks, the audio-visual fusion approach has recently been introduced into the field of Keyword Spotting (KWS). However, existing audio-visual keyword spotting models are limited to detecting isolated words, while keyword spotting for unconstrained speech is still a challenging *** this end, an Audio-Visual Keyword Transformer (AVKT) network is proposed to spot keywords in unconstrained video *** authors present a transformer classifier with learnable CLS tokens to extract distinctive keyword features from the variable-length audio and visual *** outputs of audio and visual branches are combined in a decision fusion *** humans can easily notice whether a keyword appears in a sentence or not, our AVKT network can detect whether a video clip with a spoken sentence contains a pre-specified ***, the position of the keyword is localised in the attention map without additional position ***-imental results on the LRS2-KWS dataset and our newly collected PKU-KWS dataset show that the accuracy of AVKT exceeded 99% in clean scenes and 85% in extremely noisy *** code is available at https://***/jialeren/AVKT.
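
Illustrative note: the abstract mentions a decision fusion stage and keyword localisation from the attention map, but it does not state the exact rules used. The Python sketch below shows one common way such steps could look, under assumptions that are not taken from the paper: each branch emits a single keyword logit, fusion is a weighted average of the two branch probabilities, and the keyword position is the frame that receives the most attention from the CLS token. The names `fuse_decisions`, `localise_keyword`, `audio_weight` and `threshold` are hypothetical.

```python
from math import exp


def sigmoid(x: float) -> float:
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-x))


def fuse_decisions(audio_logit: float, visual_logit: float,
                   audio_weight: float = 0.5, threshold: float = 0.5):
    """Late (decision-level) fusion of two branch outputs.

    Assumption: a weighted average of per-branch probabilities.
    The paper's actual fusion rule may differ.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    p_audio = sigmoid(audio_logit)
    p_visual = sigmoid(visual_logit)
    p_fused = audio_weight * p_audio + (1.0 - audio_weight) * p_visual
    # Return the fused probability and the binary "keyword present" decision.
    return p_fused, p_fused >= threshold


def localise_keyword(cls_attention: list) -> int:
    """Return the index of the frame with the largest CLS attention weight.

    Assumption: the attention row of the CLS token over the input frames
    is available as a plain list of non-negative weights.
    """
    if not cls_attention:
        raise ValueError("cls_attention must not be empty")
    return max(range(len(cls_attention)), key=cls_attention.__getitem__)
```

For example, `fuse_decisions(3.0, -1.0, audio_weight=0.7)` leans on the audio branch, which would suit clean audio; a lower `audio_weight` would favour the visual branch in noisy scenes. `localise_keyword` returns a frame index only, with no timestamp conversion.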
