
Audio-visual keyword transformer for unconstrained sentence-level keyword spotting

CAAI Transactions on Intelligence Technology, 2024, Vol. 9, No. 1, pp. 142-152

Authors: Yidi Li, Jiale Ren, Yawei Wang, Guoquan Wang, Xia Li, Hong Liu

Affiliations: Key Laboratory of Machine Perception, Peking University, Shenzhen Graduate School, Shenzhen, China; College of Electronics and Information Engineering, Sichuan University, Chengdu, China; Department of Computer Science, ETH Zurich, Zurich, Switzerland

As one of the most effective methods to improve the accuracy and robustness of speech tasks, the audio-visual fusion approach has recently been introduced into the field of Keyword Spotting (KWS). However, existing audio-visual keyword spotting models are limited to detecting isolated words, while keyword spotting for unconstrained speech is still a challenging *** this end, an Audio-Visual Keyword Transformer (AVKT) network is proposed to spot keywords in unconstrained video *** authors present a transformer classifier with learnable CLS tokens to extract distinctive keyword features from the variable-length audio and visual *** outputs of audio and visual branches are combined in a decision fusion *** humans can easily notice whether a keyword appears in a sentence or not, our AVKT network can detect whether a video clip with a spoken sentence contains a pre-specified ***, the position of the keyword is localised in the attention map without additional position ***-imental results on the LRS2-KWS dataset and our newly collected PKU-KWS dataset show that the accuracy of AVKT exceeded 99% in clean scenes and 85% in extremely noisy *** code is available at https://***/jialeren/AVKT.
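
The abstract states that the audio and visual branch outputs are combined by decision fusion but does not give the fusion rule. As a rough illustration only, the sketch below assumes each branch emits a single keyword-presence logit and fuses the two probabilities with a weighted average. The function name `fuse_decisions`, the `audio_weight` parameter, and the 0.5 threshold are hypothetical choices for this example, not taken from the paper or its code.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_decisions(audio_logit: float, visual_logit: float,
                   audio_weight: float = 0.5, threshold: float = 0.5):
    """Late (decision-level) fusion of two branch outputs.

    Assumption: a weighted average of per-branch probabilities. The
    paper's actual fusion rule may differ.
    Returns (fused_probability, keyword_detected).
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    p_audio = sigmoid(audio_logit)
    p_visual = sigmoid(visual_logit)
    p_fused = audio_weight * p_audio + (1.0 - audio_weight) * p_visual
    return p_fused, p_fused >= threshold
```

Under this assumed rule, lowering `audio_weight` in noisy conditions lets the visual branch dominate the decision, which is one plausible way a fused system could stay robust when the audio is degraded.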

Keywords: artificial intelligence, multimodal approaches, natural language processing, neural network, speech processing

Source: VIP Journal Database (维普期刊数据库)
