
Visual Superordinate Abstraction for Robust Concept Learning

Authors: Qi Zheng, Chao-Yue Wang, Dadong Wang, Da-Cheng Tao

Affiliations: University of Sydney, Sydney 2008, Australia; JD Explore Academy, Beijing 100176, China; DATA61, Commonwealth Scientific and Industrial Research Organisation, Sydney 2122, Australia

Published in: Machine Intelligence Research

Year/Volume/Issue: 2023, Vol. 20, No. 1

Pages: 79-91


Subject classification: 08 [Engineering]; 080203 [Engineering - Mechanical Design and Theory]; 0802 [Engineering - Mechanical Engineering]; 0701 [Science - Mathematics]; 0812 [Engineering - Computer Science and Technology]

Funding: Supported in part by the Australian Research Council (ARC) (Nos. FL-170100117, DP-180103424, IC-190100031, and LE-200100049).

Keywords: concept learning; visual question answering; weakly-supervised learning; multi-modal learning; curriculum learning

Abstract: Concept learning constructs visual representations that are connected to linguistic semantics, which is fundamental to vision-language tasks. Although promising progress has been made, existing concept learners are still vulnerable to attribute perturbations and out-of-distribution compositions during inference. We ascribe the bottleneck to a failure to explore the intrinsic semantic hierarchy of visual concepts, e.g., {red, blue, ...} ∈ "color" subspace yet cube ∈ "shape" subspace. In this paper, we propose a visual superordinate abstraction framework for explicitly modeling semantic-aware visual subspaces (i.e., visual superordinates). With only natural visual question answering data, our model first acquires the semantic hierarchy from a linguistic view and then explores mutually exclusive visual superordinates under the guidance of the linguistic hierarchy. In addition, quasi-center visual concept clustering and superordinate shortcut learning schemes are proposed to enhance the discrimination and independence of concepts within each visual superordinate. Experiments demonstrate the superiority of the proposed framework under diverse settings, relatively increasing overall answering accuracy by 7.5% for reasoning with perturbations and 15.6% for compositional generalization tests.
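The core idea of the abstract — that visual concepts live in mutually exclusive semantic subspaces (superordinates) such as "color" and "shape", each with its own concept centers — can be illustrated with a minimal sketch. This is not the authors' implementation; the toy 2-D centers and the `classify` helper are hypothetical, and nearest-center assignment merely stands in for the paper's learned quasi-center clustering:

```python
import math

# Illustrative sketch only: each visual "superordinate" (e.g., color, shape)
# is treated as a separate embedding subspace, and each concept within it is
# represented by a center vector. Toy 2-D centers, purely hypothetical.
SUPERORDINATES = {
    "color": {"red": (1.0, 0.0), "blue": (0.0, 1.0)},
    "shape": {"cube": (1.0, 1.0), "sphere": (-1.0, -1.0)},
}

def classify(feature, superordinate):
    """Assign a feature to the nearest concept center within one subspace.

    Predictions in different superordinates are made independently, which
    mimics the mutual exclusivity of subspaces described in the abstract.
    """
    centers = SUPERORDINATES[superordinate]
    return min(centers, key=lambda name: math.dist(feature, centers[name]))

# A feature close to the "red" center is labeled red in the color subspace,
# regardless of where it falls in the shape subspace.
print(classify((0.9, 0.1), "color"))  # red
print(classify((0.8, 1.2), "shape"))  # cube
```

Because each subspace is queried separately, perturbing a sample's color coordinates cannot change its shape prediction, which is the robustness property the framework targets.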
