Task-wise attention guided part complementary learning for few-shot image classification
Author affiliations: Research and Development Institute of Northwestern Polytechnical University in Shenzhen; School of Automation, Northwestern Polytechnical University; CETC Key Laboratory of Aerospace Information Applications
Publication: Science China (Information Sciences) (中国科学:信息科学, English edition)
Year, volume, issue: 2021, Vol. 64, No. 2
Pages: 44-57
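Subject classification: 12 [Management]; 1201 [Management - Management Science and Engineering (degrees in Management or Engineering may be conferred)]; 081104 [Engineering - Pattern Recognition and Intelligent Systems]; 08 [Engineering]; 080203 [Engineering - Mechanical Design and Theory]; 0835 [Engineering - Software Engineering]; 0802 [Engineering - Mechanical Engineering]; 0811 [Engineering - Control Science and Engineering]; 0812 [Engineering - Computer Science and Technology (degrees in Engineering or Science may be conferred)]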
Funding: Supported by the Science, Technology and Innovation Commission of Shenzhen Municipality (Grant No. JCYJ20180306171131643) and the National Natural Science Foundation of China (Grant No. 61772425)
Keywords: few-shot learning; meta-learning; task-wise attention; part complementary learning
Abstract: A general framework for few-shot learning is meta-learning, which aims to train a well-generalized meta-learner (or backbone network) that learns a base-learner for each future task from a small amount of training data. Although much prior work has produced relatively good results, few-shot image classification still faces challenges. First, meta-learning is a learning problem over a collection of tasks, and the meta-learner is usually shared among all tasks. To classify images of novel classes in different tasks, a base-learner must be learned for each task. Under these circumstances, how to specialize the base-learner so that it responds to different inputs in a highly task-wise manner remains a major challenge. Second, classification networks tend to identify local regions from the most discriminative object parts rather than whole objects, which results in incomplete feature representations. To address the first challenge, we propose a task-wise attention (TWA) module that guides the base-learner to extract task-specific image features. To address the second challenge, we propose a part complementary learning (PCL) module that, under the guidance of TWA, extracts and fuses the features of multiple complementary parts of target objects, yielding more specific and complete information. In addition, the proposed TWA and PCL modules can be embedded into a unified network for end-to-end training. Extensive experiments on two commonly used benchmark datasets and comparisons with state-of-the-art methods demonstrate the effectiveness of the proposed method.
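The task-based setting the abstract describes can be made concrete with a small sketch. The code below is not the paper's TWA or PCL module, whose details this record does not give. It is a minimal illustration, in plain Python, of sampling an N-way K-shot task and of one hypothetical way to make a classifier "task-wise": channel weights are computed from the support set of the current task and then used in a nearest-prototype rule. The function names, the variance-based weighting, and the prototype classifier are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def sample_episode(data, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot task from (feature, label) pairs.

    Returns (support, query), each a list of (feature, label) pairs.
    """
    by_class = defaultdict(list)
    for feat, label in data:
        by_class[label].append(feat)
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for c in classes:
        feats = rng.sample(by_class[c], k_shot + n_query)
        support += [(f, c) for f in feats[:k_shot]]
        query += [(f, c) for f in feats[k_shot:]]
    return support, query


def prototypes(support):
    """Per-class mean feature vector of the support set."""
    groups = defaultdict(list)
    for feat, label in support:
        groups[label].append(feat)
    return {c: [sum(col) / len(fs) for col in zip(*fs)]
            for c, fs in groups.items()}


def task_attention(protos):
    """Hypothetical task-wise channel weights (an assumption, not the paper's TWA).

    Softmax over the per-channel variance of the class prototypes, so
    channels that separate the classes of this particular task weigh more.
    """
    variances = []
    for col in zip(*protos.values()):
        mean = sum(col) / len(col)
        variances.append(sum((x - mean) ** 2 for x in col) / len(col))
    top = max(variances)
    exps = [math.exp(v - top) for v in variances]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]


def classify(feat, protos, weights):
    """Nearest prototype under the attention-weighted squared distance."""
    def dist(proto):
        return sum(w * (a - b) ** 2 for w, a, b in zip(weights, feat, proto))
    return min(protos, key=lambda c: dist(protos[c]))
```

In this sketch the weights are recomputed for every sampled task, which is the sense in which the classifier adapts per task; a learned attention module trained end-to-end, as the abstract describes, would replace the fixed variance heuristic.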